Total Variation Reduction for Lossless Compression of HPC Applications

Abstract

As high-performance computing (HPC) applications grow in size, domain scientists face a major challenge: how to efficiently store and analyze the vast volume of output data. Compression can reduce the amount of data that needs to be transferred and stored. However, most large datasets are in floating-point format, which exhibits high entropy. As a result, existing lossless compressors usually achieve only a modest reduction ratio of less than 2X. To address this problem, we propose a total variation reduction method to improve the compression ratio of lossless compressors. In particular, we first try to exploit space-filling curves (SFCs), a well-known technique for preserving data locality in multi-dimensional HPC datasets. We show and explain why a raw SFC, such as the Hilbert curve or the Z-order curve, cannot improve the compression ratio. Then, we explore the opportunity and theoretical feasibility of the proposed total variation reduction based algorithm. Experimental results show the effectiveness of the proposed method: on average, the compression ratio improves by 20.6% for FPZIP and by 18.4% for FPC.
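To make the terminology concrete, the following minimal Python sketch computes the total variation of a linearized sequence, taken here as the sum of absolute differences between consecutive values, and linearizes a small 2D grid in row-major and Z-order (Morton) order. It is an illustration only and not the algorithm proposed in the paper; the function names (`total_variation`, `morton_index`, `z_order_flatten`, `row_major_flatten`) and the definition of total variation used here are assumptions made for this example.

```python
def total_variation(values):
    """Sum of absolute differences between consecutive values (assumed definition)."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


def morton_index(x, y, bits=16):
    """Interleave the low `bits` bits of x and y to get a Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x occupies the even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y occupies the odd bit positions
    return code


def row_major_flatten(grid):
    """Linearize a 2D grid (list of equal-length rows) row by row."""
    return [v for row in grid for v in row]


def z_order_flatten(grid):
    """Linearize a 2D grid by visiting cells in increasing Morton code."""
    coords = [(x, y) for y in range(len(grid)) for x in range(len(grid[0]))]
    coords.sort(key=lambda c: morton_index(c[0], c[1]))
    return [grid[y][x] for x, y in coords]


if __name__ == "__main__":
    # Hypothetical 4x4 field; any 2D floating-point array could be used instead.
    grid = [[float(4 * y + x) for x in range(4)] for y in range(4)]
    print("row-major TV:", total_variation(row_major_flatten(grid)))
    print("Z-order TV:  ", total_variation(z_order_flatten(grid)))
```

Running the script prints the total variation of the same field under the two orderings, which gives a simple way to compare how a given linearization affects the smoothness of the resulting one-dimensional sequence.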

Publication
In 2021 IEEE 34th International System-on-Chip Conference (SOCC)