File Compressor
The zfp executable in the bin directory is primarily intended for evaluating the rate-distortion trade-off (compression ratio versus quality) provided by the compressor, but since version 0.5.0 it also allows reading and writing compressed data sets. zfp takes as input a raw, binary array of floats, doubles, or integers in native byte order and optionally outputs a compressed or reconstructed array obtained after lossy compression followed by decompression. Various statistics on compression ratio and error are also displayed.
The uncompressed input and output files should be a flattened, contiguous sequence of scalars without any header information, generated for instance by
double* data = new double[nx * ny * nz];
// populate data
FILE* file = fopen("data.bin", "wb");
fwrite(data, sizeof(*data), nx * ny * nz, file);
fclose(file);
zfp requires a set of command-line options, the most important being the -i option that specifies that the input is uncompressed. When present, -i tells zfp to read an uncompressed input file and compress it to memory. If desired, the compressed stream can be written to an output file using -z. When -i is absent, on the other hand, -z names the compressed input (not output) file, which is then decompressed. In either case, -o can be used to output the reconstructed array resulting from lossy compression and decompression.
So, to compress a file, use -i file.in -z file.zfp. To later decompress the file, use -z file.zfp -o file.out. A single dash “-” can be used in place of a file name to denote standard input or output.
When reading uncompressed input, the scalar type must be specified, for floating-point data using either -f (float) or -d (double); the more general -t option, described under Usage, also covers integer types. In addition, the array dimensions must be specified using -1 (for 1D arrays), -2 (for 2D arrays), or -3 (for 3D arrays). For multidimensional arrays, x varies faster than y, which in turn varies faster than z. That is, a 3D input file corresponding to a flattened C array a[nz][ny][nx] is specified as -3 nx ny nz.
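The layout rule above can be sketched in a few lines of C++. This is an illustration only: flat_index and make_volume are hypothetical helper names, not part of zfp, and the fill pattern is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of zfp): linear offset of element (x, y, z)
// in a flattened C array a[nz][ny][nx], where x varies fastest and z slowest.
std::size_t flat_index(std::size_t x, std::size_t y, std::size_t z,
                       std::size_t nx, std::size_t ny)
{
  return x + nx * (y + ny * z);
}

// Hypothetical helper: build a flattened nx * ny * nz array whose values
// encode their own coordinates. Written to disk with fwrite, such an array
// would be described to zfp as -3 nx ny nz.
std::vector<double> make_volume(std::size_t nx, std::size_t ny, std::size_t nz)
{
  assert(nx > 0 && ny > 0 && nz > 0);
  std::vector<double> data(nx * ny * nz);
  for (std::size_t z = 0; z < nz; z++)
    for (std::size_t y = 0; y < ny; y++)
      for (std::size_t x = 0; x < nx; x++)
        data[flat_index(x, y, z, nx, ny)] = 100.0 * z + 10.0 * y + x;
  return data;
}
```

Stepping x by one moves one element in the file, stepping y moves nx elements, and stepping z moves nx * ny elements, which is why the dimensions are listed in the order nx ny nz.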
Note that -2 nx ny is not equivalent to -3 nx ny 1, even though the same number of values is compressed. One invokes the 2D codec, while the other uses the 3D codec, which in this example has to pad the input to an nx × ny × 4 array since d-dimensional arrays are partitioned into blocks of 4^d values. Such padding usually negatively impacts compression.
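To make the cost of this padding concrete, the following sketch counts how many values end up in blocks under each interpretation. The function names are hypothetical, and the only assumption is the one stated above: each dimension is padded up to a multiple of the block edge length, 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round an array extent up to the next multiple of 4,
// the edge length of a block.
std::size_t padded_extent(std::size_t n)
{
  assert(n > 0);
  return (n + 3) / 4 * 4;
}

// Values covered by blocks (including padding) when the data is treated as
// a 2D array, as with -2 nx ny.
std::size_t padded_values_2d(std::size_t nx, std::size_t ny)
{
  return padded_extent(nx) * padded_extent(ny);
}

// Values covered by blocks (including padding) when the data is treated as
// a 3D array, as with -3 nx ny nz.
std::size_t padded_values_3d(std::size_t nx, std::size_t ny, std::size_t nz)
{
  return padded_extent(nx) * padded_extent(ny) * padded_extent(nz);
}
```

For a 100 × 100 array, padded_values_2d(100, 100) gives 10,000 values, whereas padded_values_3d(100, 100, 1) gives 40,000, so three quarters of every 3D block would be padding.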
Moreover, -2 nx ny is not equivalent to -2 ny nx, i.e., with the dimensions transposed. It is crucial for accuracy and compression ratio that the array dimensions are listed in the order expected by zfp so that the array layout is correctly interpreted. See this discussion for more details.
Using -h, the array dimensions and type are stored in a header of the compressed stream so that they do not have to be specified on the command line during decompression. The header also stores compression parameters, which are described below. The compressor and decompressor must agree on whether headers are used, and it is up to the user to enforce this.
zfp accepts several options for specifying how the data is to be compressed. The most general of these, the -c option, takes four constraint parameters that together can be used to achieve various effects. These constraints are:
minbits: the minimum number of bits used to represent a block
maxbits: the maximum number of bits used to represent a block
maxprec: the maximum number of bit planes encoded
minexp: the smallest bit plane number encoded
These parameters are discussed in detail in the section on compression modes. Options -r, -p, and -a provide a simpler interface to setting all of the above parameters by invoking fixed-rate (-r), fixed-precision (-p), and fixed-accuracy (-a) mode.
Usage
Below is a description of each command-line option accepted by zfp.
General options
-h
  Read/write array and compression parameters from/to compressed header.
-q
  Quiet mode; suppress diagnostic output.
-s
  Evaluate and print the following error statistics:
  - rmse: The root mean square error.
  - nrmse: The root mean square error normalized to the range.
  - maxe: The maximum absolute pointwise error.
  - psnr: The peak signal to noise ratio in decibels.
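As a rough illustration of what these four statistics measure, the sketch below computes them from an original and a reconstructed array. ErrorStats and error_stats are hypothetical names, and the PSNR convention (peak taken as half the value range) is an assumption; other definitions use the full range, so the numbers printed by zfp may differ from this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the statistics listed under -s.
struct ErrorStats {
  double rmse;   // root mean square error
  double nrmse;  // rmse divided by the range of the original data
  double maxe;   // maximum absolute pointwise error
  double psnr;   // peak signal to noise ratio in decibels
};

// Hypothetical helper: compare original and reconstructed values.
// If the original data is constant (zero range) or the reconstruction is
// exact (zero rmse), nrmse and psnr are not finite.
ErrorStats error_stats(const std::vector<double>& orig,
                       const std::vector<double>& recon)
{
  assert(!orig.empty() && orig.size() == recon.size());
  double lo = orig[0], hi = orig[0];
  double sum2 = 0, maxe = 0;
  for (std::size_t i = 0; i < orig.size(); i++) {
    double e = std::fabs(orig[i] - recon[i]);
    sum2 += e * e;
    maxe = std::max(maxe, e);
    lo = std::min(lo, orig[i]);
    hi = std::max(hi, orig[i]);
  }
  double range = hi - lo;
  double rmse = std::sqrt(sum2 / orig.size());
  ErrorStats s;
  s.rmse = rmse;
  s.nrmse = rmse / range;
  s.maxe = maxe;
  // Assumption: peak signal is half the value range.
  s.psnr = 20 * std::log10(range / (2 * rmse));
  return s;
}
```

The two input arrays would typically come from the file given to -i and the file written by -o.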
Input and output
-i <path>
  Name of uncompressed binary input file. Use “-” for standard input.
-o <path>
  Name of decompressed binary output file. Use “-” for standard output. May be used with -i, -z, or both.
-z <path>
  Name of compressed input file (without -i) or compressed output file (with -i). Use “-” for standard input or output.
When -i is specified, data is read from the corresponding uncompressed file, compressed, and written to the compressed file specified by -z (when present). Without -i, compressed data is read from the file specified by -z and decompressed. In either case, the reconstructed data can be written to the file specified by -o.
Array type and dimensions
-f
  Single precision (float type). Shorthand for -t f32.
-d
  Double precision (double type). Shorthand for -t f64.
-t <type>
  Specify scalar type as one of i32, i64, f32, f64 for 32- or 64-bit integer or floating-point scalar type.
-1 <nx>
  Dimensions of 1D C array a[nx].
-2 <nx> <ny>
  Dimensions of 2D C array a[ny][nx].
-3 <nx> <ny> <nz>
  Dimensions of 3D C array a[nz][ny][nx].
When -i is used, the scalar type and array dimensions must be specified. One of -f, -d, or -t specifies the input scalar type. One of -1, -2, or -3 specifies the array dimensions. The same parameters must be given when decompressing data (without -i), unless a header was stored using -h during compression.
Compression parameters
-r <rate>
  Specify fixed rate in terms of number of compressed bits per floating-point value.
-p <precision>
  Specify fixed precision in terms of number of uncompressed bits per value.
-a <tolerance>
  Specify fixed accuracy in terms of absolute error tolerance.
-c <minbits> <maxbits> <maxprec> <minexp>
  Specify expert mode parameters.
When -i is used, the compression parameters must be specified. The same parameters must be given when decompressing data (without -i), unless a header was stored using -h when compressing. See the section on compression modes for a discussion of these parameters.
Examples
-i file : read uncompressed file and compress to memory
-z file : read compressed file and decompress to memory
-i ifile -z zfile : read uncompressed ifile, write compressed zfile
-z zfile -o ofile : read compressed zfile, write decompressed ofile
-i ifile -o ofile : read ifile, compress, decompress, write ofile
-i file -s : read uncompressed file, compress to memory, print stats
-i - -o - -s : read stdin, compress, decompress, write stdout, print stats
-f -3 100 100 100 -r 16 : 2x fixed-rate compression of 100 × 100 × 100 floats
-d -1 1000000 -r 32 : 2x fixed-rate compression of 1,000,000 doubles
-d -2 1000 1000 -p 32 : 32-bit precision compression of 1000 × 1000 doubles
-d -1 1000000 -a 1e-9 : compression of 1,000,000 doubles with < 10^-9 max error
-d -1 1000000 -c 64 64 0 -1074 : 4x fixed-rate compression of 1,000,000 doubles
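The compression ratios quoted in the fixed-rate examples follow from simple arithmetic, sketched here with two hypothetical helpers. The sketch ignores any header and any padding of partial blocks, so it describes the nominal ratio only.

```cpp
#include <cassert>

// Hypothetical helper: nominal compression ratio in fixed-rate mode (-r),
// i.e., uncompressed bits per value divided by compressed bits per value.
double ratio_from_rate(int scalar_bits, double rate)
{
  assert(scalar_bits > 0 && rate > 0);
  return scalar_bits / rate;
}

// Hypothetical helper: nominal compression ratio when every block uses
// exactly block_bits bits (-c with minbits equal to maxbits).
// A block of a dims-dimensional array holds 4^dims values.
double ratio_from_block_bits(int scalar_bits, int dims, int block_bits)
{
  assert(scalar_bits > 0 && dims >= 1 && block_bits > 0);
  int values_per_block = 1 << (2 * dims);  // 4^dims
  return double(scalar_bits) * values_per_block / block_bits;
}
```

For instance, ratio_from_rate(32, 16) and ratio_from_rate(64, 32) both give 2, matching the -f ... -r 16 and -d ... -r 32 examples, and ratio_from_block_bits(64, 1, 64) gives 4, matching the -c 64 64 example in which each 1D block of four doubles (256 bits) is stored in 64 bits.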