NAME
ddr_lzo - Data de/compression plugin for dd_rescue

SYNOPSIS
-L lzo[=option[:option[:...]]]
or
-L /path/to/libddr_lzo.so[=option[:option[:...]]]

DESCRIPTION
About
LZO is an algorithm that de/compresses data. It is tuned for speed (especially decompression speed) and accepts somewhat larger compressed files in exchange. There are variants with slow compression (yet still very fast decompression) available though. See the algorithm parameter below.

This plugin has been written for dd_rescue and uses the plugin interface from it. See the dd_rescue(1) man page for more information on dd_rescue.

OPTIONS
Options are passed using dd_rescue option passing syntax: The name of the plugin (lzo) is optionally followed by an equal sign (=) and options are separated by a colon (:). The lzo plugin also allows most options to be abbreviated to five or six letters. See the EXAMPLES section below.

Compression or decompression
The lzo dd_rescue plugin (subsequently referred to as just ddr_lzo, which reflects the variable parts of the filename libddr_lzo.so) chooses compression or decompression mode automatically if one of the input/output files has an [lt]zo suffix; otherwise you may specify the compr[ess] or decom[press] parameters on the command line.

The parameter opt[imize] tells ddr_lzo to do an optimization pass after compression. This might speed up decompression by a few percent when creating compressed data with high compression levels and large block sizes.

The plugin also supports the parameter bench[mark]; if it's specified, it will output some information about CPU usage and the resulting compression or decompression bandwidth. (For small files, the numbers become meaningless due to jitter and limited time resolution -- ddr_lzo will skip the output if the numbers are very tiny.)

De/compression algorithm
The lzo plugin supports a number of the (de)compression algorithms from liblzo2.
You can specify which one you want to use by passing algo=XXX, where XXX can be lzo1x_1, lzo1x_1_15, lzo1x_999, lzo1x_1_11, lzo1x_1_12, lzo1y_1, lzo1y_999, lzo1f_1, lzo1f_999, lzo1b_1 ... lzo1b_9, lzo1b_99, lzo1b_999, lzo2a_999. Pass algo=help to get a list of available algorithms. Consult the liblzo documentation for more information on the algorithms. Note that only the first three are supported by lzop (it can decompress the first five though, as they're all handled by the same decompression routine).

The default (lzo1x_1) is a good choice for fast compression and very fast decompression and ensures compatibility with lzop. For higher compression you might want to choose lzo1x_999, which is very slow but lzop compatible, or lzo2a_999, which is twice as fast but not compatible with lzop.

Debugging
The debug flag will cause ddr_lzo to output information about blocks and other internal data. It's meant for debugging purposes.

Finally there is also a flags=XXXX parameter. This sets the flags field in the header (default is 0x03000403) and is used for testing only. It is not sanity checked and you can easily set values that will break decompression or cause ddr_lzo to abort. Use it only for development purposes, when you know the meaning of the various bits.

Error recovery
On compression, when input bytes can't be read, ddr_lzo will encode holes in the compressed output file -- these will be skipped over on decompression.

On decompression, erroneous blocks can be detected by the
checksums (most often) or by the decompressor. The lzo plugin tries to
continue in that case if the block header that specifies de/compressed
lengths is intact. It will then result in a block being skipped over (hole)
and the decompression will be continued with the next block. This prevents
corrupt data from ending up in the output file (or preexisting, potentially
good data there from being overwritten).
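
The skip-and-continue behavior described above can be sketched in Python. This is only an illustration under stated assumptions, not ddr_lzo's code: the three-field block header is a made-up simplification (not the lzop layout), zlib stands in for liblzo2 (which is not in the Python standard library), and pack_blocks and unpack_blocks are hypothetical names.

```python
import struct
import zlib

# Hypothetical block header: uncompressed length, stored length,
# adler32 of the stored payload (all 32-bit big-endian).
HEADER = struct.Struct(">III")

def pack_blocks(data, block_size):
    """Compress data block by block, prefixing each block with a header."""
    out = bytearray()
    for off in range(0, len(data), block_size):
        chunk = data[off:off + block_size]
        payload = zlib.compress(chunk)  # stand-in for an LZO compressor
        out += HEADER.pack(len(chunk), len(payload), zlib.adler32(payload))
        out += payload
    return bytes(out)

def unpack_blocks(stream):
    """Decompress all blocks; return (output, holes).

    A block that fails its checksum or does not decompress to the
    announced length is skipped: its range is recorded in holes as
    (offset, length) and decoding continues with the next block.
    """
    out = bytearray()
    holes = []
    pos = 0
    while pos + HEADER.size <= len(stream):
        ulen, clen, csum = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        payload = bytes(stream[pos:pos + clen])
        pos += clen
        chunk = None
        if zlib.adler32(payload) == csum:
            try:
                chunk = zlib.decompress(payload)
            except zlib.error:
                chunk = None
        if chunk is None or len(chunk) != ulen:
            holes.append((len(out), ulen))
            out += bytes(ulen)  # zero-filled here; a file writer would seek instead
        else:
            out += chunk
    return bytes(out), holes
```

The sketch relies on the length fields being intact, which mirrors the condition stated above: once the header itself is damaged, the position of the next block is unknown.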

When the block headers are corrupt, your situation is desperate, as you will have lost the remainder of the file. To recover pieces after such a block header corruption, ddr_lzo supports the search option. With it, the plugin will search the input file (starting from the position given in dd_rescue with -s) for data that looks like a block header, and if a valid-looking header is found, it will start decompressing from that position. (If you can't find the data you are looking for, you might want to study the output generated with the debug flag.)

Supported dd_rescue features
dd_rescue supports appending to files with the -x/--extend option. If ddr_lzo is loaded and the output file is an existing .lzo file, the new data will be appended in the format specified by the existing LZOP header. If the header does not indicate a multipart (archive) file, the EOF marker will be overwritten, so that a valid .lzo file is created. Otherwise a new part will be appended.

When dd_rescue can't read data or a sizable amount of zero-filled
data is found and the -a/--sparse option is active, then dd_rescue will
create sparse files (files with holes inside). This is an optimization to
save space -- the holes are interpreted as zeroes again on normal reads, so
this is transparent. The holes also can be useful to ensure that good data
is not overwritten with zeroes when data couldn't be read.
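
How a hole comes into existence can be illustrated with a short Python sketch. It is not dd_rescue's implementation; write_sparse is a hypothetical helper that seeks over all-zero blocks instead of writing them. Whether the skipped ranges actually occupy no disk space depends on the file system.

```python
import os

def write_sparse(path, data, block_size=4096):
    """Write data to path, seeking over all-zero blocks instead of writing them."""
    zero = bytes(block_size)
    with open(path, "wb") as f:
        for off in range(0, len(data), block_size):
            chunk = data[off:off + block_size]
            if chunk == zero[:len(chunk)]:
                f.seek(len(chunk), os.SEEK_CUR)  # skip: leaves a hole
            else:
                f.write(chunk)
        f.truncate(len(data))  # set the final size if the file ends in a hole
    return path
```

Reading the file back returns zeroes for the skipped ranges, which is why the optimization is transparent to normal readers.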

lzop compatibility
The plugin uses the lzo1x_1 algorithm by default (just like lzop does by default) and generates adler32 checksums to allow detecting data corruption. The compressed files are compatible with lzop and ddr_lzo should handle files generated by lzop.

Multipart (archive) files from lzop are decompressed to ONE output file in the order they are stored. Multipart files created by the lzo plugin to encode holes will be extracted to several files by lzop. The holes are encoded in the filenames (with a sequence number and the hole size up to 1TB; use the timestamp for huge holes), so a proper assembly of the fragments is possible even without ddr_lzo.

lzop only supports the lzo1x_ family of algorithms. If you choose
another algorithm to compress data with ddr_lzo, it will set the
needed_version_to_extract field in the resulting lzop file to ddr_lzo's own
version (1.789) to indicate incompatibility with lzop (as of 1.03).

Blocksize considerations
When decompressing, the (soft) block size chosen in dd_rescue must be sufficient (at least half the size of the block size used when compressing); if you choose blocks that are too small, ddr_lzo will warn and exit.

For compression, the chosen (soft) block size in dd_rescue will determine the size of blocks to be fed to the lzo??_?_compress() routines. Larger block sizes will typically result in slightly better compression ratios, though the returns on increasing the block size quickly diminish after 64k. The default from dd_rescue (128kiB) is a good choice. It is NOT recommended to increase the block size too much -- when an lzo file gets corrupted, at least one block will be lost; larger blocks result in larger damage. Also, blocks larger than 16MiB will not work well with the error tolerance features of ddr_lzo. Also note that blocks larger than 256kiB need recompilation of lzop if you want to be able to use lzop to process the .lzo files; blocks larger than 64MiB prevent decompression even with a recompiled lzop.

BUGS/LIMITATIONS
Maturity
The plugin is new as of dd_rescue 1.43. Do not yet rely on data saved with ddr_lzo as the only backup for valuable data. Also expect some changes to ddr_lzo in the not too distant future. (This should not break the file format, as we're following lzop ....)

Compressed data is more sensitive to data corruption than plain data. Note that the checksums (adler32 or crc32) in the lzop file format do NOT allow errors to be corrected; they just allow a somewhat reliable detection of data corruption. (Ideally, a 32bit checksum just misses 1 out of 2^32 corruptions; on small changes, crc32 comes a bit closer to the ideal than adler32. You may pass the crc32 option to use crc32 instead of adler32 checksums at the expense of some speed -- unfortunately the crc32 polynomial for lzop/gzip/... is not the crc32c polynomial that has hardware support on many CPUs these days.)
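
That the checksums detect but cannot repair corruption can be seen in a small Python sketch. It uses the adler32 and crc32 functions from Python's zlib module as stand-ins; the assumption that they behave like the checksums ddr_lzo writes is made for illustration only, and detects_corruption is a hypothetical helper name.

```python
import zlib

def detects_corruption(original, damaged):
    """Report, per checksum, whether the damaged copy is told apart."""
    return {
        "adler32": zlib.adler32(original) != zlib.adler32(damaged),
        "crc32": zlib.crc32(original) != zlib.crc32(damaged),
    }

original = b"dd_rescue block payload " * 64
damaged = bytearray(original)
damaged[100] ^= 0x01  # flip a single bit
result = detects_corruption(original, bytes(damaged))
print(result["adler32"], result["crc32"])  # prints: True True
```

Both checksums flag the flipped bit, but neither tells which bit changed, so the block can only be rejected, not fixed.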

Also note that the checksums are NOT cryptographic hashes; a malicious attacker can easily find modifications of data that do not alter the checksums. Use MD5 or, better, SHA-256/SHA-512 for ensuring integrity against attackers. Use par2 or similar software to create error correcting codes (Reed-Solomon / Erasure Codes) if you want to be able to recover data in the face of corruption.

Security
While care has been applied to check the result of memory allocations ..., the decompressor code has not been audited and only limited fuzzing has been applied to ensure it's not vulnerable to malicious data -- be careful when you process data from untrusted sources.

EXAMPLES
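
A minimal Python sketch of how an option string following the SYNOPSIS syntax is put together. The helper names and the file names image.raw and image.raw.lzo are hypothetical; the code only builds and prints a command line and does not run dd_rescue.

```python
def lzo_plugin_arg(options=(), plugin="lzo"):
    """Join options as plugin=opt1:opt2:... (no '=' when there are no options)."""
    options = list(options)
    if not options:
        return plugin
    return plugin + "=" + ":".join(options)

def dd_rescue_argv(infile, outfile, options=()):
    """Build an argument list that loads the lzo plugin with the given options."""
    return ["dd_rescue", "-L", lzo_plugin_arg(options), infile, outfile]

argv = dd_rescue_argv("image.raw", "image.raw.lzo",
                      ["compress", "algo=lzo1x_999", "optimize"])
print(" ".join(argv))
# prints: dd_rescue -L lzo=compress:algo=lzo1x_999:optimize image.raw image.raw.lzo
```

With an empty option list the argument is just the plugin name, in which case the mode is picked from the file suffix as described under "Compression or decompression".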

SEE ALSO
dd_rescue(1) liblzo2 documentation lzop(1)

AUTHOR
Kurt Garloff <kurt@garloff.de>

CREDITS
The liblzo2 library and algorithm has been written by Markus Oberhumer.
http://www.oberhumer.com/opensource/lzo/

COPYRIGHT
This plugin is under the same license as dd_rescue: The GNU General Public License (GPL) v2 or v3 - at your option.

HISTORY
ddr_lzo plugin was first introduced with dd_rescue 1.43 (May 2014).

Some additional information can be found on