dd_rescue(1)          Data recovery and protection tool          dd_rescue(1)
dd_rescue - Data recovery and protection tool
dd_rescue [options] infile outfile
dd_rescue [options] [-2/-3/-4/-z/-Z seed/seedfile] outfile
dd_rescue [options] [--shred2/--shred3/--shred4/--random/--frandom
seed/seedfile] outfile
dd_rescue is a tool that copies data from a source (file, block device,
pipe, ...) to one (or several) output file(s).
If input and output files are seekable (block devices or regular
files), dd_rescue copies with large blocks (softbs) to increase
performance. When a read error is encountered, dd_rescue falls back
to reading smaller blocks (hardbs) to recover the maximum amount
of data. If blocks still cannot be read, dd_rescue by default skips
over them in the output file as well, so as not to overwrite data that might
have been copied there successfully in a previous run. (Option -A /
--alwayswrite changes this.)
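The fallback from softbs to hardbs described above can be illustrated with a small Python sketch. This is not dd_rescue's actual code: rescue_copy and the read_at callable are hypothetical names introduced only for illustration, and the real tool handles many more cases (direct I/O, reverse copy, error limits).

```python
def rescue_copy(read_at, out, size, softbs=65536, hardbs=4096):
    """Copy `size` bytes and return the offsets of unreadable hardbs blocks.

    read_at(pos, length) is a hypothetical callable that returns bytes or
    raises OSError on a read error; `out` is a seekable binary file object.
    """
    bad = []
    pos = 0
    while pos < size:
        chunk = min(softbs, size - pos)
        try:
            data = read_at(pos, chunk)
        except OSError:
            # Large read failed: retry the same range in small blocks.
            end = pos + chunk
            small = pos
            while small < end:
                n = min(hardbs, end - small)
                try:
                    data = read_at(small, n)
                except OSError:
                    # Skip: leave whatever the output already holds here.
                    bad.append(small)
                else:
                    out.seek(small)
                    out.write(data)
                small += n
        else:
            out.seek(pos)
            out.write(data)
        pos += chunk
    return bad
```

Because unreadable blocks are skipped rather than zero-filled, data left in the output by an earlier run survives, which matches the default behavior described above.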
dd_rescue can copy in reverse direction as well, allowing
to approach a bad spot from both directions. As trying to read over a bad
spot of significant size can take very long (and potentially cause further
damage), this is an important optimization when recovering data. The
dd_rhelp tool takes advantage of this and automates data recovery.
dd_rescue does not (by default) truncate the output file.
dd_rescue by default reports on progress, and optionally
also writes into a logfile. It has a progress bar and gives an estimate for
the remaining time. dd_rescue has a wealth of options that influence
its behavior, such as the possibility to use direct IO for input/output, to
use fallocate() to preallocate space for the output file, using splice copy
(in kernel zerocopy) for efficiency, looking for empty blocks to create
sparse files, or using a pseudo random number generator (PRNG) to quickly
overwrite data with random numbers.
The modes to overwrite partitions or files with pseudo random
numbers make dd_rescue a tool that can be used for secure data
deletion and thus not just a data recovery and backup tool but also a data
protection tool.
You can use "-" as infile or outfile, meaning stdin or
stdout. Note that such a file is not seekable, which limits the
usefulness of some of dd_rescue's features.
When parsing numbers, dd_rescue assumes bytes. It accepts the following
suffixes:
b -- 512 size units (blocks)
k -- 1024 size units (binary kilobytes, kiB)
M -- 1024^2 size units (binary megabytes, MiB)
G -- 1024^3 size units (binary gigabytes, GiB)
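As an illustration of the suffix table above, a minimal Python sketch follows. The helper name parse_size is hypothetical, and the sketch only covers the four suffixes listed here; dd_rescue itself may accept further forms.

```python
# Multipliers taken from the suffix table above.
SUFFIXES = {"b": 512, "k": 1024, "M": 1024**2, "G": 1024**3}

def parse_size(text):
    """Convert a size argument such as '64k' or '1M' to a byte count."""
    if text and text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)  # no suffix: the number is taken as plain bytes
```

With this reading, -b 64k corresponds to 65536 bytes and -m 2b to 1024 bytes.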
The following options may be used to modify the behavior of
dd_rescue.
- -h, --help
- This option tells dd_rescue to output a list of options and
exit.
- -V, --version
- Display version number and exit.
- -q, --quiet
- tells dd_rescue to be less verbose.
- -v, --verbose
- makes dd_rescue more verbose.
- -c 0/1, --color=0/1
- controls whether dd_rescue uses colors. By default it does, unless
the terminal type from TERM is unknown or dumb or ends in -m or
-mono.
- -f, --force
- makes dd_rescue skip some sanity checks (e.g. automatically setting
reverse direction when input and output file are the same and ipos <
opos).
- -i, --interactive
- tells dd_rescue to ask before overwriting existing files.
- -b softbs, --softbs=softbs, --bs=softbs
- sets the (larger) block size to softbs bytes. dd_rescue will
transfer chunks of that size unless a read error is encountered (or the
end of the input file or the maximum transfer size has been reached). The
default value for this is 64k for buffered I/O and 1M for direct I/O.
- -B hardbs, --hardbs=hardbs, --block-size=hardbs
- sets the (smaller) fallback block size to hardbs bytes. When
dd_rescue encounters read errors, it will fall back to copying data
in chunks of this size. This value defaults to 4k for buffered I/O and 512
bytes for direct I/O.
hardbs should be equal to or smaller than softbs. If both
block sizes are identical, no fallback mechanism (and thus no retry) will
take place on read errors.
- -y syncsize, --syncfreq=syncsize
- tells dd_rescue to call fsync() on the output file every
syncsize bytes (will be rounded to multiples of softbs sized
blocks). It will also update the progress indicator at least as often. By
default, syncsize is set to 0, meaning that fsync() is only issued
at the end of the copy operation.
- -s ipos, --ipos=ipos, --input-position=ipos
- sets the starting position of the infile to ipos. Note that
ipos is specified in bytes (but suffixes can be used, see above), not in
terms of softbs or hardbs sized blocks. The default value
for this is 0. When reverse direction copy is used, an ipos of 0 is
treated specially, meaning the end of file.
Negative positions result in an error message.
- -S opos, --opos=opos, --output-position=opos
- sets the starting position of the outfile to opos. If not
specified, opos is set to ipos, so the file offsets in input
and output file are the same. For reverse direction copy, an explicit
opos of 0 will position at the end of the output file.
- -x, --extend, --append
- changes the interpretation of the output position to start at the end of
the existing output file, making appending to a file convenient. If the
output file does not exist, an error will be reported and dd_rescue
aborted.
- -m maxxfer, --maxxfer=maxxfer, --max-size=maxxfer
- specifies the maximum number of bytes (suffixes apply, but it's NOT
counted in blocks) that dd_rescue copies. If EOF is encountered
before maxxfer bytes have been transferred, this option will be
silently ignored.
- -M, --noextend
- tells dd_rescue to not extend the output file. This option is
particularly helpful when overwriting a file with random data or zeroes
for safe data destruction. If the output file does not exist, an error
message will be generated and the program will abort.
- -e maxerr, --maxerr=maxerr
- tells dd_rescue to exit after maxerr read errors have been
encountered. By default, this is set to 0, which makes dd_rescue
keep going until it hits EOF (or maxxfer bytes have been
transferred).
- -w, --abort_we
- makes dd_rescue abort on any write errors. By default, on reported
write errors, dd_rescue tries to rewrite the blocks with small
block size writes, so a small failure in a larger block will not cause the
whole block not to be written. Note that this may be handled similarly by
your Operating System kernel with buffered writes without the user or
dd_rescue noticing; the write retry logic in dd_rescue is mostly useful
for direct I/O writes where write errors can be reliably detected.
Write error detection with buffered writes is unreliable; the kernel reports
success, and traces of the failing writeback operations may only appear
later in your syslog. dd_rescue does try to notify the user by calling
fsync() and carefully checking the return values of fsync() and close()
calls.
Note that dd_rescue does exit if writes to the output file result in
the Operating System reporting that no space is left.
- -A, --alwayswrite
- changes the behavior of dd_rescue to write zeroes to the output
file when the input file could not be read. By default, it just skips
over, leaving whatever content was in the output file at the file position
before. The default behavior may be desired, if e.g. previous copy
operations may have resulted in good data being in place; it may be
undesired if the output file may contain garbage (or sensitive
information) that should rather be overwritten with zeroes.
- -a, --sparse
- will make dd_rescue look for empty blocks (of at least half of
softbs size), i.e. blocks filled with zeroes. Rather than writing
those zeroes to the output file, it will then skip forward in the output
file, resulting in a sparse file, saving space in the output file system
(if it supports sparse files). Note that if the output file does already
exist and already has data stored at the location where zeroes are skipped
over, this will result in an incomplete copy in that the output file is
different from the input file at the location where blocks of zeroes were
skipped over. dd_rescue tries to detect this and issues a warning,
but it does not prevent this from happening.
- -W, --avoidwrite
- results in dd_rescue reading a block (softbs sized) from
the output file prior to writing it. If it is already identical to the
data that would be written to it, the write is skipped. This
option may be useful for devices where writes should be avoided
(e.g. because they may impact the remaining lifetime or because they are
very slow compared to reads).
- -R, --repeat
- tells dd_rescue to only read one block (softbs sized) and
then repeatedly write it to the output file. Note that this results in
never hitting EOF on the input file and should be used with a limit for
the transfer size (options -m or -M) or when filling up an output device
completely.
This option is automatically set, if the input file name equals
"/dev/zero".
- -u, --rmvtrim
- instructs dd_rescue to remove the output file after writing to it
has completed and issue a FITRIM on the file system that contains the
output file. This only makes sense if writing zeros (or random numbers) as
opposed to useful content from another file. (dd_rescue will ask for
confirmation if this is specified with a normal input file and no -f
(--force) is used.) This option may be used to ensure that all empty
blocks of a file system are filled with zeros (rather than containing
fragments of deleted files with possibly sensitive information).
The FITRIM ioctl (on Linux) tells the flash storage to consider the freed
space as unused (like the fstrim tool or the discard option) by issuing
ATA TRIM commands. This will only succeed with superuser privileges (but
the error can otherwise be safely ignored). This is useful to ensure full
performance of flash memory / SSDs. Note that FITRIM can take a while on
large file systems, especially if the file systems are not mounted with
the discard option and have not been trimmed (with e.g. fstrim) for a
while. Not all file systems and not all flash-based storage support
this.
- -k, --splice
- tells dd_rescue to use the Linux in-kernel zerocopy splice() copy
operation rather than reading blocks into a user space buffer. Note that
this operation mode does prevent the support of a number of
dd_rescue features that can normally be used, such as falling back
to smaller block sizes, avoiding writes, sparse mode, repeat optimization,
reverse direction copy. A warning is issued to make the user aware.
- -P, --fallocate
- results in dd_rescue calling fallocate() on the output file,
telling the file system how much space to preallocate for the output file.
(The size is determined by the expected last position, as inferred from
the input file length and maxxfer.) On file systems that support
it, this results in them making better allocation decisions, avoiding
fragmentation. (Note that it does not make sense to use sparse together
with fallocate().)
This option is only available if dd_rescue is compiled with fallocate()
support. For optimal support, it should be compiled with the libfallocate
library.
- -C rate, --ratecontrol=rate
- limits the transfer speed of dd_rescue to the rate (per
second). The usual suffixes are allowed. Note that this limits the average
speed; the current speed may be up to twice this limit. Default is
unlimited. Note that you will have to use smaller softblocksizes if you
want to go below 32k (kB/s).
- -r, --reverse
- tells dd_rescue to copy in reverse direction, starting at
ipos (with special case 0 meaning EOF) and working towards the
beginning of the file. This is especially helpful if the input file has a
bad spot which can be extremely slow to skip over, so approaching it from
both directions saves a lot of time (and may prevent further damage).
Note that dd_rescue automatically switches to reverse direction
copy if input and output file are identical and the input position is
smaller than the output position, similar to the logic that
memmove() uses to prevent loss of data when overlapping areas are copied.
The option -f / --force disables this automatic switch.
- -p, --preserve
- When copying files, this option causes file metadata (timestamps,
ownership, access rights, xattrs) to be copied, similar to the option with
the same name in the cp program.
Note that ACLs and xattrs will only be copied if dd_rescue has been
compiled with libxattr support and the library can be dynamically loaded
on the system. Also note that failing to copy the attributes with
-p is not considered a failure and thus won't negatively affect the
exit code of dd_rescue.
- -t, --truncate
- tells dd_rescue to open the output file with O_TRUNC, so that
the output file (if it is a regular file) is truncated to 0 bytes
before writing to it, removing all previous content that the file may have
contained. By default, dd_rescue does not remove previous
content.
- -T, --trunclast
- tells dd_rescue to truncate the output file to the highest copied
position after the copy operation completed, thus ensuring there's no data
beyond the end of the data that has been copied in this run.
- -d, --odir_in
- instructs dd_rescue to open infile with O_DIRECT, bypassing
the kernel buffers. While this option has a negative effect on performance
(the kernel does read-ahead for buffered I/O), errors are detected more
quickly (the kernel won't retry) and smaller I/O units are possible
(hardware sector size, 512 bytes for most hard disks).
O_DIRECT may not be available on all platforms.
- -D, --odir_out
- tells dd_rescue to open outfile with O_DIRECT, bypassing
kernel buffers. This has a significant negative effect on performance, as
the program needs to wait for writes to hit the disks as opposed to the
asynchronous nature of buffered writeback. On the flip side, the return
status from writing is reliable this way and smaller I/O chunks (hardware
sector size, 512 bytes) are possible.
- -l logfile, --logfile=logfile
- Unless in quiet mode, dd_rescue continuously writes updates on
the status of the copy operation to stderr. With this option, these
updates are also written to the specified logfile. The control
characters (to move the cursor up to overwrite the existing status lines)
are not written to the logfile.
- -o bbfile, --bbfile=bbfile
- instructs dd_rescue to write a list of bad blocks to bbfile.
The file will contain a list of numbers (ASCII), one per line, where the
numbers indicate the offset in terms of hardbs sized blocks. The
file format is compatible with that of badblocks. Using dd_rescue on a
block device (partition) and setting hardbs to the block size of a
file system that you want to create, you should be able to feed the
bbfile to mke2fs with the option -l.
- -Y ofileX, --outfile=ofileX, --of=ofileX
- If you want to copy data to multiple files simultaneously, you can specify
this option. It can be specified multiple times, so many copies can be
made. Note that these files are secondary output files; they share file
position with the primary output file outfile. Errors when writing
to a secondary output file are ignored.
- -z RANDSEED, --random=RANDSEED
- -Z RANDSEED, --frandom=RANDSEED
- -2 RANDSEED, --shred2=RANDSEED
- -3 RANDSEED, --shred3=RANDSEED
- -4 RANDSEED, --shred4=RANDSEED
- When you want to overwrite a file, partition or disk with random data,
using /dev/urandom (on Linux) as input is not a very good idea; the
interface has not been designed to yield a high bandwidth. It's better to
use a user space Pseudo Random Number Generator (PRNG). With option -z /
--random, the C library's PRNG is used. With -Z / --frandom and the
-2/-3/-4 / --shred2/3/4 options, an RC4 based PRNG is used.
Note that in this mode, there is no infile so the first non-option
argument is the output file.
The PRNG needs seeding; the C library's PRNG takes a 32-bit integer (4
bytes); the RC4 based PRNG takes 256 bytes. If RANDSEED is an
integer, the integer number will be used to seed the C library's PRNG. For
the RC4 method, the C library's PRNG then generates the 256 bytes to seed
it. This creates repeatable PRNG data. The RANDSEED value of 0 is special;
it will create a seed value that's based on the current time and the process'
PID and should be different for multiple runs of dd_rescue.
If RANDSEED is not an integer, it's assumed to be a file name from
which the seed values can be read. dd_rescue will read 4 or 256
bytes from the file to seed the C library's or the RC4 PRNG. For good
pseudo random numbers, using /dev/urandom to seed is a good idea.
The modes -2/-3/-4 resp. --shred2/--shred3/--shred4 will overwrite the
output file multiple times; after each pass, fsync() will ensure that the
data does indeed hit the file. The last pass for these modes will
overwrite the file with zeroes. The rationale behind doing this is to make
it easier to hide that important data may have been overwritten, to make
it easier for intelligent storage systems (such as SSDs) to recycle the
empty blocks and to allow for better compression of a file system image
containing such data.
With -2 / --shred2, one pass with RC4 generated pseudo random numbers is
written, followed by a pass of zeroes. With -3 / --shred3, there are two
passes with RC4 PRNG generated random numbers and a zero pass; the second
PRNG pass writes the inverse (bit-wise reversed) numbers from the first
pass. -4 / --shred4 works like -3 / --shred3, with an additional pass with
independent random numbers as third pass.
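The pass sequences for the shred modes can be summarized in a short Python sketch. The function names are hypothetical, and reading "inverse (bit-wise reversed)" as the bit-wise complement of each byte is an assumption; the text above does not spell out the exact operation.

```python
def shred_passes(mode):
    """Return the ordered pass descriptions for -2, -3 or -4 as described above."""
    plans = {
        2: ["random", "zero"],
        3: ["random", "inverse of pass 1", "zero"],
        4: ["random", "inverse of pass 1", "independent random", "zero"],
    }
    return plans[mode]

def invert(block):
    """Bit-wise complement of a block (ASSUMED meaning of 'inverse')."""
    return bytes(b ^ 0xFF for b in block)
```

In every mode the last pass is the zero pass, and each pass is followed by fsync() according to the description above.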
Since version 1.42, dd_rescue has an interface for plugins. Plugins have
the ability to analyze the copied data or to transform it prior to it being
written.
- -L plugin1[=param1[:param2[:..]]][,plugin2[=..][,..]]
- --plugins=plugin1[=param1[:param2[:..]]][,plugin2[=..][,..]]
- loads plugins plugin1 ... and passes parameters to them. All plugins should
support at least the help parameter and provide information on their
usage.
Plugins may impose limits on dd_rescue. Plugins that look at the data can't
work with splice, as this avoids copying data to user space. Also the
interface currently does not facilitate reverse direction copy. Some
plugins may impose further restrictions w.r.t. alignment of data in the
file or not using sparse detection.
See section PLUGINS for an overview of available plugins.
The null plugin (ddr_null) does nothing, except if you specify the
[no]lnchange or the [no]change options in which case the plugin
indicates to others that it transforms the length of the output or the data of
the stream. (With the no prefix, it's reset to the default no-change
indication again.) This may be helpful for testing or to influence which file
the hash plugin considers for reading/writing extended attributes from/to and
for plugins to change their behavior with respect to hole detection.
ddr_null_ddr also allows you to specify debug in which case it just
reports the blocks that it passes on.
When the hash plugin (subsequently referred to as ddr_hash) is loaded, it will
calculate a cryptographic hash and optionally also a HMAC over the copied data
and print the result at the end of the copy operations. The hash algorithm can
be chosen by specifying alg[o[rithm]]=ALG where ALG is one of md5,
sha1, sha256, sha224, sha512, sha384. (Specify alg=help to get a list.) To
abbreviate the syntax, the alg= piece can be omitted.
For backwards compatibility, the hash plugin can also be referred to with the
old MD5 name; it then defaults to the md5 algorithm.
The computed value should be identical to calling md5sum/sha256sum/... on the
target file (unless you only write part of the file), but saves time by not
accessing the (possibly large) file a second time. The hash plugin handles
sparse writes and arbitrary offsets fine.
multipart=CHUNKSIZE tells ddr_hash to calculate multiple
checksums for file chunks of CHUNKSIZE each and then combine them into a
combined checksum by creating a checksum over the piece checksums. This is
how the checksum for S3 multipart objects is calculated (using the md5
hash); the output there is the combination checksum with a dash and the
number of parts appended.
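The multipart scheme can be sketched in Python with hashlib. This follows the way S3 multipart checksums are commonly computed, where the binary per-part digests are concatenated and hashed again; whether ddr_hash combines binary or hex digests is an assumption here, and multipart_checksum is a hypothetical helper name.

```python
import hashlib

def multipart_checksum(data, chunksize, alg="md5"):
    """Hash each chunk, then hash the concatenated chunk digests."""
    digests = [
        hashlib.new(alg, data[i:i + chunksize]).digest()
        for i in range(0, len(data), chunksize)
    ]
    combined = hashlib.new(alg, b"".join(digests)).hexdigest()
    # Combined checksum, a dash, and the number of parts.
    return f"{combined}-{len(digests)}"
```

The sketch keeps the whole input in memory for brevity; a streaming version would feed each chunk to the hash as it is copied.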
Note that this feature is new in 1.99.6 and does not yet handle situations
cleanly, where offsets plus block sizes do not happen to cleanly align with
the CHUNKSIZE. The implementation for this will be completed later. Other
features like the append/prepend/hmac pieces also don't work well with
multipart checksum calculation.
ddr_hash also supports the parameter append=STRING which
appends the given STRING to the output before computing the cryptographic
hash. Treating the STRING as a shared secret, this can actually be used to
protect against someone not knowing the secret altering the contents (and
recomputing the hash) without anyone noticing. It's thus a cheap form of
cryptographic signature (but with preshared secrets as opposed to public key
cryptography). Use HMAC for a somewhat better way to sign data with a shared
secret.
ddr_hash also supports prepend=STRING which is likely harder to attack
with brute force than an appended string. Note that ddr_hash always prepends
multiples of the hash algorithm's block size and pads the STRING with 0 to
match.
ddr_hash can be used to compute a HMAC (Hash-based Message
Authentication Code) instead of the plain hash. The HMAC derives two
differently padded keys from the password: the inner key is prepended to
the data and hashed, then the outer key is prepended to that result and
hashed again. HMAC is believed to protect somewhat better against extension or
collision attacks than a plain hash (with a plain prepended secret), so it's
a better way to authenticate data with a shared secret. (You can use
append/prepend in addition to HMAC, if you have a need for a scheme with
more than one secret.)
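The double hashing behind HMAC can be shown with a minimal Python sketch following the standard HMAC construction (RFC 2104). The function name hmac_by_hand is hypothetical, and this illustrates the general algorithm rather than ddr_hash's own code.

```python
import hashlib
import hmac

def hmac_by_hand(data, password, alg="sha256"):
    """Compute an HMAC step by step to show the two hash passes."""
    block = hashlib.new(alg).block_size
    # Keys longer than one block are hashed first, then zero-padded.
    key = password if len(password) <= block else hashlib.new(alg, password).digest()
    key = key.ljust(block, b"\0")
    # First pass: inner-padded key prepended to the data.
    inner = hashlib.new(alg, bytes(k ^ 0x36 for k in key) + data).digest()
    # Second pass: outer-padded key prepended to the inner digest.
    return hashlib.new(alg, bytes(k ^ 0x5C for k in key) + inner).hexdigest()

def hmac_stdlib(data, password, alg="sha256"):
    """The same value via Python's standard hmac module."""
    return hmac.new(password, data, alg).hexdigest()
```

Both functions should return the same hex string for the same inputs, and that string differs from the plain hash of the data.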
When HMAC is enabled with one of the following parameters, both the plain hash
and the HMAC are computed by ddr_hash. Both are output to the console/log,
but the HMAC is used instead of the hash value to be written to a CHECKSUMS
file or to an extended attribute or checked against (see below).
hmacpwd=STRING sets the shared secret (password) for computing the
HMAC. Passing the secret on the command line has the disadvantage that the
shell may mistreat some bytes as special characters and that the command
line may be visible to all logged in users on the system.
hmacpwdfd=INT sets a file descriptor from which the secret (password)
for HMAC computation will be read. Specifying 0 means standard input, in
which case ddr_hash even prints a prompt for you. Other numbers may be
useful if dd_rescue is called from another program that opens a pipe to pass
the secret.
hmacpwdnm=INNAME sets a file from which the shared secret
(password) is read. Note that all bytes (up to 2048 of them) are read and
used, including trailing white space, 0-bytes or newlines.
Please note that the ddr_hash plugin at this point does NOT take a lot of care
to prevent the password/pre/appended secret from remaining in memory or
leaking into a swap/page file. (This will be improved once I look into
encryption plugins.)
ddr_hash accepts the parameter output, which will cause
ddr_hash to output the cryptographic hash to stdout in the same format that
md5sum/sha256sum/... use. You can also specify outfd=INT to have the
plugin write the hash to a different file descriptor specified by the
integer number INT. Note that ddr_hash always processes data in binary mode
and correctly indicates this with a star (*) in the output generated with
output/outfd=.
The checksum can also be written to a file by giving the outnm=OUTNAME
parameter. Then a file with OUTNAME will be created and a
md5sum/sha256sum/... compatible line will be printed to the file. If the
file exists and contains an entry for the file, it will be updated. If the
file exists and does not contain an entry for the file, one will be
appended. If OUTNAME is omitted, the file name CHECKSUMS.alg (or HMACS.alg
if HMAC is enabled) will be used (alg is replaced by the chosen algorithm).
If the checksum can't be written, a warning will be printed and the exit
code of dd_rescue will become non-zero.
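The update-or-append behavior for a checksum file can be sketched as follows. This is a simplified Python illustration, not the plugin's implementation: update_checksums is a hypothetical name, and the sketch assumes file names that do not start with a space or a star.

```python
def update_checksums(path, filename, digest):
    """Replace the entry for filename in an md5sum-style file, or append one."""
    entry = f"{digest} *{filename}"  # the star marks binary mode
    try:
        with open(path) as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []  # the checksum file will be created
    for i, line in enumerate(lines):
        # An entry is "<hash> *<name>" (binary) or "<hash>  <name>" (text).
        name = line.split(" ", 1)[-1].lstrip(" *")
        if name == filename:
            lines[i] = entry
            break
    else:
        lines.append(entry)
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
```

A production version would write to a temporary file and rename it, so that a crash cannot leave a half-written checksum file behind.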
The checksum can be validated using chknm=CHKNAME. The
file will be read and ddr_hash will look for an md5sum/sha256sum/...
compatible line with a matching file name to take the checksum from and
compare it to the one computed. If CHKNAME is omitted, the same default as
described above (in outnm=...) will be used. You can also read the checksum
from stdin if you prefer by specifying the check option.
Note that in any case, the check is only performed after the copy operation is
completed -- a faulty checksum will thus NOT result in the copy not taking
place. However, the exit code of dd_rescue will indicate the error. (If you
want to avoid copying data with a broken checksum into the final target, use
a temporary target that you delete upon error and only move to the final
location if dd_rescue's exit value is 0; you can of course also copy to
/dev/null for testing beforehand, but it might be too costly reading the
input file twice.)
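The temporary-target approach suggested above could be wrapped in a small Python helper. The name copy_then_promote is hypothetical; the command list is supplied by the caller and is assumed to write its output to tmp_path.

```python
import os
import subprocess

def copy_then_promote(command, tmp_path, final_path):
    """Run command; keep its output only if the exit code is 0."""
    result = subprocess.run(command)
    if result.returncode == 0:
        # Checksum matched and the copy succeeded: move into place.
        os.replace(tmp_path, final_path)
        return True
    # Any non-zero exit code: discard the temporary target.
    if os.path.exists(tmp_path):
        os.remove(tmp_path)
    return False
```

The command would typically be a dd_rescue invocation with the hash plugin and a chknm or chk_xattr parameter, with tmp_path as the output file. Keeping tmp_path on the same file system as final_path makes the final move a cheap rename.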
If in addition to chknm (or chk_xattr) the option chkadd
is specified, then a missing checksum will not be reported as an error;
instead, an entry will be added to the checksum file (or xattr). A mismatch
will still be reported as an error and the checksum file will not be updated.
You can store the cryptographic hash into the files by using the
set_xattr option. The hash will be stored into the extended attribute
user.checksum.ALG by default (user.hmac.ALG if HMAC is enabled), but you can
override the name of the attribute by specifying set_xattr=XATTR.NAME
instead. If the xattr can't be written, an error will be reported, unless
you also specify the fallb[ack][=CHKNAME] option. In that case,
ddr_hash tries to write the checksum to the CHKNAME checksums file. (For the
default for CHKNAME, see outnm= option above.)
chk_xattr will validate that the computed hash matches the one read
from the extended attribute. The same default attribute name applies and you
can likewise override it with chk_xattr=XATTR.NAME. A missing
attribute is considered an error (although the same fallback is tried if you
specify the fallback option). A broken checksum is of course considered an
error as well, but just like with chknm=CHKNAME won't prevent the copy.
See the discussion there.
Note that for output, outfd=, outnm=, set_xattr ddr_hash will use the
output file name to attach the checksum to (be it by setting xattr or the
file name used in the checksum file), unless a plugin in the chain after
ddr_hash indicates that it changes the data. In that case, it will warn and
associate the checksum with the input file name, unless there's another
plugin before ddr_hash in the chain which indicates data transformation as
well. In that case, there is no file that the checksum could be associated
with and ddr_hash will report an error.
Likewise for chknm=, check, chk_xattr ddr_hash will use the input file name to
get the checksum (be it by reading the xattr or by looking for the input
file name in a checksums file) unless there's a plugin in the chain before
ddr_hash that indicates that it changes the data. The output file name will
then be used, unless there's another plugin after ddr_hash indicating data
change as well, in which case there's no file we could get the checksum for
and thus an error is reported.
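The file name selection rules in the two paragraphs above can be condensed into a small decision function. This Python sketch only restates the described logic; the function name, the kind values 'store' and 'check', and the two boolean flags are hypothetical.

```python
def checksum_target(kind, infile, outfile, change_before, change_after):
    """Pick the file name a checksum is associated with.

    kind is 'store' (output, outfd=, outnm=, set_xattr) or
    'check' (chknm=, check, chk_xattr). change_before / change_after say
    whether a plugin before / after ddr_hash in the chain changes the data.
    """
    if kind == "store":
        if not change_after:
            return outfile      # default: attach to the output file
        if not change_before:
            return infile       # data changes later: use the input file
    elif kind == "check":
        if not change_before:
            return infile       # default: look up the input file
        if not change_after:
            return outfile      # data changed earlier: use the output file
    else:
        raise ValueError("kind must be 'store' or 'check'")
    raise LookupError("no file name can be associated with the checksum")
```

For a chain of the form hash=...:set_xattr,null=change, this gives the input file, since the null plugin after ddr_hash signals a data change.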
If your system supports extended attributes, those have the
advantage of traveling with the files; thus a rename or copy (with dd_rescue
-p) will maintain the checksum. Checksum files on the other hand can be
handled everywhere (including the transfer via ftp or http) and can be
cryptographically signed with PGP/GnuPG.
Please note that the md5 algorithm is NOT recommended any more for
good protection against malicious attempts to hide data modification; it's
not considered strong enough any more to prevent hash collisions. sha1 is a
bit better, but has been broken as well as of 2017. The recommendation is to
use the SHA-2 family of hashes. On 32bit machines, I'd recommend sha256,
while on 64bit machines, sha512 is faster and thus the best choice.
ddr_hash also supports using the HMAC code and hashes for deriving
keys from passwords using the PKCS5 PBKDF2 (password-based key derivation
function) that allows you to improve the protection from mediocre passwords
by using a salt and a relatively expensive key stretching operation. This is
only meant for testing and may be removed in the future. It's thus not
documented in this man page. See the built-in help function for a brief
summary on the usage.
The lzo plugin allows you to compress and decompress data using liblzo2. lzo is an
algorithm that is faster than most other algorithms but does not compress as
well. See the ddr_lzo(1) man page for more details.
The crypt plugin allows you to encrypt and decrypt data on the fly. It currently
supports a variety of AES ciphers. See the ddr_crypt(1) man page for
more details.
On successful completion, dd_rescue returns an exit code of 0. Any other
exit code indicates that the program has aborted because of an error condition
or that copying of the data has not been entirely successful.
- dd_rescue -k -P -p -t infile outfile
- copies infile to outfile, truncates the output file
on opening (deleting any previous data in it), copies mode, times and
ownership at the end, uses fallocate to reserve the space for the output
file and uses the efficient in-kernel splice copy method.
- dd_rescue -A -d -D -b 512 /dev/sda /dev/sda
- reads the contents of every sector of disk sda and writes it back to the
same location. Typical hard disks reallocate flaky and faulty sectors on
writes, so this operation may result in the complete disk being usable
again when there were errors before. Unreadable blocks however will
contain zeroes after this.
- dd_rescue -2 /dev/urandom -M outfile
- overwrites the file outfile twice; once with good pseudo random
numbers and then with zeroes.
- dd_rescue -t -a image1.raw image2.raw
- copies a file system image and looks for empty blocks to create a sparse
output file to save disk space. (If the source file system has been used a
bit, creating a large file with zeroes on that file system and removing it
again prior to this operation will result in more sectors with zeroes.
dd_rescue -u /dev/zero DUMMY will achieve this.)
- dd_rescue -ATL hash=md5:output,lzo=compress:bench,MD5:output in out.lzo
- copies the file in to out.lzo using lzo (lzo1x_1)
compression and calculates an md5 hash (checksum) on both files. The md5
hashes for both are also written to stdout in the md5sum output format.
Note that the compress parameter to lzo is not strictly required here; the
plugin could have deduced it from the file names. This example shows that
you can specify multiple plugins with multiple parameters; the plugins
form a filter chain. You can specify the same plugin multiple
times.
- dd_rescue -L hash=sha512:set_xattr:fallb,null=change infile /dev/null
- reads the file infile and computes its sha512 hash. It stores it in
the input file's user.checksum.sha512 attribute (and falls back to writing
it to CHECKSUMS.sha512 if xattrs can't be written). Note the use of the
null plugin with faking data change with the change parameter; this causes
the hash plugin to write to the input file which it would not normally
have done. Of course this will fail if you don't have the appropriate
privileges to write xattrs to infile nor to write the checksum to
CHECKSUMS.sha512.
See also README.dd_rescue and ddr_lzo(1) to learn about the
possibilities.
Untested code is buggy, almost always. I happen to have a damaged hard disk that
I use for testing dd_rescue from time to time. But to allow for automated
testing of error recovery, it's better to have predictable failures for the
program to deal with. So there is a fault injection framework.
Specifying -F 5w/1,17r/3,42r/-1,80-84r/0 on the command line has the
following effect: the 5th block (counted in hardbs sized blocks) will fail to
be written once (from which dd_rescue should recover, as it tries a second
time for failed writes), block no 17 will fail to be read 3 times, block no 42
will read fine once, but then fail afterwards, and blocks 80 through 83 are
completely unreadable (will fail infinite times). Note that the range excludes
the last block (80-84 means 4 blocks starting at 80).
Block offsets are always counted in absolute positions, so starting in the
middle of a file with -s or reverse copying won't affect the absolute position
that is hit with the fault injection. (This has changed since 1.98.)
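To make the syntax of the fault specification concrete, here is a small
Python sketch that splits such a string into its parts. It is not dd_rescue's
own parser; the names Fault, parse_faults and block_of are hypothetical, and
the number after the slash is kept as-is rather than interpreted. The only
rules it encodes are the ones stated above: the range end is exclusive, and
blocks are counted in hardblocksize from the absolute start of the file.

```python
from typing import List, NamedTuple


class Fault(NamedTuple):
    first: int   # first affected block, in units of hardblocksize
    end: int     # one past the last affected block (range end is exclusive)
    op: str      # 'r' for read faults, 'w' for write faults
    count: int   # number after '/', stored uninterpreted


def parse_faults(spec: str) -> List[Fault]:
    """Split a spec such as '5w/1,80-84r/0' into Fault entries."""
    faults = []
    for item in spec.split(","):
        where, count = item.split("/")
        op = where[-1]
        if op not in ("r", "w"):
            raise ValueError(f"bad operation in {item!r}")
        blocks = where[:-1]
        if "-" in blocks:
            first, end = (int(x) for x in blocks.split("-"))
        else:
            first = int(blocks)
            end = first + 1
        faults.append(Fault(first, end, op, int(count)))
    return faults


def block_of(offset: int, hardbs: int) -> int:
    """Absolute block number of a byte offset, independent of -s or direction."""
    return offset // hardbs


faults = parse_faults("5w/1,17r/3,42r/-1,80-84r/0")
```

With this, the last entry covers range(80, 84), that is blocks 80, 81, 82
and 83, matching the description above.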
The source code does use the 64bit functions provided by glibc for file
positioning. However, your kernel might not support it, so you might be unable
to copy partitions larger than 2GB into a file.
This program has been written using Linux and only tested on a couple of Linux
systems. People have reported using it successfully on other Un*xish
systems (such as xBSD or M*cOS), but these systems get little regular test
coverage; so please test properly (possibly using the make check
test suite included with the source distribution) before relying on dd_rescue
on non-Linux systems.
Currently, the escape sequence for moving the cursor up is hard coded in the
sources. It's fine for most terminal emulations (including vt100 and linux),
but it should use the terminal description database instead.
Since dd_rescue-1.10, non-seekable input or output files are supported, but
there are of course limits to error recovery in such cases.
dd_rescue does not automate the recovery of faulty files or
partitions by automatically keeping a list of copied sectors and approaching
bad spots from both sides. There is a helper script dd_rhelp from LAB
Valentin that does this. Integration of such a mode into dd_rescue
itself is non-trivial and due to the complexity of the source code might not
happen.
There also is a tool, GNU ddrescue, that is a reimplementation of this tool
and which contains the capabilities to automate recovery of bad files in the
way dd_rhelp does. It does not have the feature richness of dd_rescue, but
is reported to be easier to operate for error recovery than dd_rescue with
dd_rhelp.
If your data is very valuable and you are considering sending your
disk to a data recovery company, you might be better off NOT trying to use
imaging tools like dd_rescue, dd_rhelp or GNU ddrescue. If you're unlucky,
the disk has suffered some mechanical damage (e.g. by having been dropped),
and continuing to use it may make the head damage the surface further. You
may be able to detect this condition by quickly rising error counts in the
SMART attributes or by a clicking noise.
Please report bugs to me via email.
The modes for overwriting data with pseudo random numbers to securely delete
sensitive data on purpose only implement a limited number of overwrites. While
Peter Gutmann's classic analysis concludes that the then-current hard disk
technology requires more overwrites to be really secure, the author believes
that modern hard disk technology does not allow data restoration of sectors
that have been overwritten with the --shred4 mode. This is in compliance with
the recommendations from BSI GSDS M7.15.
Overwriting whole partitions or disks with random numbers is a fairly safe way
to destroy data, unless the underlying storage device does too much magic.
SSDs are doing fancy stuff in their Flash Translation Layer (FTL), so this
tool might be insufficient to get rid of data. Use SECURITY_ERASE (via hdparm)
there or -- if available -- encrypt data with AES256 and safely destroy the
key. Normal hard disks have a small risk of leaking a few sectors due to
reallocation of flaky sectors.
For securely destroying single files, your mileage may vary. The more advanced
your file system, the less likely dd_rescue's destruction will be effective.
In particular, journaling file systems may carry old data in the journal. File
systems that do copy-on-write (COW), such as btrfs, are very likely to have old
copies of your supposedly erased file. It might help somewhat to fill the file
system with zeros (dd_rescue -u /dev/zero /path/to/fs/DUMMYNAME) to force the
file system to release and overwrite non-current data after overwriting
critical files with random numbers. If you can, better destroy a whole
partition or disk.
README.dd_rescue README.dd_rhelp ddr_lzo(1)
wipe(1) shred(1) ddrescue(1) dd(1)
Kurt Garloff <kurt@garloff.de>
Many little issues were reported by Valentin LAB, the author of dd_rhelp.
The RC4 PRNG (frandom) is a port from Eli Billauer's kernel mode PRNG.
A number of recent ideas and suggestions came from Thomas.
This program is protected by the GNU General Public License (GPL) v2 or v3 - at
your option.
Since version 1.10, non seekable input and output files are supported.
Splice copy -k is supported since 1.15.
A progress bar exists since 1.17.
Support for preallocation (fallocate) -P exists since 1.19.
Since 1.23, we default to -y0, enhancing performance.
The Pseudo Random Number modes have been started with 1.29.
Write avoidance -W has been implemented in 1.30.
Multiple output files -Y have been added in 1.32.
Long options and man page came with 1.33.
Optimized sparse detection (SSE2, armv6, armv8 asm, AVX2) has been present since
1.35 and been enhanced until 1.43.
We support copying extended attributes since 1.40 using libxattr.
Removing and (fs)trimming the output file's file system exists since 1.41.
Support for compilation with bionic (Android's C library) with most features
enabled also came with 1.41.
Plugins exist since 1.42, the MD5 plugin came with 1.42, the lzo plugin with
1.43. 1.44 renamed the MD5 plugin to hash and added support for the SHA-2
family of hashes. 1.45 added SHA-1 and the ability to store and validate
checksums.
1.98 brought encryption and the fault injection framework, 1.99 support for
ARMv8 crypto acceleration. 1.99.5 brought ratecontrol. 1.99.6 brought S3 style
multipart checksums.
Some additional information can be found on
http://garloff.de/kurt/linux/ddrescue/
LAB Valentin's dd_rhelp can be found on
http://www.kalysto.org/utilities/dd_rhelp/index.en.html