JDUPES(1)                 FreeBSD General Commands Manual                 JDUPES(1)
jdupes - finds and performs actions upon duplicate files
jdupes [ options ] DIRECTORIES ...
Searches the given path(s) for duplicate files. Such files are found by
comparing file sizes, then partial and full file hashes, followed by a
byte-by-byte comparison. The default behavior with no other "action
options" specified (delete, summarize, link, dedupe, etc.) is to print
sets of matching files.
- -@ --loud
- output annoying low-level debug info while running
- -0 --printnull
- when printing matches, use null bytes instead of CR/LF bytes, just like
'find -print0' does. This has no effect with any action mode other than
the default "print matches" (delete, link, etc. will still print
normal line endings in the output).
- -1 --one-file-system
- do not match files that are on different filesystems or devices
- -A --nohidden
- exclude hidden files from consideration
- -B --dedupe
- issue the btrfs same-extents ioctl to trigger a deduplication on disk. The
program must be built with btrfs support for this option to be
available
- -C --chunksize=BYTES
- set the I/O chunk size manually; larger values may improve performance on
rotating media by reducing the number of head seeks required, but they also
increase memory usage and can reduce performance in some cases
- -D --debug
- if this feature is compiled in, show debugging statistics and info at the
end of program execution
- -d --delete
- prompt user for files to preserve, deleting all others (see CAVEATS
below)
- -f --omitfirst
- omit the first file in each set of matches
- -H --hardlinks
- normally, when two or more files point to the same disk area they are
treated as non-duplicates; this option will change this behavior
- -h --help
- displays help
- -i --reverse
- reverse (invert) the sort order of matches
- -I --isolate
- isolate each command-line parameter from one another; only match if the
files are under different parameter specifications
- -L --linkhard
- replace all duplicate files with hardlinks to the first file in each set
of duplicates
- -m --summarize
- summarize duplicate file information
- -M --printwithsummary
- print matches and summarize the duplicate file information at the end
- -N --noprompt
- when used together with --delete, preserve the first file in each set of
duplicates and delete the others without prompting the user
- -n --noempty
- exclude zero-length files from consideration; this option is the default
behavior and does nothing (also see -z/--zeromatch)
- -O --paramorder
- parameter order preservation is more important than the chosen sort; this
is particularly useful with the -N option to ensure that automatic
deletion behaves in a controllable way
- -o --order=WORD
- order files according to WORD: 'time' - sort by modification time; 'name' -
sort by filename (default)
- -p --permissions
- don't consider files with different owner/group or permission bits as
duplicates
- -P --print=type
- print extra information to stdout; valid types are: 'early' - matches that
pass early size/permission/link/etc. checks; 'partial' - files whose partial
hashes match; 'fullhash' - files whose full hashes match
- -Q --quick
- [WARNING: RISK OF DATA LOSS, SEE CAVEATS] skip byte-for-byte
verification of duplicate pairs (use hashes only)
- -q --quiet
- hide progress indicator
- -R --recurse:
- for each directory given after this option, follow subdirectories
encountered within (note the ':' at the end of the option; see the Examples
section below for further explanation)
- -r --recurse
- for every directory given, follow subdirectories encountered within
- -l --linksoft
- replace all duplicate files with symlinks to the first file in each set of
duplicates
- -S --size
- show size of duplicate files
- -s --symlinks
- follow symlinked directories
- -T --partial-only
- [WARNING: EXTREME RISK OF DATA LOSS, SEE CAVEATS] match based on
hash of first block of file data, ignoring the rest
- -u --printunique
- print only a list of unique (non-duplicate, unmatched) files
- -v --version
- display jdupes version and compilation feature flags
- -X --extfilter=spec:info
- exclude/filter files based on specified criteria; general format:
jdupes -X filter[:value][size_suffix]
Some filters take no value; others can take multiple values. Filters that
accept a numeric value generally support the size multipliers
K/M/G/T/P/E, with or without an added iB or B. Multipliers are
binary-style unless the 'B' suffix is used, which selects decimal
multipliers. For example, 16k or 16kib = 16384; 16kb = 16000.
Multipliers are case-insensitive.
Filters have cumulative effects: jdupes -X size+:99 -X
size-:101 will cause only files of exactly 100 bytes in size to be
included.
Extension matching is case-insensitive. Path substring
matching is case-sensitive.
Supported filters are:
- `size[+-=]:number[suffix]`
- match only if the file size is greater than (+), less than (-), or equal to (=) the
specified number. The +/- and = specifiers can be combined, e.g.
"size+=:4K" will only consider files with a size greater than or
equal to four kilobytes (4096 bytes).
- `noext:ext1[,ext2,...]`
- exclude files with the listed extension(s), specified as a comma-separated
list. Do not use a leading dot.
- `onlyext:ext1[,ext2,...]`
- only include files with the listed extension(s), specified as a
comma-separated list. Do not use a leading dot.
- `nostr:text_string`
- exclude all paths containing the substring text_string. This scans the
full file path, so it can be used to match directories: -X
nostr:dir_name/
- `onlystr:text_string`
- require all paths to contain the substring text_string. This scans the
full file path, so it can be used to match directories: -X
onlystr:dir_name/
- `newer:datetime`
- only include files newer than the specified date. Date/time format:
"YYYY-MM-DD HH:MM:SS" (time is optional).
- `older:datetime`
- only include files older than the specified date. Date/time format:
"YYYY-MM-DD HH:MM:SS" (time is optional).
- -z --zeromatch
- consider zero-length files to be duplicates; this replaces the old default
behavior when -n was not specified
- -Z --softabort
- if the user aborts the program (as with CTRL-C), act on the matches that
were found before the abort was received. For example, if -L and -Z are
specified, all matches found prior to the abort will be hard linked. The
default behavior without -Z is to abort without taking any actions.
A set of arrows is used in the hard and soft linking output to show what action
was taken on each link candidate. These arrows are as follows:
- ---->
- This file was successfully hard linked to the first file in the duplicate
chain
- -@@->
- This file was successfully symlinked to the first file in the chain
- -==->
- This file was already a hard link to the first file in the chain
- -//->
- Linking this file failed due to an error during the linking process
Duplicate files are listed together in groups with each file
displayed on a separate line. The groups are then separated from each other
by blank lines.
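As an illustrative sketch (the file paths are placeholders), a default run
might print two match sets like this:
photos/IMG_0412.jpg
backup/photos/IMG_0412.jpg

docs/report.pdf
docs/old/report.pdf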
- jdupes a --recurse: b
- will follow subdirectories under b, but not those under a.
- jdupes a --recurse b
- will follow subdirectories under both a and b.
- jdupes -O dir1 dir3 dir2
- will always place 'dir1' results first in any match set (where relevant)
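Further illustrative combinations (directory names are placeholders):
- jdupes -r -L photos backup
- will recursively scan both directories and replace each duplicate found with a hard link to the first file in its match set
- jdupes -r -m music
- will recursively scan 'music' and print only a summary of the duplicate file information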
Using -1 or --one-file-system prevents matches that cross
filesystems, but a more relaxed form of this option may be added that allows
cross-matching for all filesystems that each parameter is present on.
When using -d or --delete, care should be taken to
guard against accidental data loss.
-Z or --softabort used to be --hardabort in jdupes
prior to v1.5 and had the opposite behavior. Defaulting to taking action on
abort is probably not what most users would expect. The decision to invert
rather than reassign to a different option was made because this feature was
still fairly new at the time of the change.
The -O or --paramorder option allows the user
greater control over what appears in the first position of a match set,
specifically for keeping the -N option from deleting all but one file
in a set in a seemingly random way. All directories specified on the command
line will be used as the sorting order of result sets first, followed by the
sorting algorithm set by the -o or --order option. This means
that the order of all match pairs for a single directory specification will
retain the old sorting behavior even if this option is specified.
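As an illustrative sketch (directory names are placeholders), jdupes -r -O -N
-d master extra will scan both trees and, for every duplicate set that spans
both parameters, preserve the copy found under 'master' and delete the others
without prompting.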
When used together with the -s or --symlinks option, a
user could accidentally preserve a symlink while deleting the file it points
to.
The -Q or --quick option only reads each file once,
hashes it, and performs comparisons based solely on the hashes. There is a
small but significant risk of a hash collision, which is the reason for the
failsafe byte-for-byte comparison that this option explicitly bypasses. Do
not use it on ANY data set for which any amount of data loss is
unacceptable. This option is not included in the help text for the program
due to its risky nature. You have been warned!
The -T or --partial-only option produces results
based on a hash of the first block of file data in each file, ignoring
everything else in the file. Partial hash checks have always been an
important exclusion step in the jdupes algorithm, usually hashing the first
4096 bytes of data and allowing files that are different at the start to be
rejected early. In certain scenarios it may be a useful heuristic for a user
to see that a set of files has the same size and the same starting data,
even if the remaining data does not match. One example would be comparing
files whose data blocks are damaged or missing, such as an incomplete file
transfer, or checking a data recovery against known-good copies to see which
damaged files can be deleted in favor of restoring the known-good copy. This
option is meant to be used with informational actions
and can result in EXTREME DATA LOSS if used with options that delete
files, create hard links, or perform other destructive actions on data based
on the matching output. Because of the potential for massive data
destruction, this option MUST BE SPECIFIED TWICE to take effect and
will error out if it is only specified once.
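As an illustrative, information-only sketch (paths are placeholders), the
option must appear twice on the command line: jdupes -r -T -T recovered
known_good will print sets of files that merely share the same size and the
same starting block of data; no delete, link, or other destructive option
should be combined with it.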
Using the -C or --chunksize option to override I/O
chunk size can increase performance on rotating storage media by reducing
"head thrashing," reading larger amounts of data sequentially from
each file. This tunable size can have bad side effects; the default size
maximizes algorithmic performance without regard to the I/O characteristics
of any given device and uses a modest amount of memory, but other values may
greatly increase memory usage or incur a lot more system call overhead. Try
several different values to see how they affect performance for your
hardware and data set. This option does not affect match results in any way,
so even if it slows down the file matching process it will not hurt
anything.
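For example (the values are illustrative, not recommendations): jdupes -r
--chunksize=1048576 archive reads file data in larger sequential chunks,
which may help on a busy rotating disk at the cost of additional memory;
comparing runs with a few different values such as 262144 and 4194304 on the
same data set is the most reliable way to choose one.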
Send bug reports to jody@jodybruchon.com or use the issue tracker at:
http://github.com/jbruchon/jdupes/issues
If you find this program useful, please consider financially supporting its
continued development by visiting the following URL:
https://www.subscribestar.com/JodyBruchon
jdupes is created and maintained by Jody Bruchon <jody@jodybruchon.com>
and was forked from fdupes 1.51 by Adrian Lopez <adrian2@caribe.net>
The MIT License (MIT)
Copyright (C) 2015-2020 Jody Lee Bruchon and contributors Forked
from fdupes 1.51, Copyright (C) 1999-2014 Adrian Lopez and contributors
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to the
following conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
USE OR OTHER DEALINGS IN THE SOFTWARE.