NAME
ndiff - compare putatively similar files, ignoring small numeric differences

SYNOPSIS
ndiff [ -? ] [ -abserr abserr ] [ -author ] [ -copyright ]
      [ -fields n1a-n1b,n2,n3a-n3b,... ] [ -help ] [ -logfile filename ]
      [ -minwidth nnn ] [ -outfile filename ] [ -precision number-of-bits ]
      [ -quick ] [ -quiet ] [ -relerr relerr ] [ -separators regexp ]
      [ -silent ] [ -version ] [ -www ] infile1 infile2

DESCRIPTION
When a numerical program is run in multiple environments (operating systems, architectures, or compilers), assessing its consistency can be a difficult task for a human, since small differences in numerical output values are expected.

Application of a file differencing utility, such as POSIX/UNIX diff(1), will generally produce voluminous output, often longer than the original files, and is thus not useful. The lesser-known UNIX spiff(1) utility, while capable of handling numeric fields, suffers from excessively long running times, and often terminates prematurely.

ndiff provides a solution to this problem. It compares two files that are expected to be identical, or at least, numerically similar. It assumes that lines consist of whitespace-separated fields of numeric and non-numeric data.

A hyphen (minus sign) can be used in place of either input filename to represent stdin, allowing one input stream to come from a UNIX pipe. This is a common, but by no means universal, idiom in UNIX software as a workaround for the regrettable lack of standard names for the default stdin and stdout streams. On some, but not all, UNIX systems, stdin can be named explicitly as /dev/stdin or /dev/fd/0.

The default field separator characters can be modified with the -separators regexp command-line option, so that ndiff can also handle files with, e.g., parenthesized complex numbers, and comma-separated numbers from Fortran list-directed output.
However, because line breaking and the use of repeat counts in Fortran list-directed output are implementation dependent, such files are not really suitable for cross-implementation file comparisons, unless the lists are kept short enough to fit on a single line.

ndiff expects the files to contain the same number of lines; otherwise, a diagnostic will be issued. Unlike diff(1), this program cannot handle inserted or deleted lines. Also unlike diff(1) (unless diff's -b and -w options are used), whitespace is not significant for ndiff, except that it normally separates fields.

Lines that differ in at least one field (as determined by the absolute and/or relative tolerances, for numeric values, or string comparisons otherwise) are reported on stdout in a diff(1)-style listing of the form

    nnncnnn
    < line from infile1
    --- field n    absolute error x.xxe-xx    relative error x.xxe-xx [nn*(machine epsilon)]
    > line from infile2

The first of these lines shows the line number twice, separated by the letter c (for change). The second and fourth lines begin with a two-character identifying prefix. The third, separator, line shows the field number at which the difference was found; fields beyond that one may also differ, but have not been checked. If the differing field is numeric, then the errors found are also shown on that line. If the relative error is not too big, its value is also shown as a multiple of the machine epsilon.

ndiff recognizes the following patterns as valid numbers. In the patterns, # is a string of one or more decimal digits, optionally separated by a nonsignificant underscore (as in the Ada programming language), s is an optional + or - sign, and X is an exponent letter, one of D, d, E, e, Q, or q:

    s#      s#s#      s#Xs#
    s#.     s#.s#     s#.Xs#
    s#.#    s#.#s#    s#.#Xs#
    s.#     s.#s#     s.#Xs#

The rigorous programming rule that determines whether a string is interpreted as a floating-point value is that it must match this very complicated regular expression (the line breaks are for readability only):

    "^[-+]?([0-9](_?[0-9])*([.]?([0-9](_?[0-9])*)*)?|
            [.][0-9](_?[0-9])*+)
           ([DdEeQq]?[-+]?[0-9](_?[0-9])*)?$"

Thus, 123, -1q-27, .987d77, 3.14159_26535_89793_23846, and .456-123 are all valid numbers.

Notably absent from this list are Fortran-style numbers with embedded blanks (blanks are not significant in Fortran, except in string constants). If your files contain such data, then you must convert them to standard form first, if you want ndiff to perform reliably. In the interests of interlanguage data exchange, most modern Fortran implementations do not output floating-point numbers with embedded spaces, so you should rarely need such file conversions.

From version 2.00, ndiff also recognizes patterns for optionally-signed NaN (Not-a-Number) and Infinity:

    NaN    SNaN    QNaN    NaNS    NaNQ    ?.0e+0    ??.0
    +NaN   +SNaN   +QNaN   +NaNS   +NaNQ   +?.0e+0   +??.0
    -NaN   -SNaN   -QNaN   -NaNS   -NaNQ   -?.0e+0   -??.0
    Inf    Infinity     +.+0e+0   +.+0
    +Inf   +Infinity    +.+0e+0   +.+0
    -Inf   -Infinity    -.-0e+0   -.-0

Lettercase is not significant in these values.
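
As a rough illustration of the floating-point pattern given earlier in this section (the NaN and Infinity forms are deliberately left out), the following Python sketch recognizes a numeric field, converts it, and reports the first differing field of two lines. It is not ndiff's own code: the names NUMBER_RE, to_float, and first_differing_field are hypothetical, the regular expression is a simplified transcription of the documented one, and the test |x - y| > abserr is only an assumed reading of how the absolute tolerance is applied.

```python
import re
from itertools import zip_longest

# One or more digits, optionally separated by single underscores.
DIGITS = r"[0-9](?:_?[0-9])*"

# Simplified transcription of the documented pattern (an assumption,
# not the program's actual source): sign, mantissa, optional exponent
# whose letter (D, E, or Q in either case) may itself be omitted.
NUMBER_RE = re.compile(
    rf"^(?P<mant>[-+]?(?:{DIGITS}(?:[.](?:{DIGITS})?)?|[.]{DIGITS}))"
    rf"(?P<exp>[DdEeQq]?[-+]?{DIGITS})?$"
)


def to_float(field):
    """Return the field as a float, or None if it is not a number."""
    m = NUMBER_RE.match(field)
    if m is None:
        return None
    mant = m.group("mant").replace("_", "")
    exp = m.group("exp")
    if exp is None:
        return float(mant)
    # Drop the exponent letter, if any, and rebuild with a plain "e".
    exp = exp.replace("_", "").lstrip("DdEeQq")
    return float(f"{mant}e{exp}")


def first_differing_field(line1, line2, abserr=0.0):
    """Return the 1-based number of the first differing field, or None.

    Numeric fields differ when |x - y| > abserr (assumed semantics);
    all other fields are compared as strings.
    """
    pairs = zip_longest(line1.split(), line2.split())
    for n, (f1, f2) in enumerate(pairs, start=1):
        if f1 is None or f2 is None:
            return n  # one line has more fields than the other
        x, y = to_float(f1), to_float(f2)
        if x is not None and y is not None:
            if abs(x - y) > abserr:
                return n
        elif f1 != f2:
            return n
    return None
```

With these definitions, to_float(".456-123") yields the same value as float(".456e-123"), and first_differing_field("a 1.0 2.0", "a 1.0 2.5", abserr=0.1) returns 3, mirroring the field number shown on the separator line of the listing.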
The rigorous programming rule for whether a field is a NaN or an Infinity is determined by these complex regular expressions (again, the line breaks are for readability only):

    "^[-+]?([QqSs]?[Nn][Aa][Nn][QqSs]?|
            [?]+[.][?0]+[DdEeQq][-+]?[0-9]+|
            [?]+[.][?0]+)$"

    "^(-[Ii][Nn][Ff]|
       -[Ii][Nn][Ff][Ii][Nn][Ii][Tt][Yy]|
       -+[.][-]0+[DdEeQq][-+]?[0-9]+|
       -+[.][-]0+)$"

    "^([+]?[Ii][Nn][Ff]|
       [+]?[Ii][Nn][Ff][Ii][Nn][Ii][Tt][Yy]|
       [+]+[.][-]0+[DdEeQq][-+]?[0-9]+|
       [+]+[.][-]0+)$"

Even though in numerical computations, a NaN is never equal to anything, even itself, for ndiff, fields that match a NaN pattern are considered equal. Fields that match Infinity patterns are considered equal if they have the same sign.

ndiff terminates with a success exit code (on UNIX, 0) if no differences (subject to the absolute and/or relative tolerances) are found. Otherwise, it terminates with a failure exit code (on UNIX, 1).

OPTIONS
Command-line options may be abbreviated to a unique leading prefix, and letter case is ignored.

To avoid confusion with options, if a filename begins with a hyphen, it must be disguised by a leading absolute or relative directory path, e.g., /tmp/-foo.dat or ./-foo.dat.

GNU- and POSIX-style options of the form --name are also recognized: they begin with two option prefix characters.
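
The abbreviation rule above can be illustrated with a small Python sketch. It is only a model of the stated behavior, not ndiff's parser: the function name resolve_option is hypothetical, the option names are copied from the SYNOPSIS, and raising ValueError for an unknown or ambiguous abbreviation is an assumption, since this page does not say how such cases are reported.

```python
# Option names as listed in the SYNOPSIS, without the leading hyphen.
OPTIONS = [
    "?", "abserr", "author", "copyright", "fields", "help", "logfile",
    "minwidth", "outfile", "precision", "quick", "quiet", "relerr",
    "separators", "silent", "version", "www",
]


def resolve_option(arg):
    """Map a command-line argument to a full option name.

    Returns None for arguments that are not options: a lone hyphen
    (stdin) or anything that does not start with a hyphen, such as
    ./-foo.dat. Raises ValueError for unknown or ambiguous prefixes
    (assumed behavior).
    """
    if arg == "-" or not arg.startswith("-"):
        return None
    # Accept both -name and --name; letter case is ignored.
    name = arg[2:] if arg.startswith("--") else arg[1:]
    name = name.lower()
    if name in OPTIONS:
        return name
    matches = [opt for opt in OPTIONS if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    raise ValueError(f"unknown or ambiguous option: {arg}")
```

Under this model, -rel and --REL both resolve to relerr, while -qui is rejected because it is a prefix of both quick and quiet.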

CAVEATS
This implementation of ndiff can be built with support for double-precision, quadruple-precision, or multiple-precision arithmetic. The -version option reports the particular choice at your site.

Thus, ndiff will not correctly handle absolute and relative error tolerances that are smaller than those corresponding to the machine epsilon in the arithmetic for which it was built, and for that reason, installers are encouraged to build the multiple-precision version, so that users can select any required precision.

WISH LIST
It would be nice to have ndiff's abilities incorporated into the GNU diff(1) program; that way, numeric fields could be successfully compared even in files with inserted or deleted lines, and much of the entire computing world could benefit.

Perhaps some community-minded and clever reader of this documentation will take up this challenge, and present the Free Software Foundation with an improved diff(1) implementation that offers support for tolerant differencing of numeric files, using ndiff as a design model, sample implementation, and testbed!

Ideally, such an improved diff(1) implementation should handle numbers of arbitrary precision, allowing comparisons of numeric output from systems that support high-precision arithmetic, such as Lisp and symbolic algebra languages. In addition, it might choose to do its arithmetic in decimal floating-point, so as to avoid inaccuracies introduced by vendor-dependent libraries for decimal-to-native-base number conversion.

The awk(1) prototype version of ndiff supports only double-precision arithmetic; the C version is more flexible.

FILES
In the following, LIBDIR represents the name of the ndiff installation directory; it is not a user-definable environment variable. If ndiff has been installed properly at your site, the value of LIBDIR is

    /wrkdirs/usr/ports/math/ndiff/work/stage/usr/local/share/ndiff

SEE ALSO
awk(1), bawk(1), cmp(1), diff(1), gawk(1), mawk(1), nawk(1), spiff(1).

AUTHOR
Nelson H. F. Beebe
Center for Scientific Computing
University of Utah
Department of Mathematics, 322 INSCC
155 S 1400 E RM 233
Salt Lake City, UT 84112-0090
USA
Email: beebe@math.utah.edu, beebe@acm.org, beebe@computer.org, beebe@ieee.org (Internet)
WWW URL: http://www.math.utah.edu/~beebe
Telephone: +1 801 581 5254
FAX: +1 801 585 1640, +1 801 581 4148

AVAILABILITY
ndiff is freely available; its master distribution can be found at

    ftp://ftp.math.utah.edu/pub/misc/
    http://www.math.utah.edu/pub/misc/

in the file ndiff-x.yy.tar.gz where x.yy is the current version. Other distribution formats are usually available at the same location.

That site is mirrored to several other Internet archives, so you may also be able to find it elsewhere on the Internet; try searching for the string ndiff at one or more of the popular Web search sites, such as

    http://altavista.digital.com/
    http://search.microsoft.com/us/default.asp
    http://www.dejanews.com/
    http://www.dogpile.com/index.html
    http://www.euroseek.net/page?ifl=uk
    http://www.excite.com/
    http://www.go2net.com/search.html
    http://www.google.com/
    http://www.hotbot.com/
    http://www.infoseek.com/
    http://www.inktomi.com/
    http://www.lycos.com/
    http://www.northernlight.com/
    http://www.snap.com/
    http://www.stpt.com/
    http://www.yahoo.com/

COPYRIGHT
########################################################################
########################################################################
########################################################################
###                                                                  ###
### ndiff: compare putatively similar files, ignoring small numeric  ###
###        differences                                               ###
###                                                                  ###
### Copyright (C) 2000 Nelson H. F. Beebe                            ###
###                                                                  ###
### This program is covered by the GNU General Public License (GPL), ###
### version 2 or later, available as the file COPYING in the program ###
### source distribution, and on the Internet at                      ###
###                                                                  ###
###                 ftp://ftp.gnu.org/gnu/GPL                        ###
###                                                                  ###
###                 http://www.gnu.org/copyleft/gpl.html             ###
###                                                                  ###
### This program is free software; you can redistribute it and/or    ###
### modify it under the terms of the GNU General Public License as   ###
### published by the Free Software Foundation; either version 2 of   ###
### the License, or (at your option) any later version.              ###
###                                                                  ###
### This program is distributed in the hope that it will be useful,  ###
### but WITHOUT ANY WARRANTY; without even the implied warranty of   ###
### MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the     ###
### GNU General Public License for more details.                     ###
###                                                                  ###
### You should have received a copy of the GNU General Public        ###
### License along with this program; if not, write to the Free       ###
### Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,   ###
### MA 02111-1307 USA.                                               ###
########################################################################
########################################################################
########################################################################