![]() |
![]()
| ![]() |
![]()
NAMEdiff - compute a shortest edit script (SES) given two sequences SYNOPSIS#include <mba/diff.h> typedef const void *(*idx_fn)(const void *s, int idx, void *context); typedef int (*cmp_fn)(const void *e1, const void *e2, void *context); typedef enum { DIFF_MATCH = 1, DIFF_DELETE, DIFF_INSERT } diff_op; struct diff_edit { short op; /* DIFF_MATCH, DIFF_DELETE or DIFF_INSERT */ int off; /* off into a if MATCH or DELETE, b if INSERT */ int len; }; int diff(const void *a, int aoff, int n, const void *b, int boff, int m, idx_fn idx, cmp_fn cmp, void *context, int dmax, struct varray *ses, int *sn, struct varray *buf); DESCRIPTIONThe diff(3m) module will compute theshortest edit script(SES) of two sequences. This algorithm is perhaps best known as the one used in GNU diff(1) although GNU diff employs additional optimizations specific to line oriented input such as source code files whereas this implementation is more generic. Formally, this implementation of the SES solution uses the dynamic programming algorithm described by Myers [1] with the Hirschberg linear space refinement. The objective is to compute the minimum set of edit operations necessary to transform a sequence A of length N into B of length M. This can be performed in O(N+M*D^2) expected time where D is theedit distance(number of elements deleted and inserted to transform A into B). Thus the algorithm is particularly fast and uses less space if the difference between sequences is small. The interface is generic such that sequences can be anything that can be indexed and compared with user supplied functions including arrays of structures, linked lists, arrays of pointers to strings in a file, etc. [1] E. Myers, ``An O(ND) Difference Algorithm and Its Variations,'' Algorithmica 1, 2 (1986), 251-266. http://www.cs.arizona.edu/people/gene/PAPERS/diff.ps
RETURNS
|