diffutils: Overview

 
 Overview
 ********
 
 Computer users often find occasion to ask how two files differ.  Perhaps
 one file is a newer version of the other file.  Or maybe the two files
 started out as identical copies but were changed by different people.
 
    You can use the 'diff' command to show differences between two files,
 or between each pair of corresponding files in two directories.  'diff'
 outputs differences between files line by line in any of several
 formats,
 selectable by command line options.  This set of differences is often
 called a "diff" or "patch".  For files that are identical, 'diff'
 normally produces no output; for binary (non-text) files, 'diff'
 normally reports only that they are different.
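
    As a rough illustration of line-by-line output, the sketch below
 uses Python's standard 'difflib' module rather than GNU 'diff' itself.
 The file names 'old.txt' and 'new.txt', the sample lines, and the helper
 name 'unified' are assumptions made up for this example.  It produces
 output in the unified style, and an empty result for identical inputs.

```python
import difflib

# Hypothetical sample inputs; each element is one line of a text file.
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n"]


def unified(a, b):
    """Return the unified-style differences between two lists of lines."""
    return list(difflib.unified_diff(a, b, fromfile="old.txt", tofile="new.txt"))


diff_lines = unified(old, new)
print("".join(diff_lines), end="")

# Identical inputs yield no output at all.
print(unified(old, old))
```

    Lines that start with '-' appear only in the first input, and lines
 that start with '+' appear only in the second.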
 
    You can use the 'cmp' command to show the byte and line numbers where
 two files differ.  'cmp' can also show all the bytes that differ between
 the two files, side by side.  A way to compare two files character by
 character is the Emacs command 'M-x compare-windows'.  See the node
 'Other Window' in the Emacs manual for more information on that
 command.
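
    The idea of reporting the byte and line numbers of the first
 difference can be sketched in a few lines of Python.  This is not the
 'cmp' implementation; the function name 'first_difference' and the
 sample data are assumptions for illustration, and the handling of a
 file that is a prefix of the other is a simplification.

```python
def first_difference(a: bytes, b: bytes):
    """Return (byte_number, line_number), both 1-based, of the first
    position where a and b differ, or None if they are identical."""
    line = 1
    for offset, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return offset, line
        if x == 0x0A:  # a newline byte ends the current line
            line += 1
    if len(a) != len(b):
        # One input is a prefix of the other: report the position just
        # past the end of the shorter one (a simplification).
        return min(len(a), len(b)) + 1, line
    return None


# Hypothetical sample data: the inputs differ in the second line.
print(first_difference(b"one\ntwo\nthree\n", b"one\ntoo\nthree\n"))
```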
 
    You can use the 'diff3' command to show differences among three
 files.  When two people have made independent changes to a common
 original, 'diff3' can report the differences between the original and
 the two changed versions, and can produce a merged file that contains
 both persons' changes together with warnings about conflicts.
 
    You can use the 'sdiff' command to merge two files interactively.
 
    You can use the set of differences produced by 'diff' to distribute
 updates to text files (such as program source code) to other people.
 This method is especially useful when the differences are small compared
 to the complete files.  Given 'diff' output, you can use the 'patch'
 program to update, or "patch", a copy of the file.  If you think of
 'diff' as subtracting one file from another to produce their difference,
 you can think of 'patch' as adding the difference to one file to
 reproduce the other.
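
    The subtraction and addition analogy can be made concrete with a
 small Python sketch built on the standard 'difflib' module.  The names
 'make_patch' and 'apply_patch' and the tuple-based patch representation
 are hypothetical, chosen only for this example; they are not the format
 that 'diff' writes or that 'patch' reads.

```python
import difflib


def make_patch(old, new):
    """'Subtract' old from new: record only the regions that changed.

    Each entry is (start, end, replacement), meaning that old[start:end]
    should be replaced by the given lines.
    """
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old, patch):
    """'Add' the patch to old in order to reproduce new."""
    result = list(old)
    # Work from the end so that earlier indices remain valid.
    for start, end, replacement in reversed(patch):
        result[start:end] = replacement
    return result


old = ["a\n", "b\n", "c\n", "d\n"]
new = ["a\n", "x\n", "c\n", "d\n", "e\n"]
patch = make_patch(old, new)
assert apply_patch(old, patch) == new
```

    Unlike this sketch, which relies on exact line positions, the
 'patch' program can use the context recorded in a diff to locate the
 place where each change belongs.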
 
    This manual first concentrates on making diffs, and later shows how
 to use diffs to update files.
 
    GNU 'diff' was written by Paul Eggert, Mike Haertel, David Hayes,
 Richard Stallman, and Len Tower.  Wayne Davison designed and implemented
 the unified output format.  The basic algorithm is described by Eugene
 W. Myers in "An O(ND) Difference Algorithm and its Variations",
 'Algorithmica' Vol. 1, 1986, pp. 251-266,
 <http://dx.doi.org/10.1007/BF01840446>; and in "A File Comparison
 Program", Webb Miller and Eugene W. Myers, 'Software--Practice and
 Experience' Vol. 15, 1985, pp. 1025-1040,
 <http://dx.doi.org/10.1002/spe.4380151102>.  The same algorithm was
 discovered independently by Esko Ukkonen, who describes it in
 "Algorithms for Approximate String Matching", 'Information and Control'
 Vol. 64, 1985, pp. 100-118,
 <http://dx.doi.org/10.1016/S0019-9958(85)80046-2>.  Unless
 the '--minimal' option is used, 'diff' uses a heuristic by Paul Eggert
 that limits the cost to O(N^1.5 log N) at the price of producing
 suboptimal output for large inputs with many differences.  Related
 algorithms are surveyed by Alfred V. Aho in section 6.3 of "Algorithms
 for Finding Patterns in Strings", 'Handbook of Theoretical Computer
 Science' (Jan Van Leeuwen, ed.), Vol. A, 'Algorithms and Complexity',
 Elsevier/MIT Press, 1990, pp. 255-300.
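
    For readers who want a feel for the O(ND) approach, the following
 Python sketch computes only the length D of a shortest edit script
 (insertions plus deletions) using the greedy furthest-reaching idea from
 the Myers paper.  It is a teaching sketch under that assumption, not the
 code used by GNU 'diff', which is more elaborate.  The function name
 'myers_edit_distance' is made up for this example.

```python
def myers_edit_distance(a, b):
    """Return the number of insertions plus deletions in a shortest
    edit script that turns sequence a into sequence b."""
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]       # move down: take an element from b
            else:
                x = v[k - 1] + 1   # move right: drop an element from a
            y = x - k
            # Follow the diagonal while the elements match (free moves).
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


print(myers_edit_distance("ABCABBA", "CBABAC"))
```

    The outer loop runs D+1 times and each pass touches at most D+1
 diagonals, which is why the cost stays small when the inputs are
 similar.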
 
    GNU 'diff3' was written by Randy Smith.  GNU 'sdiff' was written by
 Thomas Lord.  GNU 'cmp' was written by Torbjörn Granlund and David
 MacKenzie.
 
    GNU 'patch' was written mainly by Larry Wall and Paul Eggert; several
 GNU enhancements were contributed by Wayne Davison and David MacKenzie.
 Parts of this manual are adapted from a manual page written by Larry
 Wall, with his permission.