coreutils: Sorting files for join

 
 8.3.2 Pre-sorting
 -----------------
 
 ‘join’ requires sorted input files.  Each input file should be sorted
 on the field (column) that ‘join’ uses as its key.  The recommended
 sorting option is ‘sort -k 1b,1’ (assuming the desired key is in the
 first column).
 
 Typical usage:
      $ sort -k 1b,1 file1 > file1.sorted
      $ sort -k 1b,1 file2 > file2.sorted
      $ join file1.sorted file2.sorted > file3
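
 As a minimal sketch of the steps above, assume two small hypothetical
 input files whose first column is the key.  The file contents and the
 temporary directory are illustrative only:

```shell
# Work in a scratch directory so no existing files are overwritten.
dir=$(mktemp -d) && cd "$dir"

# Hypothetical sample data: unsorted, key in the first column.
printf '%s\n' 'b 2' 'a 1' 'c 3' > file1
printf '%s\n' 'c z' 'a x' 'b y' > file2

# Sort both inputs on the first field, then join them.
sort -k 1b,1 file1 > file1.sorted
sort -k 1b,1 file2 > file2.sorted
join file1.sorted file2.sorted > file3

cat file3
# a 1 x
# b 2 y
# c 3 z
```

 Each output line holds the key followed by the remaining fields of the
 matching lines from the first and second file.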
 
    Normally, the sort order is that of the collating sequence specified
 by the ‘LC_COLLATE’ locale.  Unless the ‘-t’ option is given, the sort
 comparison ignores blanks at the start of the join field, as in ‘sort
 -b’.  If the ‘--ignore-case’ option is given, the sort comparison
 ignores the case of characters in the join field, as in ‘sort -f’:
 
      $ sort -k 1bf,1 file1 > file1.sorted
      $ sort -k 1bf,1 file2 > file2.sorted
      $ join --ignore-case file1.sorted file2.sorted > file3
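
 The case-insensitive variant can be sketched the same way.  The sample
 files below are hypothetical; their keys differ only in case between
 the two inputs:

```shell
# Work in a scratch directory so no existing files are overwritten.
dir=$(mktemp -d) && cd "$dir"

# Hypothetical sample data with mixed-case keys.
printf '%s\n' 'B 2' 'a 1' 'C 3' > file1
printf '%s\n' 'c z' 'A x' 'b y' > file2

# The 'f' key modifier folds case during sorting, matching --ignore-case.
sort -k 1bf,1 file1 > file1.sorted
sort -k 1bf,1 file2 > file2.sorted
join --ignore-case file1.sorted file2.sorted > file3

cat file3
# a 1 x
# B 2 y
# C 3 z
```

 For paired lines, the key printed in the output is the one taken from
 the first file.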
 
    The ‘sort’ and ‘join’ commands should use consistent locales and
 options if the output of ‘sort’ is fed to ‘join’.  You can use a command
 like ‘sort -k 1b,1’ to sort a file on its default join field, but if you
 select a non-default locale, join field, separator, or comparison
 options, then you should do so consistently between ‘join’ and ‘sort’.
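
 As an illustration of keeping the options consistent, the following
 sketch assumes hypothetical comma-separated files in which the key is
 the second field of the first file and the first field of the second
 file.  The same separator and fields are given to both ‘sort’ and
 ‘join’:

```shell
# Work in a scratch directory so no existing files are overwritten.
dir=$(mktemp -d) && cd "$dir"

# Hypothetical sample data: comma-separated, key in different columns.
printf '%s\n' '1,b' '2,a' '3,c' > file1
printf '%s\n' 'c,z' 'a,x' 'b,y' > file2

# Sort each file on the field that join will use, with the same separator.
sort -t , -k 2,2 file1 > file1.sorted
sort -t , -k 1,1 file2 > file2.sorted

# -1 2 selects field 2 of the first file; -2 1 selects field 1 of the second.
join -t , -1 2 -2 1 file1.sorted file2.sorted > file3

cat file3
# a,2,x
# b,1,y
# c,3,z
```

 Because ‘-t’ is given, leading blanks are significant, so the ‘b’ key
 modifier is not used here.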
 
 To avoid locale-related issues, use the ‘C’ locale for both commands:
 
      $ LC_ALL=C sort -k 1b,1 file1 > file1.sorted
      $ LC_ALL=C sort -k 1b,1 file2 > file2.sorted
      $ LC_ALL=C join file1.sorted file2.sorted > file3
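
 One way to confirm that a file is ordered as ‘join’ expects is
 ‘sort -c’, which checks the order without producing sorted output.
 The sketch below uses a hypothetical file whose lines are out of order
 in the ‘C’ locale, where uppercase letters sort before lowercase ones:

```shell
# Work in a scratch directory so no existing files are overwritten.
dir=$(mktemp -d) && cd "$dir"

# Hypothetical sample data: 'a' before 'B' is out of order bytewise.
printf '%s\n' 'a 1' 'B 2' > file1

# sort -c exits non-zero when the input is not sorted on the given key.
if LC_ALL=C sort -c -k 1b,1 file1 2>/dev/null; then
  echo "file1 is sorted"
else
  echo "file1 is not sorted in the C locale"
fi

# After sorting in the C locale, the same check succeeds.
LC_ALL=C sort -k 1b,1 file1 > file1.sorted
LC_ALL=C sort -c -k 1b,1 file1.sorted && echo "file1.sorted is sorted"

cat file1.sorted
# B 2
# a 1
```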