coreutils: Version sort ignores locale
30.2.6 Version sort uses ASCII order, ignores locale, Unicode characters
------------------------------------------------------------------------
In version sort, Unicode characters are compared byte-by-byte according
to their binary representation, ignoring their Unicode code point and
the current locale.
Most commonly, Unicode characters (e.g. Greek Small Letter Alpha
U+03B1 ‘α’) are encoded as UTF-8 bytes (e.g. ‘α’ is encoded as the UTF-8
sequence ‘0xCE 0xB1’). The encoding is compared byte-by-byte, e.g.
first ‘0xCE’ (decimal value 206), then ‘0xB1’ (decimal value 177).
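The byte values quoted above can be checked with a few lines of Python. This is only an illustration of the UTF-8 encoding and not part of coreutils; the variable names are chosen for this example.

```python
# Encode Greek Small Letter Alpha (U+03B1) as UTF-8 and list its bytes.
alpha_bytes = "\u03b1".encode("utf-8")

# Decimal value of each byte, in the order they would be compared.
byte_values = list(alpha_bytes)

# The same bytes written in hexadecimal.
byte_hex = [f"0x{b:02X}" for b in alpha_bytes]

print(byte_hex, byte_values)  # ['0xCE', '0xB1'] [206, 177]
```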
     $ touch aa az "a%" "aα"
     $ ls -1 -v
     aa
     az
     a%
     aα
Ignoring the first letter (‘a’), which is identical in all strings,
the compared values are:
‘a’ and ‘z’ are letters, and sort earlier than all other
non-digit characters.
Then, percent sign ‘%’ (ASCII value 37) is compared to the first
byte of the UTF-8 sequence of ‘α’, which is 0xCE (decimal 206). The
value 37 is smaller, hence ‘a%’ is listed before ‘aα’.
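The ordering in this example can be imitated with a deliberately simplified Python sketch. It models only the two rules used above (ASCII letters sort before other non-digit bytes, and everything else is ordered by byte value). It does not handle digits, ‘~’, or file suffixes, and it is not the coreutils implementation; the helper names ‘byte_rank’ and ‘sketch_key’ are hypothetical.

```python
def byte_rank(b: int) -> int:
    """Rank one byte: ASCII letters first, every other byte after them."""
    is_letter = (65 <= b <= 90) or (97 <= b <= 122)
    # Non-letters are shifted past all letters but keep their byte order.
    return b if is_letter else b + 256


def sketch_key(name: str) -> list:
    """Sort key: the per-byte ranks of the UTF-8 encoding of the name."""
    return [byte_rank(b) for b in name.encode("utf-8")]


names = ["a%", "aα", "az", "aa"]
ordered = sorted(names, key=sketch_key)
print(ordered)  # ['aa', 'az', 'a%', 'aα']
```

For comparison, a plain byte-wise sort of the same names would place ‘a%’ first, because 37 is smaller than the byte values of ‘a’ and ‘z’.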