Sorting multiple keys with Unix sort
Use the -k option (or --key=POS1[,POS2]). It can appear multiple times, and each key can carry its own ordering options (such as n for numeric sort).
sorting with multiple keys with Linux sort command
I find this caution in the GNU sort docs.
Sort numerically on the second field and resolve ties by sorting
alphabetically on the third and fourth characters of field five. Use
‘:’ as the field delimiter.
sort -t : -k 2,2n -k 5.3,5.4
Note that if you had written -k 2n instead of -k 2,2n sort would have
used all characters beginning in the second field and extending to the
end of the line as the primary numeric key. For the large majority of
applications, treating keys spanning more than one field as numeric
will not do what you expect.
I'm not sure what it ends up with when it evaluates '1001 3' as a numeric key, but "will not do what you expect" is accurate. It seems clear that the Right Thing to do is to specify each key independently.
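To see the documented command at work, here is a small sketch on made-up records (the data and the variable names are assumptions for illustration, and LC_ALL=C is set only to keep the text comparison predictable):

```shell
# Hypothetical colon-delimited records: field 2 is numeric,
# field 5 carries the tie-breaking characters.
records='a:10:x:y:zzab
b:2:x:y:zzcd
c:10:x:y:zzaa'

# Primary key: field 2 only, compared numerically.
# Secondary key: characters 3-4 of field 5, compared as text.
by_docs_command=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2n -k 5.3,5.4)
printf '%s\n' "$by_docs_command"
```

The line with 2 in field two comes first; the two lines with 10 tie on the numeric key and are ordered by "aa" before "ab".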
The same web page says this about resolving "ties".
Finally, as a last resort when all keys compare equal, sort compares
entire lines as if no ordering options other than --reverse (-r) were
specified.
I'll confess I'm a little mystified about how to interpret that.
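One way to read that passage, sketched with made-up input (the two lines and the variable names are assumptions): when every key ties, the whole lines are compared as plain text, and only a global -r flips that final comparison.

```shell
# Two made-up lines whose first field is identical, so the -k1,1 key always ties.
lines='a 2
a 1'

# Keys tie, so sort falls back to comparing the whole lines: "a 1" sorts before "a 2".
plain=$(printf '%s\n' "$lines" | LC_ALL=C sort -k1,1)

# With a global -r, the fallback comparison is reversed as well.
reversed=$(printf '%s\n' "$lines" | LC_ALL=C sort -r -k1,1)

printf 'plain:\n%s\nreversed:\n%s\n' "$plain" "$reversed"
```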
Sorting multiple keys with Unix sort -- Bug?
-k2 uses all the characters from the beginning of the 2nd field to the end of the line, because you did not specify where the key ends. So the lines
0.322_rsrc:15_phi:0.5_abr:1_prof:gauss_diff:lap2.dat 0.000110687417806 0.0346076270248
0.3_rsrc:15_phi:0.5_abr:1_prof:gauss_diff:lap2.dat 0.000111161259827 0.0358869210331
are correctly sorted, because both keys begin with _rsrc:15 and stay identical up to the point where 0.000110 sorts before 0.000111. The key phrase in the manual page is
KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end.
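The effect of the default stop position can be checked with a minimal sketch (the two input lines are invented for illustration; LC_ALL=C is assumed so that text is compared byte by byte):

```shell
# Made-up input: field 2 is the same on both lines, field 3 differs.
data='x a 10
y a 9'

# -k2 runs to the end of the line, so the text "10" versus "9" decides
# and the numeric key -k3n is never consulted.
open_ended=$(printf '%s\n' "$data" | LC_ALL=C sort -k2 -k3n)

# -k2,2 stops at field 2; the tie is then broken numerically on field 3.
bounded=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2 -k3,3n)

printf 'open-ended:\n%s\nbounded:\n%s\n' "$open_ended" "$bounded"
```

With the open-ended key, "x a 10" comes first because the character 1 sorts before 9; with the bounded keys, "y a 9" comes first because 9 is less than 10.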
unix sorting, with primary and secondary keys
The manual shows some examples.
In accordance with zseder's comment, this works:
sort -t"<TAB>" -k1,1d -k3,3g
In theory a tab could also be written as sort -t"\t", but GNU sort rejects that as a multi-character delimiter. In bash, sort -t$'\t' passes a literal tab.
If none of the above work to delimit by tab, this is an ugly workaround:
TAB=$(printf '\t')
sort -t"$TAB"
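Putting the pieces together, a sketch on invented tab-separated rows (the data and variable names are assumptions; -g is the general-numeric option found in GNU sort, which understands exponents such as 1e2):

```shell
# Literal tab obtained with printf, which is available in POSIX shells.
TAB=$(printf '\t')

# Made-up tab-separated rows: name, label, floating-point score.
rows=$(printf 'b\tq\t2e1\na\tq\t1e2\na\tq\t3.5\n')

# Field 1 in dictionary order, then field 3 as a general number.
sorted=$(printf '%s\n' "$rows" | LC_ALL=C sort -t"$TAB" -k1,1d -k3,3g)
printf '%s\n' "$sorted"
```

The two "a" rows come first, ordered 3.5 before 1e2 (100), followed by the "b" row.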
Sorting a file by multiple columns using bash sort
You're missing the -n/--numeric-sort option, to sort according to string numerical value, not lexicographically (at least for second and third field):
$ sort -k1,1 -k2,2n -k3,3n file.txt
word01.1 5 8
word01.1 10 20
word01.1 10 30
word01.1 40 50
word01.2 10 25
word01.2 30 50
word01.2 40 50
Note that you can provide a global -n flag, which sorts numerically every key that has no ordering options of its own, or set it per key. The format for a key is -k KEYDEF, where KEYDEF is F[.C][OPTS][,F[.C][OPTS]] and OPTS is one or more ordering options, such as n (numeric), r (reverse), g (general numeric), h (human numeric), etc.
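As a sketch of what the per-key n changes, the command can be run on a shuffled input (the input order is made up, since the original file.txt is not shown; LC_ALL=C is assumed for predictable text comparison):

```shell
# Made-up unsorted input with the same rows as the output shown above.
input='word01.2 40 50
word01.1 10 30
word01.1 5 8
word01.2 10 25
word01.1 40 50
word01.1 10 20
word01.2 30 50'

# Field 1 as text, fields 2 and 3 numerically; each key carries its own OPTS.
per_key=$(printf '%s\n' "$input" | LC_ALL=C sort -k1,1 -k2,2n -k3,3n)

# Without n, field 2 is compared as text, so "10" and "40" sort before "5".
as_text=$(printf '%s\n' "$input" | LC_ALL=C sort -k1,1 -k2,2 -k3,3)

printf 'per-key numeric:\n%s\n\nas text:\n%s\n' "$per_key" "$as_text"
```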
unix sort multiple fields
You need one of:
sort --key=1,1 --key=2,2r --key=3,3 --key=4,4r
sort -k1,1 -k2,2r -k3,3 -k4,4r
as in the following transcript:
pax$ echo '5 3 2 9
3 4 1 7
5 2 3 1
6 1 3 6
1 2 4 5
3 1 2 3
5 2 2 3' | sort --key=1,1 --key=2,2r --key=3,3 --key=4,4r
1 2 4 5
3 4 1 7
3 1 2 3
5 3 2 9
5 2 2 3
5 2 3 1
6 1 3 6
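Remember to add the n option if you want the fields treated as proper numbers (variable length). A global -n is inherited only by keys that specify no ordering options of their own, so the keys marked r need an explicit n as well:
sort -k1,1n -k2,2nr -k3,3n -k4,4nr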
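One caveat worth checking: a global option such as -n is inherited only by keys that have no ordering options of their own, so a key written as -k2,2r is still compared as text. A small sketch with invented two-line input (variable names are assumptions):

```shell
# Made-up rows: equal first field, second field 9 versus 10.
rows='1 9
1 10'

# Global -n reaches -k1,1 but not -k2,2r, which has its own option (r)
# and therefore compares field 2 as text, reversed: "9" lands before "10".
global_n=$(printf '%s\n' "$rows" | LC_ALL=C sort -n -k1,1 -k2,2r)

# Spelling out n on every key gives a true numeric descending order on field 2.
per_key_n=$(printf '%s\n' "$rows" | LC_ALL=C sort -k1,1n -k2,2nr)

printf 'global -n:\n%s\nper-key n:\n%s\n' "$global_n" "$per_key_n"
```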
sort alphanumerically with priority for numbers in linux
So, basically, you're asking to sort the first field numerically in descending order, but if the numeric keys are the same, you want the second field to be ordered in natural, or ascending, order.
I tried a few things, but here's the way I managed to make it work:
sort -nk2 file.txt | sort -snrk1
Explanation:
The first command sorts the whole file using the second, alphanumeric field in natural order, while the second command sorts the output using the first numeric field, shows it in reverse order, and requests that it be a "stable" sort.
-n is for numeric sort, as opposed to alphanumeric sort, in which 60 would come before 7.
-r is for reversed order, so from highest to lowest. If unspecified, sort assumes natural, or ascending, order.
-k selects which key, or field, to use for the sorting order.
-s is for stable ordering. This option maintains the original order of records that have an equal key.
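For comparison, the same ordering can probably be had in one pass by giving each key its own options. This is only a sketch on invented data (the original file.txt is not shown), and it assumes the second field is plain text:

```shell
# Made-up input: a numeric first field and a textual second field.
input='60 b
6 a
60 a
6 b'

# The two-pass pipeline from above.
two_pass=$(printf '%s\n' "$input" | LC_ALL=C sort -nk2 | LC_ALL=C sort -snrk1)

# One pass: field 1 numeric and reversed, ties broken by field 2 as text.
one_pass=$(printf '%s\n' "$input" | LC_ALL=C sort -k1,1nr -k2,2)

printf 'two-pass:\n%s\none-pass:\n%s\n' "$two_pass" "$one_pass"
```

On this input both variants print the 60 rows first, each group ordered a before b.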
unix sort by single column only
From the POSIX description of sort:
Except when the -u option is specified, lines that otherwise compare equal shall be ordered as if none of the options -d, -f, -i, -n, or -k were present (but with -r still in effect, if it was specified) and with all bytes in the lines significant to the comparison. The order in which lines that still compare equal are written is unspecified.
So in your case, when two lines have the same value in the second column and thus are equal, the entire lines are then compared to get the final ordering.
GNU sort (and possibly other implementations, though POSIX does not mandate it) has the -s option for a stable sort, in which lines whose keys compare equal keep their original relative order, which is what it appears you want:
$ sort -t, -s -k2,2n chris.num
1,4,3
1,4,1
1,5,2
1,7,2
1,7,1
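The difference -s makes can be sketched side by side. The input below is a hypothetical stand-in for chris.num, which is not shown, chosen so that the stable result matches the output above:

```shell
# Hypothetical stand-in for chris.num.
input='1,7,2
1,4,3
1,5,2
1,7,1
1,4,1'

# Stable: lines with equal second fields keep their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -s -k2,2n)

# Default: ties on the key fall back to a whole-line comparison.
fallback=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k2,2n)

printf 'stable:\n%s\n\nfallback:\n%s\n' "$stable" "$fallback"
```

Without -s, the tied lines are reordered by their third column (1,4,1 before 1,4,3), which is the whole-line comparison POSIX describes.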