Sort by Multiple Columns in Bash

Sorting a file by multiple columns using bash sort

You're missing the -n/--numeric-sort option, to sort according to string numerical value, not lexicographically (at least for second and third field):

$ sort -k1,1 -k2,2n -k3,3n file.txt
word01.1 5 8
word01.1 10 20
word01.1 10 30
word01.1 40 50
word01.2 10 25
word01.2 30 50
word01.2 40 50

Note that you can provide a global -n flag, to sort all fields as numerical values, or per key. Format for key is -k KEYDEF, where KEYDEF is F[.C][OPTS][,F[.C][OPTS]] and OPTS is one or more of ordering options, like n (numerical), r (reverse), g (general numeric), h (human numeric), etc.

Sorting multiple columns with delimiter . in bash

You got a few problems, the -k is defined as: -kfield1[,field2], which means -k2,1 is invalid. You can use -k1 -k2 -k3 to sort on the first three fields.

You properly want to do numeric sort on your fields, this can be enabled with -n see man 1 sort for other numeric sorting options:

$ sort -t. -n -k1 -k2 -k3 file.txt
2020.5.18
2020.10.1
2020.10.4
2021.1.1
2023.12.14
2023.14.1

Might work for you.

In case these are actually versions and not dates, then -V might be sufficient.

How do I sort 2 columns in shell script?

sort -u -k1.1,1.3M -k1.4n -k2V filename

-u

delete duplicate lines

-k1.1,1.3M

sort each line from word 1, character 1 to word 1, character 3 in month mode

-k1.4n

sort each line from word 1, character 4 until end of word 1 by numeric value

-k2V

sort second word in "version number" mode, which works well for the timestamp

sort in bash problem with multiple columns

Okay I had to search a bit, but here is the solution:

sort -k10nr,10 -k7Mr,7 -k8nr,8 -k9r,9

which outputs

 (1) (260, 03) 0x4461e 0x4461e1f Thu Sep 27 10:31:05 2026
(1) (260, 02) 0x2c435 0x2c43501 Thu Sep 27 09:43:21 2018
(1) (260, 02) 0x2c435 0x2c43501 Thu Mar 29 09:43:21 2018
(1) (260, 02) 0x2c435 0x2c43503 Thu Mar 03 09:46:44 2018
(1) (260, 02) 0x2c435 0x2c43503 Thu Mar 01 09:46:14 2018

How does it work?

  1. -k10nr,10: do a reverse numeric sort on the 10th field which represents the year
  2. -k7Mr,7: the second sort is a month-sort which uses the standard 3 character month notation for sorting, again reversed.
  3. k8nr,8: the third sort is on the day of the month which is again numeric reversed.
  4. -k9r,9: finally sort the 9 fields as a string which represents the time, remarkably reversed.

Why is it so complicated? From the moment you define one modifier, in the key-specification, the global flags -r or -n or whatever no longer apply. So it is not possible to mix global flags with local key modifiers.

How to sorting multiple columns from a CSV file

It's not clear if the file is TSV (tab separated), or CSV (comma separated). Question indicate CSV, but answer using tab delimiter (-t \t). Try -t, for CSV. Also the reverse order need to be applied to each key ('r' suffix on each key).

sort -k3,3nr -k4,4nr -t, data.csv


Related Topics



Leave a reply



Submit