Sort by Third Column Leaving First and Second Column Intact in Linux

Sort by third column leaving first and second column intact in Linux?

Try this:

sort  -t: -k1,1 -k3 data.txt

gives:

bast:disp-san-d5-06:piranha 
bast:display-san-12:redbird
bast:display-san-07:waverider
bast:display-san-12:waverider

This sorts with the 1st field as the primary key and the 3rd field as the secondary key, splitting each line into fields at the : character.

Details:

data.txt contains the 4 lines from your post.

You can specify multiple fields as sort keys; see the man page for details.

-k1,1 means sort on the first field (start at field 1 and end at field 1, otherwise it would continue using the rest of the line for determining the sort)

-k3 means sort on the 3rd field as the secondary key. Since no fields follow it, specifying -k3,3 is not necessary, but it wouldn't hurt either.

-t: means split each line into fields at the : character; by default, fields are delimited by blanks.

For more information, see the SO question "Sorting multiple keys with Unix sort" and the sort man page.
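As a quick check of the command above, here is a minimal sketch that feeds the four lines in a deliberately shuffled order (the input order is an assumption, since the original data.txt is not shown) and pins the locale with LC_ALL=C so the byte-wise ordering is predictable.

```shell
# Hypothetical input: the four lines from the output above, shuffled.
# LC_ALL=C is an assumption made here to get plain byte-order comparison.
result=$(printf '%s\n' \
  'bast:display-san-12:waverider' \
  'bast:display-san-12:redbird' \
  'bast:disp-san-d5-06:piranha' \
  'bast:display-san-07:waverider' |
  LC_ALL=C sort -t: -k1,1 -k3)

# Field 1 is identical on every line, so field 3 decides the order;
# the two "waverider" lines tie on both keys and fall back to a
# whole-line comparison.
printf '%s\n' "$result"
```

The two waverider lines end up ordered by their second field only because of that whole-line fallback, not because field 2 was named as a key.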

sort -n places 10 before 9

Try this, which applies the numeric comparison to the key starting at the third :-separated field:

sort -n -t: -k3 file.txt -o out.txt
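To see why the numeric flag matters, the sketch below sorts the same third field once as text and once as a number. The three sample lines are made up for illustration, and LC_ALL=C is assumed so the text comparison is byte-wise.

```shell
# Hypothetical sample: only the third field differs.
input=$(printf '%s\n' 'x:y:10' 'x:y:9' 'x:y:100')

# Text comparison of field 3: "10" and "100" sort before "9".
lexical=$(printf '%s\n' "$input" | LC_ALL=C sort -t: -k3,3)

# Numeric comparison of field 3: 9, then 10, then 100.
numeric=$(printf '%s\n' "$input" | LC_ALL=C sort -t: -k3,3n)

echo "lexical:"; printf '%s\n' "$lexical"
echo "numeric:"; printf '%s\n' "$numeric"
```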

Sorting based on first column and highest number in third column

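Sort by the first column, then by the third column numerically in descending order, and afterwards deduplicate on the first column. With GNU sort, -u keeps the first line of each run of lines whose keys compare equal, so the highest third-column value survives. (A key given as -k 1 without an end field spans the whole line, which would make the numeric third-column key irrelevant, so the keys are bounded here.)

sort -k1,1 -k3,3nr Black.txt | sort -u -k1,1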

output

1  ghi  100
2 jui 500
3 yui 500
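The idea can be sketched end to end as follows. The input rows are invented for illustration (Black.txt itself is not shown), and the result relies on GNU sort's documented behaviour that -u outputs the first of a sequence of equal lines; POSIX does not say which one is kept.

```shell
# Hypothetical input: two rows per first-column value.
result=$(printf '%s\n' \
  '1 abc 50' \
  '2 jui 500' \
  '1 ghi 100' \
  '3 abc 7' \
  '2 kol 20' \
  '3 yui 500' |
  LC_ALL=C sort -k1,1n -k3,3nr |   # group by column 1, largest column 3 first
  LC_ALL=C sort -u -k1,1n)         # keep one line per column-1 value

printf '%s\n' "$result"
```

If portability beyond GNU sort matters, replacing the second sort with a filter that prints only the first line seen for each first-column value avoids depending on which duplicate -u retains.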

re-order first and second column based on third while keeping it intact

Try using rank and cbind in R:

> cbind(df[rank(df$V3), -3], V3=df$V3)
V1 V2 V3
5 snex id3B id3B
1 kex id1A id1A
6 dex id3C id3C
2 sex id1B id1B
3 hex id2A ID2A
4 flex id3A id3A

Sort 3rd column when others are already sorted

Assumptions/Understandings:

  • 1st column has already been sorted based on a 'V'ersion sort
  • we need to maintain the ordering of the 1st column, then sort duplicates by the 3rd column

Adding a few rows to our sample data:

$ cat input.dat
PB.1060.1_1_1000 Chr1 484 817 20733209
PB.1060.1_1_1000 Chr1 1 293 20733996
PB.1060.1_1_1000 Chr1 287 485 20733577
PB.1060.1_2_1001 Chr1 483 816 20733209
PB.1060.1_2_1001 Chr1 286 484 20733577
PB.1060.1_100_1099 Chr1 905 423 20733234
PB.1060.1_100_1099 Chr1 1020 523 20734234
PB.1060.1_1000_1999 Chr1 3422 223 20731234
PB.1060.1_1000_1999 Chr1 200 323 20732234
PB.1060.1_1001_2000 Chr1 900 623 20735234

One sort idea:

sort -k1,1V -k3,3n input.dat

Where:

  • apply a 'V'ersion sort to the 1st column
  • sort the 3rd column as a 'n'umber

This generates:

PB.1060.1_1_1000 Chr1 1 293 20733996
PB.1060.1_1_1000 Chr1 287 485 20733577
PB.1060.1_1_1000 Chr1 484 817 20733209
PB.1060.1_2_1001 Chr1 286 484 20733577
PB.1060.1_2_1001 Chr1 483 816 20733209
PB.1060.1_100_1099 Chr1 905 423 20733234
PB.1060.1_100_1099 Chr1 1020 523 20734234
PB.1060.1_1000_1999 Chr1 200 323 20732234
PB.1060.1_1000_1999 Chr1 3422 223 20731234
PB.1060.1_1001_2000 Chr1 900 623 20735234
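A smaller, self-contained variant of the same idea, using a shuffled subset of the rows above. Note that the V ordering option is a GNU sort extension, so this sketch assumes GNU coreutils.

```shell
# Subset of input.dat, shuffled; -k1,1V needs GNU sort.
result=$(printf '%s\n' \
  'PB.1060.1_100_1099 Chr1 905 423 20733234' \
  'PB.1060.1_2_1001 Chr1 483 816 20733209' \
  'PB.1060.1_1_1000 Chr1 484 817 20733209' \
  'PB.1060.1_1_1000 Chr1 1 293 20733996' \
  'PB.1060.1_2_1001 Chr1 286 484 20733577' |
  LC_ALL=C sort -k1,1V -k3,3n)

# Version sort puts _2_ before _100_ in column 1;
# within each column-1 group, column 3 ascends numerically.
printf '%s\n' "$result"
```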

unix sort by single column only

From the POSIX description of sort:

Except when the -u option is specified, lines that otherwise compare equal shall be ordered as if none of the options -d, -f, -i, -n, or -k were present (but with -r still in effect, if it was specified) and with all bytes in the lines significant to the comparison. The order in which lines that still compare equal are written is unspecified.

So in your case, when two lines have the same value in the second column and thus are equal, the entire lines are then compared to get the final ordering.

GNU sort (and possibly other implementations, though POSIX does not mandate it) has the -s option for a stable sort, in which lines whose keys compare equal keep their original relative order. That appears to be what you want:

$ sort -t, -s -k2,2n chris.num
1,4,3
1,4,1
1,5,2
1,7,2
1,7,1
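The difference between the two behaviours can be seen side by side in the sketch below. The input order is an assumption chosen to be consistent with the output above (the actual chris.num is not shown), and -s assumes GNU sort or another implementation that supports it.

```shell
# Hypothetical input order for chris.num.
input=$(printf '%s\n' '1,7,2' '1,4,3' '1,5,2' '1,4,1' '1,7,1')

# Default: ties on field 2 are broken by comparing the whole line.
default=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -k2,2n)

# Stable: ties on field 2 keep their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -t, -s -k2,2n)

echo "default:"; printf '%s\n' "$default"
echo "stable:";  printf '%s\n' "$stable"
```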

How to sort alphanumeric column in unix with alphabet first and then numeric

Replace -gk2,2 with -k2g (you meant the -k option, right?), then add -k2:

sort -k2g -k2 file

The key definition syntax is F[.C][OPTS][,F[.C][OPTS]], where F is the field number (2 in this case) and OPTS is one or more single-letter ordering options [bdfgiMhnRrV].

Note that you don't need the ,2 suffix: it stops the key at the end of the second column, and the second column is already the last one.

Keys are applied in the order they appear on the command line, i.e. -k2g is applied first, then -k2 breaks any ties.
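A small sketch of how the two keys interact, using made-up two-column data. It assumes GNU sort, where -g (general numeric) is an extension that places values that do not parse as numbers before all numbers and treats them as equal, so the plain -k2 key then orders them alphabetically.

```shell
# Hypothetical two-column input mixing words and numbers in column 2.
result=$(printf '%s\n' \
  'a 10' \
  'b abc' \
  'c 2' \
  'd xyz' |
  LC_ALL=C sort -k2g -k2)

# -k2g: non-numeric values first (all equal), then 2 before 10.
# -k2:  breaks the tie among the non-numeric values alphabetically.
printf '%s\n' "$result"
```

Words such as inf or nan are parsed as numbers by -g, so real data containing them may not land in the alphabetic group.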


