How to Use the Linux sort Command to Sort a Text File by the 4th Column in Numeric Order

How to use the Linux sort command to sort a text file by the 4th column, in numeric order?

sort -nk4 file

-n sorts numerically
-k selects the sort key (here, field 4)

Add the -r option to reverse the order:

sort -nrk4 file
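
As a minimal sketch of the commands above, assuming a hypothetical whitespace-separated file whose fourth column is numeric (the sample rows are invented for illustration). Writing the key as -k4,4n limits the comparison to the fourth field, rather than letting the key run from field 4 to the end of the line:

```shell
# Hypothetical sample data: name, city, department, score
data=$(mktemp)
printf '%s\n' \
  'alice paris sales 30' \
  'bob rome dev 4' \
  'carol oslo ops 125' > "$data"

# Ascending numeric order on column 4 only
ascending=$(LC_ALL=C sort -k4,4n "$data")
# Descending numeric order on column 4 only
descending=$(LC_ALL=C sort -k4,4nr "$data")

printf '%s\n' "$ascending"
rm -f "$data"
```

LC_ALL=C is set only to make the result independent of the current locale.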

How to sort a text file by column and keep the original order

Option -s is what you need (equivalent to --stable):

sort -k11,11 -d -s myfile.txt > sortedfile

The option -k works with a range of fields, so you should probably add ,11 as I did above, otherwise the sorting will use keys spanning from column 11 to the end of line (default).
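
A small sketch of what -s changes, using invented two-column data. Without -s, sort falls back to comparing whole lines when keys are equal; with -s, lines with equal keys keep their input order:

```shell
# Hypothetical data: two lines share the key "b" in column 1
data=$(mktemp)
printf '%s\n' \
  'b second' \
  'a third' \
  'b first' > "$data"

# Stable sort on column 1 only: 'b second' stays ahead of 'b first'
stable=$(LC_ALL=C sort -s -k1,1 "$data")
printf '%s\n' "$stable"
rm -f "$data"
```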

Sorting data based on second column of a file

You can use the key option of the sort command, which takes a "field number", so if you wanted the second column:

sort -k2 -n yourfile

-n, --numeric-sort compare according to string numerical value

For example:

$ cat ages.txt 
Bob 12
Jane 48
Mark 3
Tashi 54

$ sort -k2 -n ages.txt
Mark 3
Bob 12
Jane 48
Tashi 54

How to sort a file in unix both alphabetically and numerically on different fields?

Try something like this:

sort -k1,1 -k4,4n file
  • -n: sorts according to numerical value.
  • -k: sorts on the given field. For example, -k 2 sorts starting at the second column of data. With -k 3,3n -k 4,4n, sort orders lines numerically by the 3rd column and breaks ties using the 4th column.
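
As a rough sketch of this two-key sort, assuming made-up rows in which column 1 is a name and column 4 is a number:

```shell
# Hypothetical data: column 1 is text, column 4 is numeric
data=$(mktemp)
printf '%s\n' \
  'pear x y 10' \
  'apple x y 9' \
  'pear x y 2' \
  'apple x y 100' > "$data"

# Column 1 alphabetically, then column 4 numerically within equal names
sorted=$(LC_ALL=C sort -k1,1 -k4,4n "$data")
printf '%s\n' "$sorted"
rm -f "$data"
```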

Sorting output in numeric order in bash

Using GNU awk:

awk 'NR==1 { print;next } { split($4,map,":");map1[map[1]][map[2]][map[3]]=$0} END { PROCINFO["sorted_in"]="@ind_num_asc";for ( i in map1 ) { for ( j in map1[i]) { for ( k in map1[i][j]) { print map1[i][j][k] } } } }' file

Explanation:

awk 'NR==1 {
    print    # Print the first line (the header) unchanged...
    next     # ...and skip to the next line
}
{
    split($4, map, ":")    # Split the fourth whitespace-separated field into the array map, using ":" as the delimiter
    # Store the line in a 3-dimensional array map1, indexed by the first, second and third elements of map.
    # Lines with identical indexes overwrite each other.
    map1[map[1]][map[2]][map[3]] = $0
}
END {
    PROCINFO["sorted_in"] = "@ind_num_asc"    # Traverse arrays in ascending numeric index order (gawk only)
    for (i in map1) {
        for (j in map1[i]) {
            for (k in map1[i][j]) {
                print map1[i][j][k]    # After the whole file is read, walk map1 and print the stored lines
            }
        }
    }
}' file

Is it possible to sort a huge text file using Linux sort command by a number at the end of each line?

It is a bit tricky, but this combination of commands does it:

awk '$1=$NF" "$1' file | sort -n | cut -d' ' -f2-

The main idea is to print the file with the last value prepended to each line, sort on that value, and finally remove it from the output.

  • awk '$1=$NF" "$1' file As the value you want to sort by is the last one on the line, print it in front of the first field as well.
  • sort -n Then pipe to sort -n, which sorts numerically.
  • cut -d' ' -f2- Finally, cut removes the value that was temporarily added at the front.
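
One caveat: assigning to $1 makes awk rebuild the line, so runs of spaces or tabs between fields collapse to single spaces. A possible variant, shown here as a sketch on invented data rather than as the answer's own method, prepends the key with a tab and leaves $0 untouched:

```shell
# Hypothetical data with double spaces that should survive the round trip
data=$(mktemp)
printf '%s\n' \
  'x  ||| a  79' \
  'x  ||| a  6' \
  'x  ||| a  19' > "$data"

tab=$(printf '\t')

# Decorate: prepend the last field plus a tab; $0 itself is not modified
# Sort: numeric comparison on the first tab-separated field only
# Undecorate: cut uses tab as its default delimiter, so -f2- drops the key
by_last=$(awk '{ print $NF "\t" $0 }' "$data" \
  | LC_ALL=C sort -t "$tab" -k1,1n \
  | cut -f2-)

printf '%s\n' "$by_last"
rm -f "$data"
```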

Test

$ cat a
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
$ awk '$1=$NF" "$1' a | sort -n | cut -d' ' -f2-
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89

Showing each step:

$ awk '$1=$NF" "$1' a 
6 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
79 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
19 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
8 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
89 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
$ awk '$1=$NF" "$1' a | sort -n
6 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
8 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
19 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
79 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
89 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
$ awk '$1=$NF" "$1' a | sort -n | cut -d' ' -f2-
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89

Sorting multiple keys with Unix sort

Use the -k option (or --key=POS1[,POS2]). It can appear multiple times, and each key can carry its own ordering options (such as n for numeric sort), which apply to that key instead of the global ones.
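
To illustrate per-key options, here is a sketch on invented comma-separated records (name, department, salary). The -t option sets the field separator; the second key is plain text, while the third is numeric and reversed:

```shell
# Hypothetical CSV data: name,department,salary
data=$(mktemp)
printf '%s\n' \
  'tom,dev,30' \
  'ann,dev,45' \
  'joe,ops,20' > "$data"

# Department ascending (text), then salary descending (numeric) within it
by_dept=$(LC_ALL=C sort -t, -k2,2 -k3,3nr "$data")
printf '%s\n' "$by_dept"
rm -f "$data"
```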

How to sort a file, based on its numerical values for a field?

Take a peek at the man page for sort...

   -n, --numeric-sort
          compare according to string numerical value

So here is an example...

sort -n filename
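
A quick sketch of why -n matters, using a throwaway file of invented numbers. Plain sort compares character by character, so "100" lands before "2":

```shell
data=$(mktemp)
printf '%s\n' 10 9 100 2 > "$data"

# Default (lexical) order: 10, 100, 2, 9
lexical=$(LC_ALL=C sort "$data")
# Numeric order: 2, 9, 10, 100
numeric=$(LC_ALL=C sort -n "$data")

printf '%s\n' "$numeric"
rm -f "$data"
```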

Shell script - Sorting text file by column that will & will not exist

Assuming there won't be embedded commas before the sort field, you could use awk to help with the sort and remove any extraneous information afterward:

awk -F',' -v i=0 '
/^INSERT/ { i=$4 }                        # remember field 4 of the most recent INSERT line
{ printf("%08d%08d\t%s\n", i, NR, $0) }   # prefix: padded key, padded line number, tab
' < data | sort | sed -e 's/^[^\t]*\t//' > newdata

That feeds the input file "data" to awk, which prepends a sort prefix of the form XXXXXXXXNNNNNNNN to each line. XXXXXXXX is the 8-digit, zero-padded value of the fourth field of the last INSERT ... line seen, and NNNNNNNN is the record number (line number), also zero-padded to 8 digits. The prefix and the original line are separated by a tab character. If the file doesn't start with an INSERT line (e.g. a blank line, a comment explaining the contents of the file, etc.), those leading lines are treated as if the last INSERT line had a fourth field with a value of 0. Here's a clipped sample of the resulting output with some additional lines inserted for testing:

0000000000000001    # This file contains data.
0000001600000002 INSERT,SLT_TEST_5,1218738496,16,DEBUG3,,DEBUG_LEVEL1
0000001600000003 <v s=""MONTHLY_PEAK_DWNLOAD""/>
0000001600000004 </a><a n=""thresholdScheme"">
0000001600000005 <o t=""PM_UsageMonitorConfigThreshold"">
0000001800000006 INSERT,SLT_TEST_1,1218738496,18,DEBUG3,,DEBUG_LEVEL4
0000001800000007 <v s=""ORANGE""/>
0000000500000008 INSERT,SLT_TEST_3,5555738111,5,DEBUG3,,DEBUG_LEVEL1
0000000700000009 INSERT,SLT_TEST_1,9998738342,7,DEBUG3,,DEBUG_LEVEL2
0000000700000010 I'm a little teapot.

The resulting lines are passed to sort, which orders them by the prefix. sed then strips that prefix from each line. The result is sorted by the fourth field of the INSERT lines, while lines that don't begin with INSERT stay attached, in their original order, to the INSERT line that precedes them. The output is written to the file "newdata":

# This file contains data.
INSERT,SLT_TEST_3,5555738111,5,DEBUG3,,DEBUG_LEVEL1
INSERT,SLT_TEST_1,9998738342,7,DEBUG3,,DEBUG_LEVEL2
I'm a little teapot.
INSERT,SLT_TEST_5,1218738496,16,DEBUG3,,DEBUG_LEVEL1
<v s=""MONTHLY_PEAK_DWNLOAD""/>
</a><a n=""thresholdScheme"">
<o t=""PM_UsageMonitorConfigThreshold"">
INSERT,SLT_TEST_1,1218738496,18,DEBUG3,,DEBUG_LEVEL4
<v s=""ORANGE""/>
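
The same pipeline can be tried on a few invented lines. This sketch swaps the sed step for cut -f2-, which removes everything up to the first tab without relying on \t being understood inside a sed bracket expression (an assumption about portability, not part of the original answer):

```shell
# Hypothetical miniature input: a comment, then two INSERT blocks
data=$(mktemp)
printf '%s\n' \
  '# header comment' \
  'INSERT,A,111,16,X' \
  'child of 16' \
  'INSERT,B,222,5,X' \
  'child of 5' > "$data"

# Prefix each line with <field 4 of the last INSERT line><line number>,
# both zero-padded to 8 digits, then a tab; sort on that prefix;
# strip the prefix again with cut (tab is its default delimiter).
result=$(awk -F',' -v i=0 '
  /^INSERT/ { i = $4 }
  { printf("%08d%08d\t%s\n", i, NR, $0) }
' "$data" | LC_ALL=C sort | cut -f2-)

printf '%s\n' "$result"
rm -f "$data"
```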

