How to use the Linux sort command to sort a text file by the 4th column in numeric order?
sort -nk4 file
-n sorts numerically
-k specifies the sort key (here, field 4)
Add the -r option for reverse sorting:
sort -nrk4 file
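As a minimal sketch of the commands above, here is a run on a small made-up file (the temporary file and its contents are assumptions for illustration only). It writes the key as -k4,4n so that the key stops at field 4, and sets LC_ALL=C so the result does not depend on the locale.

```shell
# Hypothetical sample data: name, team, city, score (whitespace-separated).
f=$(mktemp)
cat > "$f" <<'EOF'
ann red oslo 10
bob blue rome 9
cy red lima 100
EOF

# Ascending numeric sort on field 4 only.
asc=$(LC_ALL=C sort -k4,4n "$f")

# Descending numeric sort on the same field.
desc=$(LC_ALL=C sort -k4,4nr "$f")

printf '%s\n' "$asc"
printf '%s\n' "$desc"
rm -f "$f"
```

A plain text sort would put 100 before 9; the numeric flag is what orders the scores as 9, 10, 100.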
How to sort a text file by column and keep the original order
Option -s is what you need (equivalent to --stable):
sort -k11,11 -d -s myfile.txt > sortedfile
The option -k works with a range of fields, so you should probably add ,11 as I did above; otherwise the sort key spans from column 11 to the end of the line (the default).
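To see what -s changes, here is a small sketch on made-up two-column data (the file and its contents are hypothetical, and it sorts on column 1 rather than column 11 to keep the sample short). LC_ALL=C is set only to make the comparison order predictable.

```shell
# Hypothetical data: sort on column 1 only.
f=$(mktemp)
cat > "$f" <<'EOF'
b 2
a 9
b 1
a 3
EOF

# With -s, lines whose keys compare equal keep their input order.
stable=$(LC_ALL=C sort -s -k1,1 "$f")

# Without -s, sort breaks ties by comparing the whole line.
unstable=$(LC_ALL=C sort -k1,1 "$f")

printf 'stable:\n%s\n' "$stable"
printf 'unstable:\n%s\n' "$unstable"
rm -f "$f"
```

In the stable output "a 9" stays ahead of "a 3" because that is the order they had in the input; without -s the tie is broken by the rest of the line.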
Sorting data based on second column of a file
You can use the key option of the sort command, which takes a "field number", so if you wanted the second column:
sort -k2 -n yourfile
-n, --numeric-sort
compare according to string numerical value
For example:
$ cat ages.txt
Bob 12
Jane 48
Mark 3
Tashi 54
$ sort -k2 -n ages.txt
Mark 3
Bob 12
Jane 48
Tashi 54
How to sort a file in unix both alphabetically and numerically on different fields?
Try it like this:
sort -k1,1 -k4,4n
- -n: sorts according to numerical value
- -k: sorts on the given field (column). For example, -k 2 sorts starting at the second column, and -k 3,3n -k 4,4n sorts numerically by the 3rd column first, using the 4th column to break ties.
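Here is a quick sketch of the command above on invented data (the file contents are an assumption; only the sort invocation comes from the answer). Field 1 is compared as text and field 4 as a number.

```shell
# Hypothetical four-column data.
f=$(mktemp)
cat > "$f" <<'EOF'
pear x y 10
apple x y 20
pear x y 9
apple x y 3
EOF

# Primary key: field 1 (alphabetical). Secondary key: field 4 (numeric).
sorted=$(LC_ALL=C sort -k1,1 -k4,4n "$f")

printf '%s\n' "$sorted"
rm -f "$f"
```

Within each fruit the rows are ordered 3, 20 and 9, 10, which shows that the n modifier applies to the fourth field only.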
Sorting an output in numeric order in bash
Using GNU awk:
awk 'NR==1 { print;next } { split($4,map,":");map1[map[1]][map[2]][map[3]]=$0} END { PROCINFO["sorted_in"]="@ind_num_asc";for ( i in map1 ) { for ( j in map1[i]) { for ( k in map1[i][j]) { print map1[i][j][k] } } } }' file
Explanation:
awk 'NR==1 {
print; # For the first line ( headers) print and skip to the next line
next
}
{ split($4,map,":"); # Split the fourth space separated field into the array map using ":" as the delimiter
map1[map[1]][map[2]][map[3]]=$0 # Create a 3-dimensional array called map1, indexed by the first, second and third elements of map, with the line as the value
}
END {
PROCINFO["sorted_in"]="@ind_num_asc"; # Set the array ordering (index number ascending)
for ( i in map1 ) {
for ( j in map1[i]) {
for ( k in map1[i][j]) {
print map1[i][j][k] # At the end of processing the file, loop through the map1 array and print the values.
}
}
}
}' file
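If GNU awk is not available, a decorate, sort, undecorate pipeline is one possible alternative. This is only a sketch under assumptions: the sample file, its header, and the h:m:s-style fourth field are made up, and the pipeline is a swapped-in technique, not the method from the answer above. Unlike the array version, which overwrites an earlier line when two lines share the same three-part key, this keeps every line.

```shell
# Hypothetical input: a header line, then rows whose 4th field is a:b:c.
f=$(mktemp)
cat > "$f" <<'EOF'
id name status time
1 a ok 10:5:30
2 b ok 2:40:1
3 c ok 10:5:4
4 d ok 2:7:59
EOF

sorted=$(
  # Print the header untouched.
  head -n 1 "$f"
  # Prefix each data row with the three numeric parts, sort on them,
  # then cut the three helper fields off again.
  tail -n +2 "$f" |
    awk '{ split($4, t, ":"); print t[1], t[2], t[3], $0 }' |
    LC_ALL=C sort -k1,1n -k2,2n -k3,3n |
    cut -d' ' -f4-
)

printf '%s\n' "$sorted"
rm -f "$f"
```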
Is it possible to sort a huge text file using Linux sort command by a number at the end of each line?
It is a bit tricky, but this mix of commands can do it:
awk '$1=$NF" "$1' file | sort -n | cut -d' ' -f2-
The main idea is to print each line with its last value prepended, then sort, and finally remove that value from the output.
awk '$1=$NF" "$1' file
The value to sort by is the last field of each line, so we also print it in the first field.
sort -n
Then we pipe to sort -n, which sorts numerically.
cut -d' ' -f2-
Finally, we strip the value we temporarily added.
Test
$ cat a
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
$ awk '$1=$NF" "$1' a | sort -n | cut -d' ' -f2-
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
Showing each step:
$ awk '$1=$NF" "$1' a
6 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
79 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
19 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
8 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
89 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
$ awk '$1=$NF" "$1' a | sort -n
6 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
8 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
19 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
79 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
89 ! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
$ awk '$1=$NF" "$1' a | sort -n | cut -d' ' -f2-
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 6
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 8
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 19
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 79
! ! ! ! ! ||| ! ||| 1.25846e-05 0.248369 3.02708e-07 0.662955 2.718 ||| 0-0 1-0 2-0 3-0 4-0 ||| 476773 1.98211e+07 89
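One side effect worth knowing: assigning to $1 makes awk rebuild the line, so runs of spaces or tabs in the original are collapsed to single spaces. If the original spacing matters, a variant of the same idea is sketched below. It is an assumption-laden sketch, not the answer's method: the sample data is invented, it uses a tab to separate the helper value from the untouched line, and it adds -s so lines with equal numbers keep their input order.

```shell
# Hypothetical input with irregular spacing that should survive the sort.
f=$(mktemp)
cat > "$f" <<'EOF'
x  y 6
x  y 79
x  y 19
EOF

# Decorate with "<last field><TAB><original line>", sort numerically on the
# leading number, then drop the helper field (cut splits on tabs by default).
result=$(awk '{ print $NF "\t" $0 }' "$f" | LC_ALL=C sort -s -n | cut -f2-)

printf '%s\n' "$result"
rm -f "$f"
```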
Sorting multiple keys with Unix sort
Use the -k option (or --key=POS1[,POS2]). It can appear multiple times, and each key can carry its own ordering options (such as n for numeric sort), which override the global ones for that key.
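As an illustration of per-key options, here is a small sketch on made-up name/age data (the file contents are hypothetical). The first key is numeric and reversed, the second is a plain text key used only to break ties.

```shell
# Hypothetical data: name and age.
f=$(mktemp)
cat > "$f" <<'EOF'
bob 12
jane 48
amy 48
mark 3
EOF

# Key 1: field 2, numeric, descending. Key 2: field 1, text, ascending.
sorted=$(LC_ALL=C sort -k2,2nr -k1,1 "$f")

printf '%s\n' "$sorted"
rm -f "$f"
```

The two 48-year-olds come out as amy, then jane, because the r modifier is attached to the first key only.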
How to sort a file, based on its numerical values for a field?
Take a peek at the man page for sort...
-n, --numeric-sort
compare according to string numerical value
So here is an example...
sort -n filename
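Note that sort -n filename compares the number at the start of each line. To sort on a particular field instead, -n can be combined with -k, and with -t when the fields are not separated by blanks. The sketch below assumes a made-up comma-separated file; the data is purely illustrative.

```shell
# Hypothetical CSV data: item, colour, quantity.
f=$(mktemp)
cat > "$f" <<'EOF'
widget,blue,30
gadget,red,4
gizmo,green,125
EOF

# -t, sets the field separator; -k3,3n sorts numerically on field 3 only.
by_field=$(LC_ALL=C sort -t, -k3,3n "$f")

printf '%s\n' "$by_field"
rm -f "$f"
```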
Shell script - Sorting text file by column that will & will not exist
Assuming there won't be embedded commas before the sort field, you could use awk to help with the sort and remove any extraneous information afterward:
awk -F',' -vi=0 '
/^INSERT/ { i=$4 }
{ printf("%08d%08d\t%s\n", i, NR, $0) }
' < data | sort | sed -e 's/^[^\t]*\t//' > newdata
That just feeds an input file "data" to awk, which inserts a formatted copy of the 4th field before the data itself in the form XXXXXXXXNNNNNNNN, where XXXXXXXX is an 8-digit representation of the value in the fourth field of the last INSERT ... line found, and NNNNNNNN is the record number (line number) formatted as an 8-digit value. The original data and the formatted data are separated by a tab character. A special case in which the file doesn't start with INSERT (e.g. a blank line, a comment explaining the contents of the file, etc.) is treated as if the last INSERT line had a fourth field with a value of 0. Here's a clipped sample of the resulting output with some additional lines inserted for testing:
0000000000000001 # This file contains data.
0000001600000002 INSERT,SLT_TEST_5,1218738496,16,DEBUG3,,DEBUG_LEVEL1
0000001600000003 <v s=""MONTHLY_PEAK_DWNLOAD""/>
0000001600000004 </a><a n=""thresholdScheme"">
0000001600000005 <o t=""PM_UsageMonitorConfigThreshold"">
0000001800000006 INSERT,SLT_TEST_1,1218738496,18,DEBUG3,,DEBUG_LEVEL4
0000001800000007 <v s=""ORANGE""/>
0000000500000008 INSERT,SLT_TEST_3,5555738111,5,DEBUG3,,DEBUG_LEVEL1
0000000700000009 INSERT,SLT_TEST_1,9998738342,7,DEBUG3,,DEBUG_LEVEL2
0000000700000010 I'm a little teapot.
The resulting lines are passed to sort, which sorts the lines it receives. sed then strips from the sorted output the information the awk script originally created, resulting in a sort based on the fourth field without altering the order of the lines that don't begin with INSERT, which is output to a file "newdata":
# This file contains data.
INSERT,SLT_TEST_3,5555738111,5,DEBUG3,,DEBUG_LEVEL1
INSERT,SLT_TEST_1,9998738342,7,DEBUG3,,DEBUG_LEVEL2
I'm a little teapot.
INSERT,SLT_TEST_5,1218738496,16,DEBUG3,,DEBUG_LEVEL1
<v s=""MONTHLY_PEAK_DWNLOAD""/>
</a><a n=""thresholdScheme"">
<o t=""PM_UsageMonitorConfigThreshold"">
INSERT,SLT_TEST_1,1218738496,18,DEBUG3,,DEBUG_LEVEL4
<v s=""ORANGE""/>