Sorting in bash
Use:
cut -f <col_num> <filename> |
  sort |
  uniq -c |
  sort -r -k1 -n |
  awk '{print $2" "$1}'
The sort -r -k1 -n stage sorts in reverse (descending) order, treating the first field, the count produced by uniq -c, as a numeric value. The awk stage simply swaps the two columns. You can test the added pipeline commands thus (with nicer formatting):
pax> echo '105 Linux
55 MacOS
500 Windows' | sort -r -k1 -n | awk '{printf "%-10s %5d\n",$2,$1}'
Windows      500
Linux        105
MacOS         55
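For a quick end-to-end check, here is a minimal sketch of the whole pipeline run against made-up, tab-separated data. The names and the two-column layout are assumptions for illustration only, not part of the original question:

```shell
# Hypothetical input: name<TAB>operating system
data=$(printf 'alice\tLinux\nbob\tWindows\ncarol\tLinux\ndave\tMacOS\nerin\tLinux\nfrank\tWindows\n')

# Count how often each value of column 2 occurs, most frequent first
counts=$(printf '%s\n' "$data" |
  cut -f 2 |
  sort |
  uniq -c |
  sort -r -k1 -n |
  awk '{print $2" "$1}')

printf '%s\n' "$counts"
```

With this input the pipeline should print Linux 3, Windows 2 and MacOS 1, one per line.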
Terminal: SORT command; how to sort correctly?
If I understand the problem correctly, you want the "natural sort order" as described in Natural sort order - Wikipedia, Sorting for Humans : Natural Sort Order, and macos - How does finder sort folders when they contain digits and characters?.
Using Linux sort(1) you need the -V (--version-sort) option for "natural" sort. You also need the -f (--ignore-case) option to disregard the case of letters. So, assuming that the file names are stored one-per-line in a file called files.txt, you can produce a list (mostly) sorted in the way that you want with:
sort -Vf files.txt
However, sort -Vf sorts underscores after digits and letters on my system. I've tried using different locales (see How to set locale in the current terminal's session?), but with no success. I can't see a way to change this with sort options (but I may be missing something).
The characters . and ~ seem to consistently sort before numbers and letters with sort -V. A possible hack to work around the problem is to swap underscore with one of them, sort, and then swap again. For example:
tr '_~' '~_' <files.txt | LC_ALL=C sort -Vf | tr '_~' '~_'
seems to do what you want on my system. I've explicitly set the locale for the sort command with LC_ALL=C ... so it should behave the same on other systems. (See Why doesn't sort sort the same on every machine?.)
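To see the difference between plain and version sort, here is a small sketch. It assumes a sort that supports -V (GNU coreutils does), and the file names are invented for the example:

```shell
# Hypothetical file names, one per line
names=$(printf '%s\n' file10.txt file2.txt file1.txt)

# Plain byte-wise sort: "file10" lands before "file2"
plain=$(printf '%s\n' "$names" | LC_ALL=C sort)

# Version ("natural") sort: embedded numbers compare numerically
natural=$(printf '%s\n' "$names" | LC_ALL=C sort -V)

printf 'plain:\n%s\n\nnatural:\n%s\n' "$plain" "$natural"
```

Adding -f on top of -V, as in the answer above, additionally folds case before comparing.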
sort by datetime format in bash
sort is able to deal with month names, thanks to the -M option, so there is no need to change , into !. Use white space as the delimiter and just issue:
LC_ALL=en sort -k7nr -k5Mr -k6nr -k2r sample
If you use this as the content of the file sample:
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Apr 1 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Feb 28 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2020, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
you will get this as output:
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Apr 1 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Feb 28 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2020, foo=moshe2, bar=haim2
Specifying -k7 means to sort on the seventh field (strictly, the key runs from the start of that field to the end of the line; -k7,7 would limit it to the field itself). The r option reverses the order of sorting to descending. The M option sorts according to the name of the month. The n option sorts numerically. To sort on the time, just treat the whole second field (beginning with the string setup_time=) as a fixed-length string using -k2.
LC_ALL=en at the beginning of the command line tells the system to use the English names of the months.
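The same idea can be tried on a reduced input. The sketch below is a variation, not the exact command above: it assumes lines that contain only "month day year", bounds each key to a single field (-k3,3 and so on), and uses LC_ALL=C, where month names are also English:

```shell
# Hypothetical dates in "Mon D YYYY" form
dates='Mar 18 2020
Apr 1 2021
Mar 17 2021
Feb 28 2021
Mar 18 2021'

# Newest first: year (numeric), then month name (-M), then day (numeric)
sorted=$(printf '%s\n' "$dates" | LC_ALL=C sort -k3,3nr -k1,1Mr -k2,2nr)

printf '%s\n' "$sorted"
```

Bounding each key with "start,end" keeps a later field from leaking into an earlier key's comparison.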
How to sort an array in Bash
You don't really need all that much code:
IFS=$'\n' sorted=($(sort <<<"${array[*]}"))
unset IFS
Supports whitespace in elements (as long as it's not a newline), and works in Bash 3.x.
e.g.:
$ array=("a c" b f "3 5")
$ IFS=$'\n' sorted=($(sort <<<"${array[*]}")); unset IFS
$ printf "[%s]\n" "${sorted[@]}"
[3 5]
[a c]
[b]
[f]
Note: @sorontar has pointed out that care is required if elements contain wildcards such as * or ?:
The sorted=($(...)) part is using the "split and glob" operator. You should turn glob off: set -f or set -o noglob or shopt -so noglob, or an element of the array like * will be expanded to a list of files.
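A minimal sketch of that precaution, assuming Bash and an array that contains a literal * element (the array contents are made up for the example):

```shell
array=("b" "*" "a c")

set -f                                   # turn off pathname expansion
IFS=$'\n' sorted=($(LC_ALL=C sort <<<"${array[*]}"))
unset IFS                                # back to default word splitting
set +f                                   # turn pathname expansion back on

printf '[%s]\n' "${sorted[@]}"
```

Without set -f, the unquoted $(...) would let the shell replace * with the names of the files in the current directory, if there are any.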
What's happening:
The result is a culmination of six things that happen in this order:
1. IFS=$'\n'
2. "${array[*]}"
3. <<<
4. sort
5. sorted=($(...))
6. unset IFS
First, the IFS=$'\n'
This is an important part of our operation that affects the outcome of 2 and 5 in the following way:
Given:
- "${array[*]}" expands to every element delimited by the first character of IFS
- sorted=() creates elements by splitting on every character of IFS
IFS=$'\n' sets things up so that elements are expanded using a new line as the delimiter, and then later created in a way that each line becomes an element. (i.e. Splitting on a new line.)
Delimiting by a new line is important because that's how sort operates (sorting per line). Splitting by only a new line is not as important, but is needed to preserve elements that contain spaces or tabs.
The default value of IFS is a space, a tab, followed by a new line, and would be unfit for our operation.
Next, the sort <<<"${array[*]}" part
<<<, called here strings, takes the expansion of "${array[*]}", as explained above, and feeds it into the standard input of sort.
With our example, sort is fed the following string:
a c
b
f
3 5
Since sort sorts, it produces:
3 5
a c
b
f
Next, the sorted=($(...)) part
The $(...) part, called command substitution, causes its content (sort <<<"${array[*]}") to run as a normal command, while taking the resulting standard output as the literal that goes wherever $(...) was.
In our example, this produces something similar to simply writing:
sorted=(3 5
a c
b
f
)
sorted then becomes an array that's created by splitting this literal on every new line.
Finally, the unset IFS
With IFS unset, the shell behaves as if it had its default value, so this effectively restores default word splitting, and is just good practice.
It's to ensure we don't cause trouble with anything that relies on IFS later in our script. (Otherwise we'd need to remember that we've switched things around, something that might be impractical for complex scripts.)
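If Bash 4 or later is available, an alternative sketch sidesteps both the IFS juggling and the globbing concern by reading sort's output with the mapfile builtin. This is a different technique from the one explained above, and the array contents are again invented:

```shell
array=("a c" b f "3 5" "*")

# printf emits one element per line; mapfile -t reads each line into
# one array element without word splitting or pathname expansion
mapfile -t sorted < <(printf '%s\n' "${array[@]}" | LC_ALL=C sort)

printf '[%s]\n' "${sorted[@]}"
```

Like the original one-liner, this still cannot handle elements that contain a newline, and it will not work in Bash 3.x, which lacks mapfile.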
bash: sort applied to a file returns right results as terminal output, but does not change the file itself
SOLVED
From this thread it turns out that redirecting the output of sort into the same file that sort reads from will not work, since
the shell makes the redirections (not the sort(1) program) and the
input file (as being the output also) will be erased just before
giving the sort(1) program the opportunity of reading it.
So I have split my command into two:
sort -k1 -n source-g5.txt > tmp-source-g5.txt
mv tmp-source-g5.txt source-g5.txt
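A shorter route, sketched here on a throwaway file, is sort's own -o option: sort reads all of its input before it opens the output file, so the output may safely be the same file as the input. The file contents below are made up for the demonstration:

```shell
# Work in a temporary directory so nothing real is overwritten
tmpdir=$(mktemp -d)
file="$tmpdir/source-g5.txt"
printf '10 ten\n2 two\n33 thirty-three\n' > "$file"

# Sort numerically on the first field and write the result back in place
sort -n -k1,1 -o "$file" "$file"

result=$(cat "$file")
printf '%s\n' "$result"
rm -r "$tmpdir"
```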
Bash : sort command do not treat dots
When sorting, your current locale is influencing the order. If you want locale independent order, use the C locale:
IFS=$'\n'; echo "${a[*]}" | LC_ALL=C sort -d; unset IFS
Setting LC_COLLATE should be enough, in fact (unless LC_ALL is already set in the environment, since LC_ALL overrides it).
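As an illustration, assuming a small array a with made-up names that differ only in punctuation, the C locale compares raw byte values, so the dot (0x2E) sorts before the underscore (0x5F):

```shell
a=("b.txt" "a_b" "a.b")

# Byte-wise comparison, independent of the user's locale settings
sorted=$(printf '%s\n' "${a[@]}" | LC_ALL=C sort)

printf '%s\n' "$sorted"
```

Under a locale such as en_US.UTF-8, punctuation is often given low priority during collation, which is why the order can look as if dots were not being treated at all.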
Sorting files in bash
For this dataset, sort only on the first field.
$: printf "%s\n" V0.1__file_a.sql V0.2__file_b.sql V0__file_c.sql | sort -t _ -k 1,1
V0__file_c.sql
V0.1__file_a.sql
V0.2__file_b.sql
Using -k 1,2 fails for me also unless I use a dictionary sort with it (-d).
$: printf "%s\n" V0.1__file_a.sql V0.2__file_b.sql V0__file_c.sql | sort -t _ -k 1,2
V0.1__file_a.sql
V0.2__file_b.sql
V0__file_c.sql
but works with -d
$: printf "%s\n" V0.1__file_a.sql V0.2__file_b.sql V0__file_c.sql | sort -d -t _ -k 1,2
V0__file_c.sql
V0.1__file_a.sql
V0.2__file_b.sql
Dictionary sort will "consider only blanks and alphanumeric characters", so the dots and underscores are ignored, making all the filenames single strings of alphanumerics, and numbers as characters sort to the top.
-d alone still fails though - you need to establish fields.
$: printf "%s\n" V0.1__file_a.sql V0.2__file_b.sql V0__file_c.sql | sort -d
V0.1__file_a.sql
V0.2__file_b.sql
V0__file_c.sql
Using -t _ sets underscore as the delimiter, but sort is ignoring it on my implementation as well if I don't explicitly tell it to use a key field.
The combination forces V0 to be compared to V01 and V02 without comparing underscores to dots, so you get the order you wanted.
How to sort data according to the date in bash?
The relevant field must be rendered in a form suitable for sorting, that is, YYYY-MM-DD, using a utility such as sed or awk. For example, with GNU sed:
sed -E 's/([0-9]{2})-([0-9]{2})-([0-9]{4})/\3-\2-\1/' employees.txt |
sort -r -t'|' -k5,5 | head -n1 | cut -d'|' -f2
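The question's employees.txt is not shown here, so the following sketch assumes a pipe-delimited layout with the name in field 2 and a DD-MM-YYYY date in field 5. The records themselves are hypothetical:

```shell
# Hypothetical records: id|name|role|city|DD-MM-YYYY
data='1|Alice|Dev|NY|15-03-2019
2|Bob|Ops|LA|02-11-2021
3|Carol|QA|SF|28-02-2020'

# Rewrite the date as YYYY-MM-DD so that plain string order equals date
# order, sort descending on field 5, then print the name from the top line
latest=$(printf '%s\n' "$data" |
  sed -E 's/([0-9]{2})-([0-9]{2})-([0-9]{4})/\3-\2-\1/' |
  sort -r -t'|' -k5,5 | head -n1 | cut -d'|' -f2)

printf '%s\n' "$latest"
```

With these three records the most recent date is 02-11-2021, so the pipeline prints Bob.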
Linux bash scripting: sorting a list to use
$@ is an array (of all script parameters), so you can sort using
OIFS="$IFS" # save IFS
IFS=$'\n' sorted=($(sort -n <<<"$*"))
IFS="$OIFS" # restore IFS
and then use the result like so:
for I in "${sorted[@]}"; do
...
done
Explanation:
- IFS is an internal shell variable (internal field separator) which tells the shell which characters separate words (default is space, tab and newline).
- $'\n' expands to a single newline. When the shell expands $*, it will now put a newline between each element.
- sort -n <<< feeds the "one argument per line" text to sort, which sorts numerically (-n).
- sorted=($(...)) creates a new array with the result of the command ...
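To try this without writing a script file, the positional parameters can be set by hand with set --. The values below are arbitrary sample arguments:

```shell
# Stand-in for the script's arguments
set -- 10 9 100 1

OIFS="$IFS"                              # save IFS
IFS=$'\n' sorted=($(sort -n <<<"$*"))    # one argument per line into sort
IFS="$OIFS"                              # restore IFS

for i in "${sorted[@]}"; do
  printf '%s\n' "$i"
done
```

The loop prints 1, 9, 10 and 100 in that order. As with the array case, arguments containing glob characters would need set -f around the assignment.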
See also:
- How to sort an array in BASH