sort not sorting as expected (space and locale)
sort uses the system locale to determine the collation order of characters. My guess is that with your locale, it ignores whitespace when comparing lines.
$ cat foo.txt
v 1006
v10 1
v 1011
$ LC_ALL=C sort foo.txt
v 1006
v 1011
v10 1
$ LC_ALL=en_US.utf8 sort foo.txt
v 1006
v10 1
v 1011
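The C-locale result follows directly from raw byte values. As a minimal sketch (the variable name bytes is only for illustration), the following prints the hex codes of a space and of the digit 1:

```shell
# Dump the bytes of a space followed by the digit 1 as hex.
# od -An -tx1 prints one hex pair per byte; tr strips the padding.
bytes=$(printf ' 1' | od -An -tx1 | tr -d ' \n')
echo "$bytes"    # prints 2031
```

The space is byte 0x20 and the digit is 0x31, so under LC_ALL=C the line "v 1011" compares lower than "v10 1" at the second character. A locale such as en_US.utf8 may give whitespace little or no weight, which would explain the other ordering.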
Bash: sort command does not treat dots
When sorting, your current locale influences the order. If you want a locale-independent order, use the C locale:
IFS=$'\n'; echo "${a[*]}" | LC_ALL=C sort -d; unset IFS
Setting LC_COLLATE should be enough, in fact, provided LC_ALL is not set, since LC_ALL overrides it.
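As a rough sketch of the narrower setting, the following sorts three made-up lines containing dots with only LC_COLLATE changed. The sample values and the variable name sorted are assumptions for illustration, not data from the question.

```shell
# Command substitution runs in a subshell, so the unset does not leak out.
sorted=$(
  unset LC_ALL    # LC_ALL, if set, would take precedence over LC_COLLATE
  printf 'a.b\nab\na.c\n' | LC_COLLATE=C sort
)
printf '%s\n' "$sorted"
```

With byte-order collation the dot (0x2E) sorts before any letter, so this should print a.b, a.c, ab, one per line.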
Why does the sort command sort differently if there are trailing fields?
The man page for my version of sort says:
*** WARNING *** The locale specified by the environment affects sort order.
Set LC_ALL=C to get the traditional sort order that uses native byte values.
And indeed, if I set LC_ALL=C and run sort on your second example, I get:
$ LC_ALL=C sort < tosort
a 12
a01 7
a02 42
Your default locale is probably something other than C.
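To see which setting is in play, a small sketch like the following may help. It prints the three environment variables that decide collation, in order of precedence, and then repeats the byte-order sort. The input lines are reconstructed from the output above, and the variable name c_sorted is illustrative.

```shell
# Precedence for collation: LC_ALL, then LC_COLLATE, then LANG.
echo "LC_ALL=${LC_ALL:-<unset>} LC_COLLATE=${LC_COLLATE:-<unset>} LANG=${LANG:-<unset>}"

# Hypothetical input; the contents of tosort are not shown here.
c_sorted=$(printf 'a01 7\na 12\na02 42\n' | LC_ALL=C sort)
printf '%s\n' "$c_sorted"
```

If the first line shows something like en_US.UTF-8 for any of the three variables, a plain sort without the LC_ALL=C prefix will use that locale's collation rules.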
SciTE sort selection tool : numbers with leading spaces are not sorted as expected
Your expectation is wrong. You said the algorithm is supposed to sort the texts alphabetically and that is exactly what it does.
For Lua, the string "11" is smaller than the string "2" because strings are compared character by character.
I think you would agree that "aa" should come before "b", which is pretty much the same thing.
If you want to change how texts are sorted you have to provide your own function.
The Lua reference manual says:
table.sort (list [, comp])
Sorts list elements in a given order, in-place, from list[1] to
list[#list]. If comp is given, then it must be a function that
receives two list elements and returns true when the first element
must come before the second in the final order (so that, after the
sort, i < j implies not comp(list[j],list[i])). If comp is not given,
then the standard Lua operator < is used instead.
Note that the comp function must define a strict partial order over
the elements in the list; that is, it must be asymmetric and
transitive. Otherwise, no valid sort may be possible.
The sort algorithm is not stable: elements considered equal by the
given order may have their relative positions changed by the sort.
So you are free to implement your own comp function to change the sorting.
By default, table.sort(list) sorts list in ascending order.
To make it sort in descending order you call:
table.sort(list, function(a,b) return a > b end)
If you want to treat numbers differently you can do something like this:
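local t = {"111", "11", "3", "2", "a", "b"}
-- Compare numerically when both values are numbers. Otherwise put
-- numbers before non-numbers and compare non-numbers as plain strings,
-- so the function always returns a consistent answer.
local function myCompare(a, b)
  local a_number = tonumber(a)
  local b_number = tonumber(b)
  if a_number and b_number then
    return a_number < b_number
  elseif a_number or b_number then
    return a_number ~= nil
  end
  return a < b
end
table.sort(t, myCompare)
for i, v in ipairs(t) do
  print(v)
end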
which would give you the output
2
3
11
111
a
b
Of course this is just a quick and simple example. A nicer implementation is up to you.
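The same distinction between text order and numeric order exists in the sort command discussed elsewhere on this page. A minimal shell sketch, using the numeric sample values from the Lua table above (the variable names are illustrative):

```shell
# Plain sort compares lines as text, so "11" and "111" precede "2".
as_text=$(printf '111\n11\n3\n2\n' | LC_ALL=C sort)

# -n compares by numeric value instead.
as_numbers=$(printf '111\n11\n3\n2\n' | LC_ALL=C sort -n)

printf '%s\n' "$as_text"
printf '%s\n' "$as_numbers"
```

The first list comes out as 11, 111, 2, 3 and the second as 2, 3, 11, 111, which mirrors the default and custom comparisons in the Lua answer.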
UNIX sort ignores whitespaces
Solved by:
export LC_ALL=C
From the sort documentation:
WARNING: The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
(works for ASCII at least, no idea for UTF8)
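For UTF-8 input, LC_ALL=C still compares raw bytes, which for UTF-8 amounts to ordering by code point rather than by any language's alphabet. A small sketch, where the three sample lines are made up for illustration and é is written as its two UTF-8 bytes in octal:

```shell
# Lines: a, é (bytes 0xC3 0xA9), z
utf8_sorted=$(printf 'a\n\303\251\nz\n' | LC_ALL=C sort)
printf '%s\n' "$utf8_sorted"
```

Here é lands after z because its first byte, 0xC3, is larger than 0x7A. Setting the variable on a single command, as in this sketch, also avoids changing the locale for everything else in the shell the way export does.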