How to Implement a Natural Sort Algorithm in C++

How to implement a natural sort algorithm in C++?

I asked this exact question (although in Java) and got pointed to http://www.davekoelle.com/alphanum.html which has an algorithm and implementations of it in many languages.

Natural Sort of Directory Filenames in C++

glibc has a function that does exactly what you want. Unfortunately it is C, not C++, but if you can live with that, here is the simplest possible solution out of the box, without reimplementing anything or reinventing the wheel. Incidentally, this is essentially the ordering that ls -lv uses. The most important part is the versionsort function, which does the natural sort for you; it is used here as the comparison function for scandir.
The simple example below prints all files and directories in the current directory, sorted as you wish.

#define _GNU_SOURCE   /* exposes versionsort() in <dirent.h> */
#include <dirent.h>
#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    struct dirent **namelist;
    int n, i;

    /* NULL filter: keep every entry; versionsort orders them naturally */
    n = scandir(".", &namelist, NULL, versionsort);
    if (n < 0)
        perror("scandir");
    else
    {
        for (i = 0; i < n; ++i)
        {
            printf("%s\n", namelist[i]->d_name);
            free(namelist[i]);   /* each entry is malloc'ed by scandir */
        }
        free(namelist);
    }
    return 0;
}
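
If the names are already in memory and you would rather stay in C++, one option is to call strverscmp (the glibc comparison that versionsort is documented to apply to d_name) from a std::sort comparator. The following is only a minimal sketch, assuming a glibc-based system where strverscmp is available; the helper name version_sort is made up for illustration.

```cpp
#include <string.h>   // strverscmp: GNU extension, not standard C or C++

#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: sorts names in place using glibc's version ordering.
void version_sort(std::vector<std::string>& names)
{
    std::sort(names.begin(), names.end(),
              [](const std::string& a, const std::string& b) {
                  // strverscmp returns <0, 0 or >0, like strcmp;
                  // std::sort needs a strict "less than".
                  return strverscmp(a.c_str(), b.c_str()) < 0;
              });
}
```

Called on a vector holding "file10.txt", "file2.txt" and "file1.txt", this should leave them in the order file1.txt, file2.txt, file10.txt. It is no more portable than the C version above.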

Natural sort in C - array of strings, containing numbers and letters

I assume you already know the C standard library qsort() function:

void qsort(void *base,
           size_t nel,
           size_t width,
           int (*compar)(const void *, const void *));

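That last parameter is a function pointer, which means you can pass any comparison function with that signature. You could even use strcmp() (through a small wrapper, since qsort() passes pointers to the array elements), but that would give you ASCIIbetical order, and you specifically want a natural sort.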
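
To make the ASCIIbetical result concrete, here is a small C++ sketch of sorting an array of C strings with qsort and strcmp. The names by_strcmp and sort_ascii are invented for this example; the only real API calls are std::qsort and std::strcmp.

```cpp
#include <cstdlib>   // std::qsort, std::size_t
#include <cstring>   // std::strcmp

// qsort hands the comparator pointers to the array elements, so for an
// array of const char* each argument is really a const char* const*.
int by_strcmp(const void* a, const void* b)
{
    const char* lhs = *static_cast<const char* const*>(a);
    const char* rhs = *static_cast<const char* const*>(b);
    return std::strcmp(lhs, rhs);
}

// Hypothetical helper: sort an array of C strings in plain byte order.
void sort_ascii(const char** names, std::size_t count)
{
    std::qsort(names, count, sizeof names[0], by_strcmp);
}
```

Sorting "file2", "file10", "file1" this way gives file1, file10, file2: the comparison stops at the first differing character, and '1' sorts before '2' regardless of the digits that follow.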

In that case, you could write one pretty easily:

#include <ctype.h>

int natural(const char *a, const char *b)
{
    if(isalpha(*a) && isalpha(*b))
    {
        // compare two letters
    }
    else
    {
        if(isalpha(*a))
        {
            // compare a letter to a digit (or other non-letter)
        }
        else if(isalpha(*b))
        {
            // compare a digit/non-letter to a letter
        }
        else
        {
            // compare two digits/non-letters
        }
    }
}

Some of the else branches could be tidied up by returning early, but that's the basic structure. Check ctype.h for functions like isalpha() (whether a character is a letter), isdigit(), isspace(), and more.
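
One possible way to fill in that structure is sketched below in C++. It is an illustration under stated assumptions, not a reference implementation: it branches on isdigit() rather than isalpha(), treats the input as plain ASCII, is case-sensitive, and ignores leading zeros (so "a007" and "a7" compare equal). The names natural_compare and natural_qsort_cmp are made up; the second one exists because qsort passes pointers to the array elements rather than the strings themselves.

```cpp
#include <cctype>    // std::isdigit
#include <cstdlib>   // std::qsort

// Hypothetical comparator: returns <0, 0 or >0 like strcmp, but compares
// runs of digits by numeric value. Assumes plain ASCII text.
int natural_compare(const char* a, const char* b)
{
    while (*a && *b) {
        const unsigned char ca = static_cast<unsigned char>(*a);
        const unsigned char cb = static_cast<unsigned char>(*b);
        if (std::isdigit(ca) && std::isdigit(cb)) {
            // Skip leading zeros so "007" and "7" have the same value.
            while (*a == '0') ++a;
            while (*b == '0') ++b;
            // Find the end of both digit runs.
            const char* ea = a;
            const char* eb = b;
            while (std::isdigit(static_cast<unsigned char>(*ea))) ++ea;
            while (std::isdigit(static_cast<unsigned char>(*eb))) ++eb;
            // Without leading zeros, a longer run is a larger number.
            if (ea - a != eb - b) return (ea - a < eb - b) ? -1 : 1;
            // Same length: the first differing digit decides.
            for (; a != ea; ++a, ++b)
                if (*a != *b) return (*a < *b) ? -1 : 1;
        } else {
            // At least one non-digit: compare the two characters directly.
            if (ca != cb) return (ca < cb) ? -1 : 1;
            ++a;
            ++b;
        }
    }
    // Whichever string still has characters left sorts last.
    if (*a) return 1;
    if (*b) return -1;
    return 0;
}

// qsort adapter: each argument points at an element of a const char* array.
int natural_qsort_cmp(const void* a, const void* b)
{
    return natural_compare(*static_cast<const char* const*>(a),
                           *static_cast<const char* const*>(b));
}
```

With an array such as const char* names[] = {"st20ring", "st3ring", "st2ring"}, calling std::qsort(names, 3, sizeof names[0], natural_qsort_cmp) orders it as st2ring, st3ring, st20ring. Comparing digit runs by length and then digit by digit avoids converting to an integer type, so very long numbers cannot overflow.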

Natural Sort Order in C#

The easiest thing to do is just P/Invoke the built-in function in Windows, and use it as the comparison function in your IComparer:

[DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
private static extern int StrCmpLogicalW(string psz1, string psz2);

Michael Kaplan has some examples of how this function works here, and the changes that were made for Vista to make it work more intuitively. The advantage of this function is that it behaves the same as the version of Windows it runs on; however, this also means that its behaviour differs between versions of Windows, so you need to consider whether this is a problem for you.

So a complete implementation would be something like:

using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;
using System.Security;

[SuppressUnmanagedCodeSecurity]
internal static class SafeNativeMethods
{
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    public static extern int StrCmpLogicalW(string psz1, string psz2);
}

public sealed class NaturalStringComparer : IComparer<string>
{
    public int Compare(string a, string b)
    {
        return SafeNativeMethods.StrCmpLogicalW(a, b);
    }
}

public sealed class NaturalFileInfoNameComparer : IComparer<FileInfo>
{
    public int Compare(FileInfo a, FileInfo b)
    {
        return SafeNativeMethods.StrCmpLogicalW(a.Name, b.Name);
    }
}

Using native Windows natural order sorting in a custom program

C++:

#include <windows.h>
#include <shlwapi.h>
#pragma comment(lib, "shlwapi.lib")

#include <algorithm>
#include <vector>
#include <string>
#include <iostream>

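// std::sort requires a strict "less than" (false for equal elements).
// StrCmpLogicalW returns a negative value only when lhs sorts before rhs,
// so compare against 0 rather than 1.
bool str_cmp_logical(std::wstring const &lhs, std::wstring const &rhs)
{
    return StrCmpLogicalW(lhs.c_str(), rhs.c_str()) < 0;
}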

int main()
{
    std::vector<std::wstring> foo{
        L"20string", L"2string", L"3string", L"st20ring", L"st2ring",
        L"st3ring", L"string2", L"string20", L"string3"
    };

    for (auto const &f : foo)
        std::wcout << f << L' ';
    std::wcout.put(L'\n');

    std::sort(foo.begin(), foo.end(), str_cmp_logical);

    for (auto const &f : foo)
        std::wcout << f << L' ';
    std::wcout.put(L'\n');
}

Output:

20string 2string 3string st20ring st2ring st3ring string2 string20 string3
2string 3string 20string st2ring st3ring st20ring string2 string3 string20

Trying to compile the code with MinGW failed, because the version of <shlwapi.h> that comes with its w32api package doesn't provide a prototype for StrCmpLogicalW(). When I declared it myself, I got:

C:\MinGW\bin>"g++.exe" -lshlwapi C:\Users\sword\source\repos\Codefun\main.cpp
C:\Users\sword\AppData\Local\Temp\ccMrmLbD.o:main.cpp:(.text+0x23): undefined reference to `StrCmpLogicalW(wchar_t const*, wchar_t const*)'
collect2.exe: error: ld returned 1 exit status

So the libraries shipped with MinGW don't seem to be aware of StrCmpLogicalW().

It should work with Mingw-w64, though.

How to sort filenames with possibly unpadded numbers in C++?

There are already similar questions: I know of "Sort on a string that may contain a number" and "How to implement a natural sort algorithm in C". You can look there for more inspiration and help.

The answers to both questions suggest http://www.davekoelle.com/alphanum.html, which is basically what Pascal Cuoq suggested.

You can also look at the Coding Horror article, where some other algorithms are linked: "Sorting for Humans: Natural Sort Order".

Sorting List<String> in C#

How about:

list.Sort((x, y) =>
{
    int ix, iy;
    return int.TryParse(x, out ix) && int.TryParse(y, out iy)
        ? ix.CompareTo(iy) : string.Compare(x, y);
});
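
One caveat with a comparator of this shape: mixing numeric and string comparison can produce cycles when the list contains both kinds of values. For example, "2" sorts before "10" numerically, "10" before "1a" as strings, and "1a" before "2" as strings. A hedged C++ sketch of a variant that avoids this by putting all-digit strings first (ordered by value) and everything else after them (ordered as plain strings) follows. The names is_number and numbers_first_less are invented here, and unlike int.TryParse the sketch only treats unsigned runs of ASCII digits as numbers.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// True if s is non-empty and consists only of ASCII digits.
bool is_number(const std::string& s)
{
    return !s.empty() &&
           std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Drops leading zeros; a string of only zeros becomes empty.
std::string strip_zeros(const std::string& s)
{
    return s.substr(std::min(s.find_first_not_of('0'), s.size()));
}

// Hypothetical comparator: numeric strings first (by value), then the rest.
bool numbers_first_less(const std::string& x, const std::string& y)
{
    const bool nx = is_number(x);
    const bool ny = is_number(y);
    if (nx != ny) return nx;   // a number sorts before a non-number
    if (!nx) return x < y;     // both non-numeric: plain string order
    // Both numeric: compare by value without converting to an integer,
    // so long digit strings cannot overflow.
    const std::string a = strip_zeros(x);
    const std::string b = strip_zeros(y);
    if (a.size() != b.size()) return a.size() < b.size();
    return a < b;
}
```

Used as std::sort(v.begin(), v.end(), numbers_first_less) on a vector holding "1a", "10", "2", "b", "007", this should give 2, 007, 10, 1a, b.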

Best algorithm to sort the given values

The Google keyword you're looking for is "natural sort".


