std::list<>::sort() - Why the Sudden Switch to Top-Down Strategy

`std::list<>::sort()` - why the sudden switch to top-down strategy?

Note: this answer has been updated to address all of the issues raised in the comments below and after the question. It makes the same change from an array of lists to an array of iterators, while retaining the faster bottom up merge sort algorithm and eliminating the small chance of stack overflow due to recursion in the top down merge sort algorithm.

The reason I didn't originally consider iterators was due to the VS2015 change to top down, leading me to believe there was some issue with trying to change the existing bottom up algorithm to use iterators, requiring a switch to the slower top down algorithm. It was only when I tried to analyze the switch to iterators myself that I realized there was a solution for bottom up algorithm.

In @sbi's comment, he asked the author of the top down approach, Stephan T. Lavavej, why the change was made. Stephan's response was "to avoid memory allocation and default constructing allocators". VS2015 introduced support for non-default-constructible and stateful allocators, which presents an issue when using the prior version's array of lists: each instance of a list allocates a dummy node, and a change would be needed to handle an allocator that cannot be default constructed.

Lavavej's solution was to switch to using iterators to keep track of run boundaries within the original list instead of an internal array of lists. The merge logic was changed to take 3 iterator parameters: the 1st is an iterator to the start of the left run, the 2nd is an iterator to the end of the left run (which is also the start of the right run), and the 3rd is an iterator to the end of the right run. The merge process uses std::list::splice to move nodes within the original list during merge operations. This has the added benefit of being exception safe: if a caller's compare function throws an exception, the list will be re-ordered, but no loss of data will occur (assuming splice can't fail). With the prior scheme, some (or most) of the data would be in the internal array of lists if an exception occurred, and data would be lost from the original list.
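To make the three-iterator scheme concrete, here is a minimal sketch of a top down merge sort written against the public std::list interface. It is not the VS2015 source; the names MergeRuns and SortTopDown are made up for illustration, and it compares with operator< instead of taking a predicate. Note the call to std::next, which walks half of the sub-list on every level of recursion.

```cpp
#include <cstddef>
#include <iterator>
#include <list>

// Merge the sorted, non-empty runs [first, mid) and [mid, last) in place.
// Returns an iterator to the first node of the merged run.
// (Hypothetical helper, not a library function.)
template <typename T>
typename std::list<T>::iterator MergeRuns(std::list<T>& ll,
    typename std::list<T>::iterator first,
    typename std::list<T>::iterator mid,
    typename std::list<T>::iterator last)
{
    typename std::list<T>::iterator newFirst = (*mid < *first) ? mid : first;
    while (true) {
        if (*mid < *first) {
            ll.splice(first, ll, mid++);    // move node at mid in front of first
            if (mid == last)
                return newFirst;            // right run exhausted
        } else {
            if (++first == mid)
                return newFirst;            // left run exhausted
        }
    }
}

// Sort the size nodes in [first, last); returns an iterator to the new first node.
// size must equal the distance from first to last.
template <typename T>
typename std::list<T>::iterator SortTopDown(std::list<T>& ll,
    typename std::list<T>::iterator first,
    typename std::list<T>::iterator last,
    std::size_t size)
{
    if (size < 2)
        return first;
    // Finding the middle requires walking size / 2 nodes.
    typename std::list<T>::iterator mid = std::next(first, size / 2);
    first = SortTopDown(ll, first, mid, size / 2);
    mid = SortTopDown(ll, mid, last, size - size / 2);
    return MergeRuns(ll, first, mid, last);
}
```

A call such as SortTopDown(ll, ll.begin(), ll.end(), ll.size()) sorts the whole list. Because nodes are only ever spliced within ll, an exception thrown by the comparison leaves every element in the list.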

However, the switch to top down merge sort was not needed. Initially, assuming there was some reason unknown to me for the VS2015 switch to top down, I focused on using the internal interfaces in the same manner as std::list::splice. I later decided to investigate switching bottom up to use an array of iterators. I realized the order of runs stored in the internal array was newest (array[0] = rightmost) to oldest (array[last] = leftmost), and that it could use the same iterator based merge logic as VS2015's top down approach.

For bottom up merge sort, array[i] is an iterator to the start of a sorted sub-list with 2^i nodes, or it is empty (using std::list::end to indicate empty). The end of each sorted sub-list will be the start of a sorted sub-list in the next prior non-empty entry in the array, or if at the start of the array, in a local iterator (it points to end of newest run). Similar to the top down approach, the array of iterators is only used to keep track of sorted run boundaries within the original linked list, while the merge process uses std::list::splice to move nodes within the original linked list.

If a linked list is large and the nodes are scattered, there will be a lot of cache misses. In the timings below, bottom up takes about 30% less time than top down (equivalent to stating that top down takes about 43% longer than bottom up). Then again, if there's enough memory, it would usually be faster to move the list to an array or vector, sort the array or vector, then create a new list from the sorted array or vector.
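The vector round trip mentioned above could look something like the following sketch. The function name SortListViaVector is hypothetical, and as a variation on "create a new list" it moves the sorted values back into the existing nodes rather than allocating new ones. It assumes T is move constructible and move assignable and uses operator<.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Sort a list by moving its elements into a contiguous vector,
// sorting the vector, and moving the elements back.
template <typename T>
void SortListViaVector(std::list<T>& ll)
{
    // Move the elements out of the list nodes into contiguous storage.
    std::vector<T> v(std::make_move_iterator(ll.begin()),
                     std::make_move_iterator(ll.end()));
    // stable_sort keeps equal elements in their original order,
    // matching the guarantee of std::list::sort.
    std::stable_sort(v.begin(), v.end());
    // Move the sorted values back into the existing nodes (no reallocation).
    std::move(v.begin(), v.end(), ll.begin());
}
```

One design consequence: iterators into the list stay valid, but they now refer to different values, whereas std::list::sort keeps each iterator attached to its original element.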

Example C++ code for the bottom up merge sort using an array of iterators:

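#include <cstddef>
#include <list>

#define ASZ 32

// Merge() is defined after SortList(), so declare it first
template <typename T>
typename std::list<T>::iterator Merge(std::list<T> &ll,
                             typename std::list<T>::iterator li,
                             typename std::list<T>::iterator mi,
                             typename std::list<T>::iterator ei);

template <typename T>
void SortList(std::list<T> &ll)
{
    if (ll.size() < 2)                  // return if nothing to do
        return;
    typename std::list<T>::iterator ai[ASZ];  // array of iterators
    typename std::list<T>::iterator mi;       // middle iterator (end lft, bgn rgt)
    typename std::list<T>::iterator ei;       // end    iterator
    size_t i;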
    for (i = 0; i < ASZ; i++)           // "clear" array
        ai[i] = ll.end();
    // merge nodes into array
    for (ei = ll.begin(); ei != ll.end();) {
        mi = ei++;
        for (i = 0; (i < ASZ) && ai[i] != ll.end(); i++) {
            mi = Merge(ll, ai[i], mi, ei);
            ai[i] = ll.end();
        }
        if(i == ASZ)
            i--;
        ai[i] = mi;
    }
    // merge array into single list
    ei = ll.end();                              
    for(i = 0; (i < ASZ) && ai[i] == ei; i++);
    mi = ai[i++];
    while(1){
        for( ; (i < ASZ) && ai[i] == ei; i++);
        if (i == ASZ)
            break;
        mi = Merge(ll, ai[i++], mi, ei);
    }
}

template <typename T>
typename std::list<T>::iterator Merge(std::list<T> &ll,
                             typename std::list<T>::iterator li,
                             typename std::list<T>::iterator mi,
                             typename std::list<T>::iterator ei)
{
    typename std::list<T>::iterator ni;     // iterator to first node of merged run
    ni = (*mi < *li) ? mi : li;
    while(1){
        if(*mi < *li){
            ll.splice(li, ll, mi++);
            if(mi == ei)
                return ni;
        } else {
            if(++li == mi)
                return ni;
        }
    }
}

Example replacement code for VS2019's std::list::sort() (the merge logic was made into a separate internal function, since it's now used in two places).

private:
    template <class _Pr2>
    iterator _Merge(_Pr2 _Pred, iterator _First, iterator _Mid, iterator _Last){
        iterator _Newfirst = _First;
        for (bool _Initial_loop = true;;
            _Initial_loop       = false) { // [_First, _Mid) and [_Mid, _Last) are sorted and non-empty
            if (_DEBUG_LT_PRED(_Pred, *_Mid, *_First)) { // consume _Mid
                if (_Initial_loop) {
                    _Newfirst = _Mid; // update return value
                }
                splice(_First, *this, _Mid++);
                if (_Mid == _Last) {
                    return _Newfirst; // exhausted [_Mid, _Last); done
                }
            }
            else { // consume _First
                ++_First;
                if (_First == _Mid) {
                    return _Newfirst; // exhausted [_First, _Mid); done
                }
            }
        }
    }

    template <class _Pr2>
    void _Sort(iterator _First, iterator _Last, _Pr2 _Pred,
        size_type _Size) { // order [_First, _Last), using _Pred
                           // _Size must be distance from _First to _Last
        if (_Size < 2) {
            return;        // nothing to do
        }
        const size_t _ASZ = 32;         // array size
        iterator _Ai[_ASZ];             // array of   iterators to runs
        iterator _Mi;                   // middle     iterator
        iterator _Li;                   // last (end) iterator
        size_t _I;                      // index to _Ai
        for (_I = 0; _I < _ASZ; _I++)   // "empty" array
            _Ai[_I] = _Last;            //   _Ai[] == _Last => empty entry
        // merge nodes into array
        for (_Li = _First; _Li != _Last;) {
            _Mi = _Li++;
            for (_I = 0; (_I < _ASZ) && _Ai[_I] != _Last; _I++) {
                _Mi = _Merge(_Pass_fn(_Pred), _Ai[_I], _Mi, _Li);
                _Ai[_I] = _Last;
            }
            if (_I == _ASZ)
                _I--;
            _Ai[_I] = _Mi;
        }
        // merge array runs into single run
        for (_I = 0; _I < _ASZ && _Ai[_I] == _Last; _I++);
        _Mi = _Ai[_I++];
        while (1) {
            for (; _I < _ASZ && _Ai[_I] == _Last; _I++);
            if (_I == _ASZ)
                break;
            _Mi = _Merge(_Pass_fn(_Pred), _Ai[_I++], _Mi, _Last);
        }
    }

The remainder of this answer is historical and is kept only so that the older comments still make sense; it is otherwise no longer relevant.

I was able to reproduce the issue (old sort fails to compile, new one works) based on a demo from @IgorTandetnik:

#include <iostream>
#include <list>
#include <memory>

template <typename T>
class MyAlloc : public std::allocator<T> {
public:
    MyAlloc(T) {}  // suppress default constructor
    
    template <typename U>
    MyAlloc(const MyAlloc<U>& other) : std::allocator<T>(other) {}
    
    template< class U > struct rebind { typedef MyAlloc<U> other; };
};

int main()
{
    std::list<int, MyAlloc<int>> l(MyAlloc<int>(0));
    l.push_back(3);
    l.push_back(0);
    l.push_back(2);
    l.push_back(1);
    l.sort();
    return 0;
}

I noticed this change back in July, 2016 and emailed P.J. Plauger about this change on August 1, 2016. A snippet of his reply:

Interestingly enough, our change log doesn't reflect this change. That
probably means it was "suggested" by one of our larger customers and
got by me on the code review. All I know now is that the change came
in around the autumn of 2015. When I reviewed the code, the first
thing that struck me was the line:
    iterator _Mid = _STD next(_First, _Size / 2);
which, of course, can take a very long time for a large list.
The code looks a bit more elegant than what I wrote in early 1995(!),
but definitely has worse time complexity. That version was modeled
after the approach by Stepanov, Lee, and Musser in the original STL.
They are seldom found to be wrong in their choice of algorithms.
I'm now reverting to our latest known good version of the original code.

I don't know if P.J. Plauger's reversion to the original code dealt with the new allocator issue, or if or how Microsoft interacts with Dinkumware.

For a comparison of the top down versus bottom up methods, I created a linked list with 4 million elements, each consisting of one 64 bit unsigned integer, assuming I would end up with a doubly linked list of nearly sequentially ordered nodes (even though they would be dynamically allocated), filled them with random numbers, then sorted them. The nodes don't move, only the linkage is changed, but now traversing the list accesses the nodes in random order. I then filled those randomly ordered nodes with another set of random numbers and sorted them again. I compared the 2015 top down approach with the prior bottom up approach modified to match the other changes made for 2015 (sort() now calls sort() with a predicate compare function, rather than having two separate functions). These are the results. Update: I added a node pointer based version and also noted the time for simply creating a vector from the list, sorting the vector, and copying back.

sequential nodes: 2015 version 1.6 seconds, prior version 1.5  seconds
random nodes:     2015 version 4.0 seconds, prior version 2.8  seconds
random nodes:                  node pointer based version 2.6  seconds
random nodes:    create vector from list, sort, copy back 1.25 seconds

For sequential nodes, the prior version is only a bit faster. For random nodes, the prior version takes 30% less time than the 2015 version, the node pointer version takes 35% less, and creating a vector from the list, sorting the vector, then copying back takes 69% less.
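The two-phase measurement described above can be approximated with a small harness like the one below. This is only a sketch and not the code used for the timings; FillRandom, TimeSort, RunBenchmark, BenchResult, and the fixed seed are all assumptions made for illustration. Like the original test, it assumes a freshly built list has its nodes laid out nearly sequentially in memory, which depends on the allocator.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <list>
#include <random>

// Overwrite every node's value with a fresh pseudo-random 64 bit number.
inline void FillRandom(std::list<std::uint64_t>& ll, std::mt19937_64& rng)
{
    for (std::uint64_t& v : ll)
        v = rng();
}

// Time one call to sorter(ll) and return the elapsed time in seconds.
template <typename Sorter>
double TimeSort(std::list<std::uint64_t>& ll, Sorter sorter)
{
    const auto t0 = std::chrono::steady_clock::now();
    sorter(ll);
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

struct BenchResult {
    double sequentialSeconds;   // nodes still in allocation order
    double scatteredSeconds;    // list order no longer matches memory order
};

template <typename Sorter>
BenchResult RunBenchmark(std::size_t count, Sorter sorter)
{
    std::mt19937_64 rng(12345);             // arbitrary fixed seed
    std::list<std::uint64_t> ll(count);     // count value-initialized nodes
    BenchResult r;
    FillRandom(ll, rng);
    r.sequentialSeconds = TimeSort(ll, sorter);
    // The first sort relinked the nodes, so traversal order is now
    // effectively random with respect to memory addresses.
    FillRandom(ll, rng);
    r.scatteredSeconds = TimeSort(ll, sorter);
    return r;
}
```

For example, RunBenchmark(4000000, [](std::list<std::uint64_t>& l) { l.sort(); }) times whichever std::list::sort the standard library provides; passing a different callable allows another implementation to be compared on the same data.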

Below is the first replacement code for std::list::sort() that I used to compare the prior bottom up with small array (_Binlist[]) method versus VS2015's top down approach. I wanted the comparison to be fair, so I modified a copy of <list>.

    void sort()
        {   // order sequence, using operator<
        sort(less<>());
        }

    template<class _Pr2>
        void sort(_Pr2 _Pred)
        {   // order sequence, using _Pred
        if (2 > this->_Mysize())
            return;
        const size_t _MAXBINS = 25;
        _Myt _Templist, _Binlist[_MAXBINS];
        while (!empty())
            {
            // _Templist = next element
            _Templist._Splice_same(_Templist.begin(), *this, begin(),
                ++begin(), 1);
            // merge with array of ever larger bins
            size_t _Bin;
            for (_Bin = 0; _Bin < _MAXBINS && !_Binlist[_Bin].empty();
                ++_Bin)
                _Templist.merge(_Binlist[_Bin], _Pred);
            // don't go past end of array
            if (_Bin == _MAXBINS)
                _Bin--;
            // update bin with merged list, empty _Templist
            _Binlist[_Bin].swap(_Templist);
            }
            // merge bins back into caller's list
            for (size_t _Bin = 0; _Bin < _MAXBINS; _Bin++)
                if(!_Binlist[_Bin].empty())
                    this->merge(_Binlist[_Bin], _Pred);
        }

I made some minor changes. The original code kept track of the actual maximum bin in a variable named _Maxbin, but the overhead in the final merge is small enough that I removed the code associated with _Maxbin. During the array build, the original code's inner loop merged into a _Binlist[] element, followed by a swap into _Templist, which seemed pointless. I changed the inner loop to just merge into _Templist, only swapping once an empty _Binlist[] element is found.

Below is a node pointer based replacement for std::list::sort() I used for yet another comparison. This eliminates allocation related issues. If a compare exception is possible and occurred, all the nodes in the array and temp list (pNode) would have to be appended back to the original list, or possibly a compare exception could be treated as a less than compare.

    void sort()
        {   // order sequence, using operator<
        sort(less<>());
        }

    template<class _Pr2>
        void sort(_Pr2 _Pred)
        {   // order sequence, using _Pred
        const size_t _NUMBINS = 25;
        _Nodeptr aList[_NUMBINS];           // array of lists
        _Nodeptr pNode;
        _Nodeptr pNext;
        _Nodeptr pPrev;
        if (this->size() < 2)               // return if nothing to do
            return;
        this->_Myhead()->_Prev->_Next = 0;  // set last node ->_Next = 0
        pNode = this->_Myhead()->_Next;     // set ptr to start of list
        size_t i;
        for (i = 0; i < _NUMBINS; i++)      // zero array
            aList[i] = 0;
        while (pNode != 0)                  // merge nodes into array
            {
            pNext = pNode->_Next;
            pNode->_Next = 0;
            for (i = 0; (i < _NUMBINS) && (aList[i] != 0); i++)
                {
                pNode = _MergeN(_Pred, aList[i], pNode);
                aList[i] = 0;
                }
            if (i == _NUMBINS)
                i--;
            aList[i] = pNode;
            pNode = pNext;
            }
        pNode = 0;                          // merge array into one list
        for (i = 0; i < _NUMBINS; i++)
            pNode = _MergeN(_Pred, aList[i], pNode);
        this->_Myhead()->_Next = pNode;     // update sentinel node links
        pPrev = this->_Myhead();            //  and _Prev pointers
        while (pNode)
            {
            pNode->_Prev = pPrev;
            pPrev = pNode;
            pNode = pNode->_Next;
            }
        pPrev->_Next = this->_Myhead();
        this->_Myhead()->_Prev = pPrev;
        }

    template<class _Pr2>
        _Nodeptr _MergeN(_Pr2 &_Pred, _Nodeptr pSrc1, _Nodeptr pSrc2)
        {
        _Nodeptr pDst = 0;          // destination head ptr
        _Nodeptr *ppDst = &pDst;    // ptr to head or prev->_Next
        if (pSrc1 == 0)
            return pSrc2;
        if (pSrc2 == 0)
            return pSrc1;
        while (1)
            {
            if (_DEBUG_LT_PRED(_Pred, pSrc2->_Myval, pSrc1->_Myval))
                {
                *ppDst = pSrc2;
                pSrc2 = *(ppDst = &pSrc2->_Next);
                if (pSrc2 == 0)
                    {
                    *ppDst = pSrc1;
                    break;
                    }
                }
            else
                {
                *ppDst = pSrc1;
                pSrc1 = *(ppDst = &pSrc1->_Next);
                if (pSrc1 == 0)
                    {
                    *ppDst = pSrc2;
                    break;
                    }
                }
            }
        return pDst;
        }

std::list sort algorithm runtime

If every element is within k positions of its proper place, then insertion sort will take less than kN comparisons and swaps/moves. It's also very easy to implement.

Compare this to the N*log(N) operations required by merge sort or quick sort to see if that will work better for you.
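As a rough illustration, an insertion sort for std::list can be written with splice so that no elements are copied. The name InsertionSortList is hypothetical and the sketch compares with operator<. For an element that is at most k positions from its final place, the backward scan stops after at most k + 1 comparisons.

```cpp
#include <iterator>
#include <list>

// Insertion sort that relinks nodes instead of copying values.
template <typename T>
void InsertionSortList(std::list<T>& ll)
{
    if (ll.size() < 2)
        return;
    auto cur = std::next(ll.begin());
    while (cur != ll.end()) {
        auto next = std::next(cur);     // remember successor before moving cur
        auto pos = cur;
        // Walk backwards while the preceding element is greater than *cur.
        while (pos != ll.begin() && *cur < *std::prev(pos))
            --pos;
        if (pos != cur)
            ll.splice(pos, ll, cur);    // relink cur in front of pos
        cur = next;
    }
}
```

Elements are only moved past strictly greater ones, so equal elements keep their relative order.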

Searching Mergesort algorithm using Java's standard LinkedList

The Collections.sort() method sorts the LinkedList. It does so by copying the contents of LinkedList to an array, sorting the array, and copying the contents of array back to LinkedList. You can implement the same functionalities as the Collections.sort() method and use the merge sort algorithm to sort the array.

Below I have detailed my steps and correct code that sorts Java's standard LinkedList using Merge Sort algorithm.

Steps:

1. Create a LinkedList 'originalList'
2. Convert 'originalList' to Integer array 'arr'
3. Convert 'arr' to int array 'intArray'
4. Sort 'intArray' using the merge sort algorithm
5. Convert sorted 'intArray' back to Integer array 'arr' (use a for loop to change the elements in 'arr')
6. Create a LinkedList 'newList'
7. Add elements of 'arr' to 'newList' (use Arrays.asList(arr))

     import java.util.Arrays;
     import java.util.Iterator;
     import java.util.LinkedList;

     public class CollectionsSort {
         public static void main(String[] args) {
               LinkedList<Integer> originalList = new LinkedList<>();
               originalList.add(3);
               originalList.add(2);
               originalList.add(1);

               Integer[] arr = originalList.toArray(new Integer[originalList.size()]);
               int[] intArray = new int[arr.length];
               for (int i = 0; i < intArray.length; i++) {
                   intArray[i] = arr[i].intValue();
               }
               mergesort(intArray);
               for (int i = 0; i < arr.length; i++) {
                   arr[i] = Integer.valueOf(intArray[i]);
               }
               LinkedList<Integer> newList = new LinkedList<>(Arrays.asList(arr));
               Iterator<Integer> it = newList.iterator();
               while(it.hasNext()) {
                  System.out.print(it.next() + " ");
               }
         }

         public static int[] mergesort(int[] arr) {
               int low = 0;
               int high = arr.length-1;
               mergesort(arr, low, high);
               return arr;
         }

         public static void mergesort(int[] arr, int low, int high) {
               if (low >= high) {   // zero or one element: nothing to sort
                   return;
               } else {
                   int middle = low + (high-low)/2;
                   mergesort(arr, low, middle);
                   mergesort(arr, middle+1, high);
                   merge(arr, low, middle, high);
               }
         }

         public static void merge(int[] arr, int low, int mid, int high) {
               int[] left = new int[mid-low+2];
               for (int i = low; i <= mid; i++) {
                   left[i-low] = arr[i];
               }
               left[mid-low+1] = Integer.MAX_VALUE;
               int[] right = new int[high-mid+1];
               for(int i = mid+1; i <= high; i++) {
                   right[i-mid-1] = arr[i];
               }
               right[high-mid] = Integer.MAX_VALUE;
               int i = 0, j = 0;
               for(int k = low; k <= high; k++) {
                   if(left[i] <= right[j]) {
                      arr[k] = left[i];
                      i++;
                   } else {
                      arr[k] = right[j];
                      j++;
                   }
               }
         }
     }

What makes the gcc std::list sort implementation so fast?

I've been taking a look at the interesting libstdc++ (GCC's C++ standard library) implementation of list::sort (source code), and it doesn't seem to implement a traditional merge sort algorithm (at least not one I've ever seen before).

Basically what it does is:

1. Creates a series of buckets (64 total).
2. Removes the first element of the list to sort and merges it with the first (i=0th) bucket.
3. If, before the merge, the ith bucket is not empty, merge the ith bucket with the i+1th bucket.
4. Repeat step 3 until we merge with an empty bucket.
5. Repeat step 2 and 3 until the list to sort is empty.
6. Merge all the remaining non-empty buckets together starting from smallest to largest.

Small note: merging a bucket X with a bucket Y will remove all the elements from bucket X and add them to bucket Y while keeping everything sorted. Also note that the number of elements within a bucket is either 0 or 2^i.
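The steps above can be sketched with nothing but std::list::splice, merge, and swap. This is a simplified illustration in the spirit of that algorithm, not the libstdc++ source; BucketMergeSort, carry, and kBuckets are names invented here, and it compares with operator<.

```cpp
#include <cstddef>
#include <list>

template <typename T>
void BucketMergeSort(std::list<T>& ll)
{
    const std::size_t kBuckets = 64;    // bucket i holds 0 or 2^i elements
    std::list<T> carry;                 // run currently being cascaded upward
    std::list<T> buckets[kBuckets];

    while (!ll.empty()) {
        // Detach the first remaining element into carry (a run of length 1).
        carry.splice(carry.begin(), ll, ll.begin());
        // Cascade: while bucket i is occupied, merge it with carry.
        std::size_t i = 0;
        while (i < kBuckets - 1 && !buckets[i].empty()) {
            buckets[i].merge(carry);    // older elements first keeps the sort stable
            carry.swap(buckets[i]);     // carry now holds 2^(i+1) elements, bucket i is empty
            ++i;
        }
        // Park the run; merging into an empty bucket simply moves it there.
        buckets[i].merge(carry);
    }
    // Fold the buckets together from smallest to largest.
    for (std::size_t i = 1; i < kBuckets; ++i)
        buckets[i].merge(buckets[i - 1]);
    ll.swap(buckets[kBuckets - 1]);
}
```

The midpoint of the list is never searched for, and the small low-numbered buckets are the ones touched most often. Note that the temporary lists here are default constructed, so this sketch does not work with an allocator that cannot be default constructed.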

Now why is this faster than a traditional merge sort? Well, I can't say for sure, but here are a few things that come to mind:

- It never traverses the list to find a mid-point, which also makes the algorithm more cache friendly.
- Because the earlier buckets are small and used more frequently, the calls to merge thrash the cache less frequently.
- The compiler is able to optimize this implementation better. Would need to compare the generated assembly to be sure about this.

I'm pretty sure the folks who implemented this algorithm tested it thoroughly so if you want a definitive answer you'll probably have to ask on the GCC mailing list.

Interruptible sort function

Have the comparison function check an atomic flag and throw an exception if the flag is set. The sorting thread should catch the exception and exit cleanly. The GUI thread then just needs to set the flag.
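A minimal sketch of that idea follows, using std::list::sort to stay with the container discussed above. SortCancelled and InterruptibleSort are hypothetical names, not library facilities. After a cancelled sort the order of the elements should be treated as unspecified.

```cpp
#include <atomic>
#include <list>
#include <stdexcept>

// Exception type used only to unwind out of the sort (hypothetical name).
struct SortCancelled : std::runtime_error {
    SortCancelled() : std::runtime_error("sort cancelled") {}
};

// Sorts ll unless cancel becomes true while sorting.
// Returns true if the sort completed, false if it was cancelled.
template <typename T>
bool InterruptibleSort(std::list<T>& ll, const std::atomic<bool>& cancel)
{
    try {
        ll.sort([&cancel](const T& a, const T& b) {
            if (cancel.load())          // checked on every comparison
                throw SortCancelled();
            return a < b;
        });
    } catch (const SortCancelled&) {
        return false;                   // exit cleanly; caller decides what to do
    }
    return true;
}
```

The sorting thread would call InterruptibleSort, and the GUI thread only needs to store true into the shared std::atomic<bool> to request cancellation.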

PHP parse/syntax errors; and how to solve them

What are the syntax errors?

PHP belongs to the C-style and imperative programming languages. It has rigid grammar rules, which it cannot recover from when encountering misplaced symbols or identifiers. It can't guess your coding intentions.

Most important tips

There are a few basic precautions you can always take:

- Use proper code indentation, or adopt any lofty coding style. Readability prevents irregularities.
- Use an IDE or editor for PHP with syntax highlighting, which also helps with parentheses/bracket balancing.
- Read the language reference and examples in the manual. Twice, to become somewhat proficient.

How to interpret parser errors

A typical syntax error message reads:

Parse error: syntax error, unexpected T_STRING, expecting ';' in file.php on line 217

This lists the possible location of a syntax mistake. See the mentioned file name and line number.

A moniker such as T_STRING explains which symbol the parser/tokenizer finally couldn't process. This isn't necessarily the cause of the syntax mistake, however.

It's important to look into previous code lines as well. Often syntax errors are just mishaps that happened earlier. The error line number is just where the parser conclusively gave up processing it all.

Solving syntax errors

There are many approaches to narrow down and fix syntax hiccups.

Open the mentioned source file. Look at the mentioned code line.
- For runaway strings and misplaced operators, this is usually where you find the culprit.
- Read the line left to right and imagine what each symbol does.
More regularly you need to look at preceding lines as well.
- In particular, ; semicolons are often missing at the end of the previous line/statement. (At least from the stylistic viewpoint.)
- If { code blocks } are incorrectly closed or nested, you may need to investigate even further up the source code. Use proper code indentation to simplify that.
Look at the syntax colorization!
- Strings and variables and constants should all have different colors.
- Operators +-*/. should be tinted distinct as well. Else they might be in the wrong context.
- If you see string colorization extend too far or too short, then you have found an unescaped or missing closing " or ' string marker.
- Having two same-colored punctuation characters next to each other can also mean trouble. Usually, operators are lone if it's not ++, --, or parentheses following an operator. Two strings/identifiers directly following each other are incorrect in most contexts.
Whitespace is your friend.
Follow any coding style.
Break up long lines temporarily.
- You can freely add newlines between operators or constants and strings. The parser will then concretize the line number for parsing errors. Instead of looking at the very lengthy code, you can isolate the missing or misplaced syntax symbol.
- Split up complex if statements into distinct or nested if conditions.
- Instead of lengthy math formulas or logic chains, use temporary variables to simplify the code. (More readable = fewer errors.)
- Add newlines between:
  1. The code you can easily identify as correct,
  2. The parts you're unsure about,
  3. And the lines which the parser complains about.
  Partitioning up long code blocks really helps to locate the origin of syntax errors.
Comment out offending code.
- If you can't isolate the problem source, start to comment out (and thus temporarily remove) blocks of code.
- As soon as you got rid of the parsing error, you have found the problem source. Look more closely there.
- Sometimes you want to temporarily remove complete function/method blocks. (In case of unmatched curly braces and wrongly indented code.)
- When you can't resolve the syntax issue, try to rewrite the commented out sections from scratch.
As a newcomer, avoid some of the confusing syntax constructs.
- The ternary ? : condition operator can compact code and is useful indeed. But it doesn't aid readability in all cases. Prefer plain if statements while unversed.
- PHP's alternative syntax (if:/elseif:/endif;) is common for templates, but arguably less easy to follow than normal { code } blocks.
The most prevalent newcomer mistakes are:
- Missing semicolons ; for terminating statements/lines.
- Mismatched string quotes for " or ' and unescaped quotes within.
- Forgotten operators, in particular for the string . concatenation.
- Unbalanced ( parentheses ). Count them in the reported line. Are there an equal number of them?
Don't forget that solving one syntax problem can uncover the next.
- If you make one issue go away, but another crops up in some code below, you're mostly on the right path.
- If after editing a new syntax error crops up in the same line, then your attempted change was possibly a failure. (Not always though.)
Restore a backup of previously working code, if you can't fix it.
- Adopt a source code versioning system. You can always view a diff of the broken and last working version. Which might be enlightening as to what the syntax problem is.
Invisible stray Unicode characters: In some cases, you need to use a hexeditor or different editor/viewer on your source. Some problems cannot be found just from looking at your code.
- Try grep --color -P -n "[\x80-\xFF]" file.php as the first measure to find non-ASCII symbols.
- In particular BOMs, zero-width spaces, or non-breaking spaces, and smart quotes regularly can find their way into the source code.
Take care of which type of linebreaks are saved in files.
- PHP just honors \n newlines, not \r carriage returns.
- Which is occasionally an issue for MacOS users (even on OS X for misconfigured editors).
- It often only surfaces as an issue when single-line // or # comments are used. Multiline /*...*/ comments seldom disturb the parser when linebreaks get ignored.
If your syntax error does not transmit over the web:
It happens that you have a syntax error on your machine. But posting the very same file online does not exhibit it anymore. Which can only mean one of two things:
- You are looking at the wrong file!
- Or your code contained invisible stray Unicode (see above).
  You can easily find out: Just copy your code back from the web form into your text editor.
Check your PHP version. Not all syntax constructs are available on every server.
- php -v for the command line interpreter
- <?php phpinfo(); for the one invoked through the webserver.
Those aren't necessarily the same. In particular when working with frameworks, you will want them to match up.
Don't use PHP's reserved keywords as identifiers for functions/methods, classes or constants.
Trial-and-error is your last resort.
