std::forward_list and std::forward_list::push_back

std::forward_list supports fast insertion and removal, but not traversal to the end. To implement push_back, you would first need to reach the end of the list, which is O(N). That is probably why it is not provided.

You could find the iterator to the last element by incrementing the result of .before_begin() N times:

auto before_end = slist.before_begin();
for (auto& _ : slist)
  ++ before_end;

and then use .insert_after or .emplace_after to insert the element:

slist.insert_after(before_end, 1234);
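Putting the two steps together, a minimal sketch of an O(N) append helper might look like the following. The name push_back_slow is an illustrative choice, not a standard function, and the sketch assumes the element type is cheap enough to pass by value.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <utility>

// Hypothetical helper (not part of the standard library): appends to a
// std::forward_list by walking to the last node. Each call costs O(N).
template <class T>
typename std::forward_list<T>::iterator
push_back_slow(std::forward_list<T>& slist, T value) {
    auto before_end = slist.before_begin();
    // Advance until the next position is end(). before_end then refers to the
    // last element, or still to before_begin() if the list is empty.
    while (std::next(before_end) != slist.end())
        ++before_end;
    return slist.insert_after(before_end, std::move(value));
}
```

Calling push_back_slow(slist, 1234) on a list holding 1, 2, 3 leaves it holding 1, 2, 3, 1234. Appending N elements this way is O(N^2) overall, so for repeated appends it is better to keep the returned iterator and pass it to insert_after directly.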

std::forward_list - how to insert element at the end

It's a deliberate design decision that forward_list should carry no overhead when compared to a singly-linked list. This is noted in the C++11 standard (23.3.4.1):

Note: It is intended that forward_list have zero space or time overhead relative to a hand-written C-style singly linked list. Features that would conflict with that goal have been omitted.

Maintaining a pointer to the end of the list would add both space overhead (for the pointer itself) and time overhead (updating the pointer when elements are inserted or erased at the end of the list).
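To make that overhead concrete, here is a rough sketch of two hand-written lists, one with only a head pointer and one that also tracks the tail. Node, HeadOnly and HeadAndTail are hypothetical names for illustration and say nothing about how any standard library lays out forward_list internally.

```cpp
#include <cassert>

// Hypothetical hand-written node, owned by the caller in this sketch.
struct Node {
    int value;
    Node* next;
};

// Head-only list: a single pointer of state.
struct HeadOnly {
    Node* head = nullptr;
    void push_front(Node* n) {
        n->next = head;
        head = n;
    }
};

// Head-and-tail list: O(1) push_back, at the price of a second pointer
// that every mutating operation has to keep consistent.
struct HeadAndTail {
    Node* head = nullptr;
    Node* tail = nullptr;
    void push_front(Node* n) {
        n->next = head;
        head = n;
        if (tail == nullptr) tail = n;  // extra check on every push_front
    }
    void push_back(Node* n) {
        n->next = nullptr;
        if (tail != nullptr) tail->next = n; else head = n;
        tail = n;
    }
};

// Space overhead: the tail pointer makes the list object itself larger.
static_assert(sizeof(HeadAndTail) > sizeof(HeadOnly),
              "tracking the tail costs at least one extra pointer");
```

The extra branch in push_front is the kind of time overhead the note refers to: it is paid even by users who never append at the end.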

std::forward_list::insert_after thread safety

In a shared std::list is it safe for multiple threads to call insert
concurrently if they are guaranteed to never call it with the same
position iterator?

No, it would not be safe under any circumstances.

Inserting into a std::list requires access to the nodes before and after the iterator position, and to the size of the list.

Even if the positions are far apart, std::list::size() is required to be constant time since C++11, so in practice every insertion updates the stored size of the list.

Edit:

In a shared std::forward_list is it safe for multiple threads to
call insert_after concurrently if they are guaranteed to never call it
with the same position iterator?

It's not safe and not recommended. No STL container was designed to support concurrent modification without external synchronization.

Anyway, let's assume some basic guarantees and a very simple version of std::forward_list: insert_after modifies the node pointed to by your iterator so that the node now points to the newly inserted node, while the newly inserted node points to the next node. So it's only going to be "safe" if the iterators are at least two nodes away from each other and your allocator is thread-safe.

Illustration:

Initial Forward_list
A -> B -> C -> D -> E

Insert K after C
A -> B -> C x D -> E
           \ /
            K

As you can see, C is read and modified. D may be read, while K is inserted.
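The simplified insert_after described above can be sketched on a hand-written node to show which memory locations it touches. This is only a model under the stated assumptions: Node and this free function insert_after are hypothetical, and a real implementation is free to read or write more than this.

```cpp
#include <cassert>

// Hypothetical minimal node; real implementations differ in detail.
struct Node {
    char name;
    Node* next;
};

// Simplified insert_after: links `fresh` in directly after `pos`.
// It reads pos->next, writes fresh->next, then writes pos->next.
inline void insert_after(Node* pos, Node* fresh) {
    assert(pos != nullptr && fresh != nullptr);
    fresh->next = pos->next;  // the new node points to the old successor
    pos->next = fresh;        // pos now points to the new node
}
```

In the illustration, inserting K after C means pos is C and fresh is K, so C's next pointer is both read and written by the inserting thread.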

As to why I said "at least two nodes away", let's take the case of one node away. Assume two threads want to insert K after C and M after D, respectively:

Initial Forward_list
A -> B -> C -> D -> E

Insert K after C and M after D
A -> B -> C x D x E
           \ / \ /
            K   M

From cppreference:

When an evaluation of an expression writes to a memory location and
another evaluation reads or modifies the same memory location, the
expressions are said to conflict.
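If several threads really do need to insert into one shared list, the usual way to avoid such conflicts is to serialize access with a mutex. A minimal sketch, assuming a hypothetical wrapper type locked_forward_list that is not part of the standard library:

```cpp
#include <cassert>
#include <forward_list>
#include <mutex>
#include <utility>

// Hypothetical wrapper: a forward_list paired with the mutex that guards it.
template <class T>
struct locked_forward_list {
    std::forward_list<T> items;
    std::mutex guard;

    // Holds the lock for the whole insertion, so two callers never read and
    // write the same node at the same time.
    typename std::forward_list<T>::iterator
    insert_after(typename std::forward_list<T>::const_iterator pos, T value) {
        std::lock_guard<std::mutex> lock(guard);
        return items.insert_after(pos, std::move(value));
    }
};
```

This only helps if every other access to items, including reads and traversal, takes the same lock. The sketch leaves items public for brevity, so that discipline is up to the caller.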

Move the first element to the end of the forward_list

For your purposes, splice_after needs an iterator to the last element. That is, the element right before end(). There's no cheap way to get this:

auto pos = l2.begin();
while(std::next(pos) != l2.end()) ++pos;

Then, splice_after for a single element asks for an iterator pointing before that element. For the first element, that is before_begin():

l2.splice_after(pos, l2, l2.before_begin());
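Combined into one function, the two steps might look like the sketch below. The name move_front_to_back is made up for this example, and the guard for short lists is an addition: the loop above assumes the list is not empty.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>

// Hypothetical helper: moves the first element to the end by relinking its
// node. No element is copied or moved; the walk to the last node is O(N).
template <class T>
void move_front_to_back(std::forward_list<T>& l2) {
    // With fewer than two elements there is nothing to rotate, and calling
    // std::next(l2.begin()) on an empty list would not be valid.
    if (l2.empty() || std::next(l2.begin()) == l2.end())
        return;
    auto pos = l2.begin();
    while (std::next(pos) != l2.end())
        ++pos;  // pos ends up on the last element
    // Relink the node that follows before_begin() (the first element)
    // so that it comes directly after pos.
    l2.splice_after(pos, l2, l2.before_begin());
}
```

For a list holding 1, 2, 3 this leaves 2, 3, 1.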

Difference between list and forward_list performance?

How should we decide which one to use?

Decide if you need bidirectional iteration. If forward iteration is good enough, use std::forward_list, unless you need to support C++ versions older than C++11, which may only have std::list.

Is there any performance benefit of one list over the other?

std::forward_list eliminates a pointer per node (with all the attendant benefits for the data cache and memory subsystem), while std::list provides constant-time iterator decrement.

But in practice, neither of these containers is as widely used as one might believe when attending computer science school. The real performance of std::vector is superior for many applications, and its memory usage is usually lower, since it stores no per-node pointers. More demanding applications requiring lists would do well to consider intrusive lists, which standard C++ does not provide.
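The iterator difference mentioned above can be checked at compile time. The sketch below also shows what it means for a common task, reading the last element; last_of is a name invented for this example.

```cpp
#include <cassert>
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>

using forward_category =
    std::iterator_traits<std::forward_list<int>::iterator>::iterator_category;
using list_category =
    std::iterator_traits<std::list<int>::iterator>::iterator_category;

static_assert(std::is_same<forward_category, std::forward_iterator_tag>::value,
              "forward_list iterators only move forward");
static_assert(std::is_same<list_category, std::bidirectional_iterator_tag>::value,
              "list iterators can also be decremented");

// std::list: step back once from end(), O(1).
inline int last_of(const std::list<int>& l) {
    assert(!l.empty());
    return *std::prev(l.end());
}

// std::forward_list: its iterators have no operator--, so walk from the
// front, O(N).
inline int last_of(const std::forward_list<int>& l) {
    assert(!l.empty());
    auto it = l.begin();
    while (std::next(it) != l.end())
        ++it;
    return *it;
}
```

If your code needs something like the first overload often, that is a sign std::list (or another container) fits better than std::forward_list.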

Why does C++ not provide a First-In-First-Out singly-linked list?

You can easily create one yourself by augmenting forward_list with a before-the-end iterator to implement back() and push_back():

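#include <forward_list>
#include <utility>
template<class T>
struct fifo_list {
    std::forward_list<T> base;
    typename std::forward_list<T>::iterator before_end = base.before_begin();
    fifo_list() = default;  // not generated implicitly once other constructors are declared
    fifo_list(fifo_list const& other) { for (auto const& t : other.base) push_back(t); }
    // Not defaulted: moving from an empty list would leave before_end pointing into `other`.
    fifo_list(fifo_list&& other)
        : base(std::move(other.base)),
          before_end(base.empty() ? base.before_begin() : other.before_end) {
        other.base.clear();
        other.before_end = other.base.before_begin();
    }
    T& front() { return base.front(); }
    T& back() { return *before_end; }  // precondition: the list is not empty
    // Reset before_end when the last node is removed, otherwise it would dangle.
    void pop_front() { base.pop_front(); if (base.empty()) before_end = base.before_begin(); }
    void push_back(T t) { before_end = base.insert_after(before_end, std::move(t)); }
};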

Once the member types and the empty() and size() members that std::queue expects are added, this can be used with the std::queue adapter.

The overhead associated with maintaining the before_end iterator is presumably the reason why this facility (back and push_back) is not included in forward_list already.
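For use with std::queue, the container also needs a few member types plus empty() and size(). Below is a self-contained sketch of such a container. The name queue_ready_fifo and the element counter are assumptions made for this example (forward_list itself has no size()), and move operations are left out so that moves fall back to copies.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <queue>
#include <utility>

// Hypothetical container, named queue_ready_fifo for illustration only.
template <class T>
class queue_ready_fifo {
public:
    // Member types that std::queue reads from its underlying container.
    using value_type = T;
    using size_type = std::size_t;
    using reference = T&;
    using const_reference = const T&;

    queue_ready_fifo() = default;
    queue_ready_fifo(const queue_ready_fifo& other) {
        for (const T& t : other.base_) push_back(t);
    }
    queue_ready_fifo& operator=(const queue_ready_fifo& other) {
        if (this != &other) {
            base_.clear();
            before_end_ = base_.before_begin();
            count_ = 0;
            for (const T& t : other.base_) push_back(t);
        }
        return *this;
    }

    bool empty() const { return base_.empty(); }
    size_type size() const { return count_; }
    reference front() { return base_.front(); }
    const_reference front() const { return base_.front(); }
    // back() requires a non-empty container, as with the standard containers.
    reference back() { return *before_end_; }
    const_reference back() const { return *before_end_; }

    void push_back(T value) {
        before_end_ = base_.insert_after(before_end_, std::move(value));
        ++count_;
    }
    void pop_front() {
        base_.pop_front();
        --count_;
        // Removing the last node would leave before_end_ dangling.
        if (base_.empty()) before_end_ = base_.before_begin();
    }

private:
    std::forward_list<T> base_;
    typename std::forward_list<T>::iterator before_end_ = base_.before_begin();
    size_type count_ = 0;
};
```

It can then be plugged in as std::queue<int, queue_ready_fifo<int>>, after which push, pop, front, back, size and empty behave like those of any other queue.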

Construct new forward_list from iterators into existing forward_list via node splicing

This is a working version based on my original suggestion to use a bunch of single-node std::forward_lists. It feels inefficient to create all these std::forward_lists, but from what I can tell, std::forward_list is specified so narrowly that it's almost guaranteed to be implemented as a wrapper around a single pointer, which should be pretty low overhead. And it does work, with no copies or moves of the contained elements, nor (thanks to the use of deque) any copies or moves of the forward_lists themselves (aside from when we empty them to splice them onto the output forward_list). It also traverses the input forward_list only once, destroying it by extracting the first node over and over until it's emptied.

It's not the prettiest, but it's not nearly as ugly or inefficient as I was expecting.

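#include <algorithm>
#include <deque>
#include <forward_list>
#include <iostream>
#include <random>
#include <string>
#include <utility>

int main(int argc, char **argv)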
{
    std::forward_list<std::string> args(&argv[1], &argv[argc]);
    std::deque<std::forward_list<std::string>> deq;  // Use deque so we never have to move existing forward_lists, and don't need to pre-walk to find size to reserve for vector
    while (!args.empty()) {
        auto& flist = deq.emplace_back();
        flist.splice_after(flist.cbefore_begin(), args, args.cbefore_begin());  // Extract a single node from input to populate new forward_list
    }

    std::shuffle(std::begin(deq), std::end(deq), std::mt19937(42));  // Shuffle with reproducible PRNG

    std::cout << "Shuffled deq:\n";
    for (auto& flist : deq) {
        std::cout << flist.front() << std::endl;
    }

    std::forward_list<std::string> shuffled_args;
    auto splice_loc = shuffled_args.cbefore_begin();
    for (auto&& flist : deq) {
        shuffled_args.splice_after(splice_loc, std::move(flist));  // splice single element forward_list contents onto end
        ++splice_loc;  // We added one element, move the iterator forward by one
    }

    std::cout << "Shuffled list:\n";
    for (const auto& s : shuffled_args) {
        std::cout << s << std::endl;
    }

    return 0;
}

To be clear, if anyone has any better solutions, I'm happy to hear about them and would love to give the checkmark to something cleaner.
