Why Do I Need Strand Per Connection When Using Boost::Asio

Why do I need strand per connection when using boost::asio?

The documentation is correct. With a half-duplex protocol implementation, such as HTTP Server 3, the strand is not necessary. The call chains can be illustrated as follows:

void connection::start()
{
  socket_.async_read_some(..., &handle_read);  ------.
}                                                    |
    .------------------------------------------------'
    |      .-----------------------------------------.
    V      V                                         |
void connection::handle_read(...)                    |
{                                                    |
  if (result)                                        |
    boost::asio::async_write(..., &handle_write); ---|--.
  else if (!result)                                  |  |
    boost::asio::async_write(..., &handle_write);  --|--|
  else                                               |  |
    socket_.async_read_some(..., &handle_read);  ----'  |
}                                                       |
    .---------------------------------------------------'
    |
    V
void handle_write(...)

As shown in the illustration, only a single asynchronous event is started per path. With no possibility of concurrent execution of the handlers or operations on socket_, it is said to be running in an implicit strand.

Thread Safety

While it does not present itself as an issue in the example, I would like to highlight one important detail of strands and composed operations, such as boost::asio::async_write. Before explaining the details, let's first cover the thread safety model of Boost.Asio. For most Boost.Asio objects, it is safe to have multiple asynchronous operations pending on an object; only concurrent calls on the object are specified as unsafe. In the diagrams below, each column represents a thread and each line represents what a thread is doing at a moment in time.

It is safe for a single thread to make sequential calls while other threads make none:

 thread_1                             | thread_2
--------------------------------------+---------------------------------------
socket.async_receive(...);            | ...
socket.async_write_some(...);         | ...

It is safe for multiple threads to make calls, but not concurrently:

 thread_1                             | thread_2
--------------------------------------+---------------------------------------
socket.async_receive(...);            | ...
...                                   | socket.async_write_some(...);

However, it is not safe for multiple threads to make calls concurrently¹:

 thread_1                             | thread_2
--------------------------------------+---------------------------------------
socket.async_receive(...);            | socket.async_write_some(...);
...                                   | ...

Strands

To prevent concurrent invocations, handlers are often invoked from within strands. This is done by either:

- Wrapping the handler with strand.wrap. This returns a new handler that dispatches through the strand.
- Posting or dispatching directly through the strand.

Composed operations are unique in that intermediate calls to the stream are invoked within the handler's strand, if one is present, instead of the strand in which the composed operation is initiated. Compared to other operations, this inverts where the strand is specified. The following example focuses on strand usage: a socket is read from with a non-composed operation and concurrently written to with a composed operation.

void start()
{
  // Start read and write chains.  If multiple threads have called run on
  // the service, then they may be running concurrently.  To protect the
  // socket, use the strand.
  strand_.post(&read);
  strand_.post(&write);
}

// read always needs to be posted through the strand because it invokes a
// non-composed operation on the socket.
void read()
{
  // async_receive is initiated from within the strand.  The handler does
  // not affect the strand in which async_receive is executed.
  socket_.async_receive(read_buffer_, &handle_read);
}

// This is not running within a strand, as read did not wrap it.
void handle_read()
{
  // Need to post read into the strand, otherwise the async_receive would
  // not be safe.
  strand_.post(&read);
}

// The entry into the write loop needs to be posted through a strand.
// All intermediate handlers and the next iteration of the asynchronous write
// loop will be running in a strand due to the handler being wrapped.
void write()
{
  // async_write will make one or more calls to socket_.async_write_some.
  // All intermediate handlers (calls after the first), are executed
  // within the handler's context (strand_).
  boost::asio::async_write(socket_, write_buffer_,
                           strand_.wrap(&handle_write));
}

// This will be invoked from within the strand, as it was a wrapped
// handler in write().
void handle_write()
{
  // handle_write() is invoked within a strand, so write() does not
  // have to be dispatched through the strand.
  write();
}

Importance of Handler Types

Also, within composed operations, Boost.Asio uses argument-dependent lookup (ADL) to invoke intermediate handlers through the completion handler's strand. As such, it is important that the completion handler's type has the appropriate asio_handler_invoke() hooks. If the handler is type-erased into a type that lacks these hooks, for example when a boost::function is constructed from the return type of strand.wrap, then intermediate handlers will execute outside of the strand, and only the completion handler will execute within the strand. See this answer for more details.

In the following code, all intermediate handlers and the completion handler will execute within the strand:

boost::asio::async_write(stream, buffer, strand.wrap(&handle_write));

In the following code, only the completion handler will execute within the strand. None of the intermediate handlers will execute within the strand:

boost::function<void()> handler(strand.wrap(&handle_write));
boost::asio::async_write(stream, buffer, handler);

¹ The revision history documents an anomaly to this rule. If supported by the OS, synchronous read, write, accept, and connection operations are thread safe. I am including it here for completeness, but suggest using it with caution.

Boost.Asio - when is explicit strand wrapping needed when using make_strand

If an executor is not specified or bound, the "associated executor" is used.

For member async initiation functions the default executor is the one from the IO object. In your case it would be the socket which has been created "on" (with) the strand executor. In other words, socket.get_executor() already returns the strand<> executor.

Only when posting do you need to either specify the strand executor or bind the handler to it, so that it becomes the handler's associated executor.

See also:
When must you pass io_context to boost::asio::spawn? (C++)
Why is boost::asio::io_service designed to be used as a parameter?

Does multithreaded http processing with boost asio require strands?

If the async chain of operations creates a logical strand, then often you don't need explicit strands.

Also, if the execution context is only ever run/polled from a single thread, then all async operations will effectively be on that implicit strand.

The examples serve more than one purpose.

On the one hand, they're obviously kept simple. Naturally there will be a minimum number of threads or simplistic chains of operations.
However, that leads to over-simplified examples that have too little relation to real life.

Therefore, even if it's not absolutely required, the samples often show good practice or advanced patterns. Sometimes (often, IME) this is even explicitly commented, e.g. at line 277 of the very example you linked:

 // We need to be executing within a strand to perform async operations
 // on the I/O objects in this session. Although not strictly necessary
 // for single-threaded contexts, this example code is written to be
 // thread-safe by default.
 net::dispatch(stream_.get_executor(),
               beast::bind_front_handler(
                   &session::do_read,
                   shared_from_this()));

Motivational example

This allows people to solve their next non-trivial task. For example, imagine you wanted to add stop() to the listener class from the linked example. There's no way to do that safely without a strand. You would need to "inject" a call to acceptor.cancel() inside the logical "strand", the async operation chain containing async_accept. But you can't, because async_accept is "logically blocking" that chain. So you actually do need to post to an explicit strand:

void stop() {
  post(acceptor_.get_executor(), [this] { acceptor_.cancel(); });
}

Am I paranoid while using boost:asio?

Yes, indeed. The documentation details this here:

Threads And Boost Asio
By only calling io_service::run() from a single thread, the user's code can avoid the development complexity associated with synchronisation. For example, a library user can implement scalable servers that are single-threaded (from the user's point of view).

Thinking a bit more broadly, your scenario is the simplest form of having a single logical strand. There are other ways in which you can maintain logical strands (by chaining handlers), see this most excellent answer on the subject: Why do I need strand per connection when using boost::asio?

asio::strand<asio::io_context::executor_type> vs io_context::strand

io_context::strand is "legacy". I assumed it exists for interface compatibility with code that still uses boost::asio::io_service (also deprecated).

As the comments reflect, I've since found out that io_context::strand is not actually deprecated, although I see no reason why this is the case. A close reading of the implementation leaves me with the conclusion that:

- asio::strand<Executor> is strictly better
- mixing both strand services is not the best idea. In fact, both services are documented with the same tag line:
 // Default service implementation for a strand.
I can't help feeling there should be only one default :)

Modern strands don't reference an execution context, but wrap an executor.

While being technically different, it is conceptually the same.

The usage is the same for posting tasks:

post(s, task); // where s is either legacy or modern 
defer(s, task);
dispatch(s, task);

In fact you may have tasks with an associated executor, see:
When must you pass io_context to boost::asio::spawn? (C++)
Which io_context does std::boost::asio::post / dispatch use?

You can no longer use the legacy strand to construct IO service objects (like tcp::socket or steady_timer). That's because the legacy strand cannot be type-erased into any_io_executor:

using tcp = boost::asio::ip::tcp;
boost::asio::io_context ioc;

auto modern = make_strand(ioc.get_executor());
tcp::socket sock(modern);

boost::asio::io_context::strand legacy(ioc);
tcp::socket sock2(legacy); // COMPILE ERROR: no conversion to any_io_executor

If you really want you can force it by not using any_io_executor:

boost::asio::basic_stream_socket<tcp, decltype(legacy)> sock(legacy);
sock.connect({{}, 8989});
