Minimizing Copies When Writing Large Data to a Socket

Minimizing copies when writing large data to a socket

It seems like my suspicions were correct. I got my information from this article. Quoting from it:

Also these network write system calls, including sendfile, might and
in many cases do return before the data sent over TCP by the method
call has been acknowledged. These methods return as soon as all data
is written into the socket buffers (sk_buff) and is pushed to the TCP
write queue; the TCP engine can manage alone from that point on. In
other words, at the time sendfile returns, the last TCP send window is
not actually sent to the remote host but queued. In cases where
scatter-gather DMA is supported, there is no separate buffer which
holds these bytes; rather, the buffers (sk_buffs) just hold pointers to
the pages of the OS buffer cache, where the contents of the file are
located. This might lead to a race condition if we modify the content
of the file corresponding to the data in the last TCP send window as
soon as sendfile has returned. As a result, the TCP engine may send
newly written data to the remote host instead of what we originally
intended to send.

Provided the buffer from an mmapped file is even considered "DMA-able", it seems there is no way to know when it is safe to reuse it without an explicit acknowledgement (over the network) from the actual client. I might have to stick to simple write calls and incur the extra copy. There is a paper (also linked from the article) with more details.

Edit: This article on the splice call describes the same problem. Quoting it:

Be aware, when splicing data from a mmap'ed buffer to a network
socket, it is not possible to say when all data has been sent. Even if
splice() returns, the network stack may not have sent all data yet. So
reusing the buffer may overwrite unsent data.
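For reference, here is a minimal Go sketch of the plain-write fallback mentioned above (the function name and chunk size are illustrative). Because a plain write copies the data into the kernel's socket buffers before returning, the user-space buffer can be reused immediately, at the cost of that extra copy:

package transfer

import (
	"io"
	"net"
	"os"
)

// sendFileByCopy streams a file to conn through a user-space buffer.
// Unlike sendfile/splice, a plain Write has already copied the data
// into the kernel's socket buffers by the time it returns, so buf can
// safely be reused for the next chunk.
func sendFileByCopy(conn net.Conn, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	buf := make([]byte, 64*1024) // chunk size is arbitrary
	for {
		n, err := f.Read(buf)
		if n > 0 {
			if _, werr := conn.Write(buf[:n]); werr != nil {
				return werr
			}
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}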

Reusing socket when copying large amounts of data

Do you know how much data you have before sending it over? If so, I'd basically length-prefix the message.

That's much easier to handle cleanly than using an end-of-stream token and having to worry about escaping, over-reading, etc.

But yes, you'll need to do something, because TCP/IP is a stream-based protocol. Unless you have some indicator for the end of the data, you never know whether there might be some more to come Real Soon Now.
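As a minimal sketch of length-prefix framing in Go, assuming a 4-byte big-endian length header (the function names are illustrative):

package framing

import (
	"encoding/binary"
	"io"
)

// writeMsg sends a 4-byte big-endian length prefix followed by the payload.
func writeMsg(w io.Writer, payload []byte) error {
	var hdr [4]byte
	binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readMsg reads one length-prefixed message from the stream.
func readMsg(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	payload := make([]byte, binary.BigEndian.Uint32(hdr[:]))
	_, err := io.ReadFull(r, payload)
	return payload, err
}

In real code you would also want to cap the decoded length before allocating, so a corrupt or malicious peer can't make the receiver allocate gigabytes.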

Golang writing to a socket without worrying about incomplete data

The canonical way to write bytes to a socket is:

_, err := conn.Write(msg)
if err != nil {
    // handle error
}

There's no need to loop because Write returns an error if the full buffer is not written. Write is different from Read in this regard. Read can succeed without filling the buffer.
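The read side is the opposite. As a sketch (assuming conn is a net.Conn and n is the number of bytes you expect), the standard library's io.ReadFull keeps calling Read until the buffer is filled:

buf := make([]byte, n)
// Read may return fewer than n bytes; io.ReadFull loops over Read
// until buf is full or an error occurs.
if _, err := io.ReadFull(conn, buf); err != nil {
    // handle error
}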

Writing directly to socket vs to buffer

Applications buffer writes to a network connection because a single write call with a large buffer is more efficient than multiple write calls with small buffers.

Call TCPConn.SetNoDelay(false) to enable Nagle's algorithm, which makes the operating system delay packet transmission in the hope of coalescing small writes into fewer packets.

There is no option to explicitly flush a TCP connection’s buffer.

Before writing your own utilities, take a look at the bufio.Writer type. Many applications use this type to buffer writes to TCP connections and files.
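For example, a minimal sketch (the address is illustrative): bufio.Writer accumulates the small writes and only touches the socket when its buffer fills or Flush is called.

package main

import (
	"bufio"
	"log"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "example.com:7") // address is illustrative
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// bufio.Writer's default buffer is 4096 bytes; use NewWriterSize to tune.
	w := bufio.NewWriter(conn)
	w.WriteString("many ")
	w.WriteString("small ")
	w.WriteString("writes\n")

	// Nothing may have reached the socket yet; Flush performs the single
	// large Write on the underlying connection.
	if err := w.Flush(); err != nil {
		log.Fatal(err)
	}
}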

What is the expected behaviour if file is written/altered while sendfile() is in progress

The expected behavior is that the result is unpredictable. sendfile() is not atomic; it's effectively equivalent to writing your own loop that calls read() on the file descriptor and write() on the socket descriptor. If some other process writes to the file while this is going on, you'll get a mix of the old and new contents. Think of it mainly as a convenience function, although it should also be significantly more efficient, since it doesn't require multiple system calls and can copy directly between the file buffer cache and the socket buffer rather than copying back and forth through application buffers. Because of this, the window of vulnerability is smaller than with a read/write loop, but it's still there.

To avoid this problem, you should use file locking between the process(es) doing the writing and the one calling sendfile(). So the sequence should be:

lock the file
call sendfile()
unlock the file

and the writing process should do:

lock the file
write to the file
unlock the file
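A sketch of that sequence in Go on Linux, using flock(2) for advisory locking. Here sockFd is assumed to be the raw descriptor of an already-connected socket, and the writing process must take LOCK_EX around its writes; as the edit below notes, this narrows the race but does not by itself make buffer reuse safe.

//go:build linux

package locked

import (
	"os"
	"syscall"
)

// sendLocked transfers the whole file over sockFd while holding a
// shared flock, so a cooperating writer (holding LOCK_EX) cannot
// modify the file mid-transfer.
func sendLocked(sockFd int, path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	fd := int(f.Fd())

	// lock the file
	if err := syscall.Flock(fd, syscall.LOCK_SH); err != nil {
		return err
	}
	// unlock the file when done
	defer syscall.Flock(fd, syscall.LOCK_UN)

	st, err := f.Stat()
	if err != nil {
		return err
	}

	// call sendfile() until the whole file has been handed to the kernel
	remaining := st.Size()
	var off int64
	for remaining > 0 {
		n, err := syscall.Sendfile(sockFd, fd, &off, int(remaining))
		if err != nil {
			return err
		}
		remaining -= int64(n)
	}
	return nil
}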

EDIT:

Actually, it looks like it isn't this simple, because sendfile() links the socket buffer to the file buffer cache, rather than copying it in the kernel. Since sendfile() doesn't wait for the data to be sent, modifying the file after it returns could still affect what gets sent. You need to check that the data has all been received with application-layer acknowledgements. See Minimizing copies when writing large data to a socket for an excerpt from an article that explains these details.
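As a sketch of such an acknowledgement (the one-byte ack protocol here is purely illustrative, not part of sendfile): the sender blocks until the receiver confirms it has consumed everything, and only then touches the file again.

// Receiver side: after reading the whole file, send a single ack byte.
// Sender side: wait for that byte before modifying or reusing the file.
ack := make([]byte, 1)
if _, err := io.ReadFull(conn, ack); err != nil {
    // handle error -- without the ack, the file must not be modified
}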

Transferring big data over Socket

while (socketChannel.write(byteBuffer) > 0) {
}

If you aren't in blocking mode, this loop will terminate as soon as the local socket send buffer fills up, which causes write() to return zero.

A simple but poor-quality fix would be to change the loop condition from > to >=, which makes the loop busy-wait whenever the send buffer is full. If write() returns zero, you should really register the channel for OP_WRITE with a Selector and resume writing when the channel becomes writable again.

Python frame data optimization before sending through socket

tl;dr: struct is actually slow. Instead of pickle, use np.ndarray.tobytes() combined with np.frombuffer() to eliminate the overhead.

I'm not well versed in OpenCV, which probably has the best answer here, but a drop-in way to speed up the transfer could be to use struct to pack and unpack the data sent over the network instead of pickle.

Here's an example of sending a numpy array of known dimensions over a socket using struct:

import numpy as np
import socket
import struct

# ----- server ------
conn = socket.socket()
# connect socket somewhere
arr = np.random.randint(0, 256, (320, 240, 3), dtype="B")  # unsigned bytes "B": camera likely returns 0-255 pixel values
conn.sendall(struct.pack('230400B', *arr.flat))  # 230400 = 320*240*3 unsigned bytes

# ----- client ------
conn = socket.socket()
# connect socket somewhere
data = b''
while len(data) < 230400:  # recv() may return fewer bytes than requested, so loop
    data += conn.recv(230400 - len(data))
arr = np.array(struct.unpack('230400B', data), dtype='B').reshape((320, 240, 3))

EDIT

A little digging shows numpy has a tobytes method that returns the array's raw data as a bytes object. This basically does the work of struct without needing to unpack every element as a separate argument in the call to encode. That prompted me to see whether we could skip struct on the decoding side too, and as long as you're okay with flying by the seat of your pants a little bit (interruptions or errors would not be caught gracefully), np.frombuffer can rebuild the array from those bytes with almost zero overhead, making your network the only limiting factor.

testing script:

from time import time
import pickle
import struct

import numpy as np

arr = np.random.randint(0, 256, (320, 240, 3), dtype="B")  # unsigned bytes "B": camera likely returns 0-255 pixel values

t = time()
for _ in range(100):
    arr2 = pickle.loads(pickle.dumps(arr))
print(f'pickle pack, pickle unpack: {time()-t} sec')

t = time()
for _ in range(100):
    arr2 = np.array(struct.unpack('230400B', struct.pack('230400B', *arr.flat)), dtype='B').reshape((320, 240, 3))
print(f'struct pack, struct unpack: {time()-t} sec')

t = time()
for _ in range(100):
    arr2 = np.array(struct.unpack('230400B', arr.tobytes()), dtype='B').reshape((320, 240, 3))
print(f'numpy pack, struct unpack: {time()-t} sec')

t = time()
for _ in range(100):
    arr2 = np.frombuffer(arr.tobytes(), dtype="B").reshape((320, 240, 3))
print(f'numpy pack, numpy unpack: {time()-t} sec')

prints:

pickle pack, pickle unpack: 0.005013704299926758 sec
struct pack, struct unpack: 3.558577299118042 sec
numpy pack, struct unpack: 1.2988512516021729 sec
numpy pack, numpy unpack: 0.0010025501251220703 sec

