Determine If There Is Data Left on the Socket and Discard It

TCP itself has no knowledge of what constitutes an application protocol message. There are, however, two common ways to delimit messages in a TCP stream:

  • prefix messages with their size and read that many bytes, or
  • read till a certain sequence of bytes is found.

In this light, a generic TCP reader should provide two reading functions to be universally useful:

  • to read N bytes, and
  • to read until a delimiter is found.

A design similar to Tornado IOStream reading functions would do.
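
As a rough illustration of those two reading functions, here is a minimal blocking sketch in Python. The class name BufferedSocketReader is made up for this example; only the method names read_bytes and read_until are borrowed from Tornado's IOStream, and this is not how Tornado implements them. Both methods share one buffer, so bytes received past a delimiter are kept for the next call.

```python
import socket


class BufferedSocketReader:
    """Hypothetical helper: read N bytes or read until a delimiter."""

    def __init__(self, sock, chunk_size=4096):
        self._sock = sock
        self._chunk_size = chunk_size
        self._buffer = bytearray()

    def _fill(self):
        # Pull whatever the kernel has for us; b"" means the peer closed.
        chunk = self._sock.recv(self._chunk_size)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        self._buffer += chunk

    def _take(self, count):
        # Remove and return the first `count` buffered bytes.
        data = bytes(self._buffer[:count])
        del self._buffer[:count]
        return data

    def read_bytes(self, n):
        """Block until exactly n bytes are available, then return them."""
        while len(self._buffer) < n:
            self._fill()
        return self._take(n)

    def read_until(self, delimiter):
        """Block until the delimiter shows up; return data including it."""
        while True:
            index = self._buffer.find(delimiter)
            if index != -1:
                return self._take(index + len(delimiter))
            self._fill()
```

A caller would typically alternate the two, for example read_until(b"\r\n") for a header line followed by read_bytes(length) for the body.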

How do I tell a socket to throw away all pending data without closing it?

If you want to throw away all the data and instantly abort the connection, then you can set the SO_LINGER option and close.

setsockopt(fd, SOL_SOCKET, SO_LINGER,
           &(struct linger){.l_onoff = 1, .l_linger = 0}, sizeof(struct linger));

With this option set, close will immediately reset the connection (it will send a RST segment) instead of delivering any data that is still unsent.

If, however, you want to throw away some data and keep the connection, this is likely not doable. There's no API (that I know of) to tell the kernel: "ignore the next 100 bytes: I don't want read to unblock on them". So you'll just have to call read until you decide the data is usable again.
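
For what it's worth, "call read until the data is usable again" can be sketched in Python roughly like this, assuming you know how many bytes you want to skip. The function name discard_bytes is invented for the example; the bytes are still read from the kernel, they are just not kept.

```python
import socket


def discard_bytes(sock, count, chunk_size=4096):
    """Read and drop `count` bytes; return how many were actually dropped."""
    dropped = 0
    while dropped < count:
        # Never ask for more than we still want to skip, so that the
        # bytes following the unwanted region stay in the socket.
        chunk = sock.recv(min(chunk_size, count - dropped))
        if not chunk:
            break  # peer closed before we skipped everything
        dropped += len(chunk)
    return dropped
```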

How to determine if all data is received in the socket buffer?

There is no way to determine whether OnRead has finished reading because a stream, by definition, has no end. This means that you should not simply send binary data without framing information. You could, for example, send the number of bytes first (as, let's say, a 4-byte unsigned integer) and then the bytes you wish to send.

On the receiver's side, you would first read, let's say, the 4 bytes and now you know how many bytes you can expect.
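
A minimal Python sketch of that idea, assuming a 4-byte unsigned length in network byte order (the byte order is my choice, the answer above doesn't specify one). The names send_message, recv_exactly and recv_message are made up for the example.

```python
import socket
import struct

# Assumed header layout: 4-byte unsigned integer, network byte order.
HEADER = struct.Struct("!I")


def send_message(sock, payload):
    """Send the payload length first, then the payload itself."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(sock, n):
    """Keep calling recv until exactly n bytes have arrived."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        data += chunk
    return bytes(data)


def recv_message(sock):
    """Read the 4-byte length, then read that many payload bytes."""
    (length,) = HEADER.unpack(recv_exactly(sock, HEADER.size))
    return recv_exactly(sock, length)
```

Because the receiver always knows how many bytes to expect, two messages sent back to back can't run together, no matter how recv happens to split the stream.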

How to check amount of data available for a socket in C and Linux

Yes:

#include <sys/ioctl.h>

...

int count;
ioctl(fd, FIONREAD, &count);
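
The same request can be made from Python through the fcntl module. This is a sketch for Linux and similar Unix systems only, and bytes_available is a name I made up. Keep in mind the count is only a snapshot: more data may arrive right after you ask.

```python
import fcntl
import socket
import struct
import termios


def bytes_available(sock):
    """Return how many bytes can currently be read without blocking."""
    # FIONREAD fills in a C int; pass a buffer of that size and unpack it.
    raw = fcntl.ioctl(sock.fileno(), termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]
```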

How to clear data stored in a socket in python

From Steffen's suggestion, I am using recv calls to clear the buffer. Currently, I'm setting the socket to non-blocking and calling recv until a BlockingIOError. It would be much appreciated if anyone could point out a more graceful solution (that doesn't use exceptions).

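connection.setblocking(False)
while True:
    try:
        chunk = connection.recv(4096)
    except BlockingIOError:
        break  # nothing left to read right now
    if not chunk:
        break  # peer closed the connection; recv would keep returning b""
connection.setblocking(True)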
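
One possible way to avoid the exception is to ask select, with a zero timeout, whether the socket is readable before each recv. This is only a sketch (the name drain is mine), and it discards just what has already arrived; data still in flight will show up later.

```python
import select
import socket


def drain(sock, chunk_size=4096):
    """Discard everything currently readable; return the byte count."""
    discarded = 0
    while True:
        # A timeout of 0 makes select a pure poll: it never blocks.
        readable, _, _ = select.select([sock], [], [], 0)
        if not readable:
            return discarded  # nothing pending right now
        chunk = sock.recv(chunk_size)
        if not chunk:
            return discarded  # peer closed the connection
        discarded += len(chunk)
```

The socket stays in blocking mode throughout, so there is no mode to restore afterwards.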

TCP need to discard info on the buffer or make it faster

While Sam Varshavchik's suggestion of using a thread is good, there's another option.

You already set your socket to non-blocking with fcntl(new_socket, F_SETFL, O_NONBLOCK);. So, at each loop you should read everything there is to read and send everything there is to send. If you don't tie one-to-one the reading and writing, both sides will be able to catch up.

The main hint that you need to fix this is that you don't use the read return value, valread. Your code should look like:

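while (true) { // main loop

...

    // read everything that is currently available
    while ((valread = read(new_socket, buffer, 1024)) > 0)
    {
        // deal with the valread bytes now in buffer.
        // deal with receiving more than one packet per iteration
    }
    // valread == 0: the peer closed the connection
    // valread == -1 with errno EAGAIN/EWOULDBLOCK: nothing more to read for now

    // send code done a single time per loop.
}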

There is still plenty of architecture you need in order to have a clean, resilient main loop that sends and receives, but I hope that points you in a useful direction.

How does the python socket.recv() method know that the end of the message has been reached?

It depends on the protocol. Some protocols, like UDP, are message oriented: exactly one message is returned per recv. Assuming you are talking about TCP specifically, there are several factors involved. TCP is stream oriented, and because of things like the amount of currently outstanding send/recv data, lost/reordered packets on the wire, delayed acknowledgement of data, and the Nagle algorithm (which holds back some small sends while earlier data is still unacknowledged), its behavior can change subtly as a conversation between client and server progresses.

All the receiver knows is that it is getting a stream of bytes. It could get anything from 1 byte up to the full requested buffer size on any recv. There is no one-to-one correlation between the send call on one side and the recv call on the other.

If you need to figure out message boundaries, it's up to the higher-level protocol to define them. Take HTTP for example. It starts with a header made of \r\n-delimited lines, ending with an empty line, and that header typically carries a count (Content-Length) of the remaining bytes the client should expect to receive. The client knows how to read the header because of the \r\n, then knows exactly how many bytes are coming next. Part of the charm of RESTful protocols is that they are HTTP based and somebody else already figured this stuff out!

Some protocols use NUL to delimit messages. Others may have a fixed length binary header that includes a count of any variable data to come. I like zeromq which has a robust messaging system on top of TCP.

More details on what happens with receive...

When you do recv(1024), there are 6 possibilities:

  1. There is no receive data. recv will wait until there is receive data. You can change that by setting a timeout.

  2. There is partial receive data. You'll get that part right away. The rest is either buffered or hasn't been sent yet and you just do another recv to get more (and the same rules apply).

  3. There are more than 1024 bytes available. You'll get 1024 bytes of that data and the rest is buffered in the kernel waiting for another receive.

  4. The other side has shut down the socket. You'll get 0 bytes of data. 0 means you will never get more data on that socket. But if you keep asking for data, you'll keep getting 0 bytes.

  5. The other side has reset the socket. You'll get an exception.

  6. Some other strange thing has gone on and you'll get an exception for that.
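
To make those cases concrete, here is a small sketch that sorts a single recv into one of them. The function name recv_once and the status strings are invented for the example, and note that it changes the socket's timeout as a side effect.

```python
import socket


def recv_once(sock, size=1024, timeout=None):
    """Do one recv and report which of the cases above occurred."""
    sock.settimeout(timeout)  # None means wait forever (case 1)
    try:
        data = sock.recv(size)
    except socket.timeout:
        return "timeout", b""   # case 1, with a timeout set
    except ConnectionResetError:
        return "reset", b""     # case 5
    except OSError:
        return "error", b""     # case 6
    if not data:
        return "closed", b""    # case 4: the other side shut down
    return "data", data         # cases 2 and 3: up to `size` bytes
```

The order of the except clauses matters: the timeout and reset exceptions are subclasses of OSError, so they have to be caught first.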

How to be sure that write(2) has written all the data to the socket/file-descriptor?

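size_t done;
ssize_t ret;

/* keep calling write until all size bytes of buff have been accepted */
for (done = 0; done < size; done += ret) {
    ret = write(client, buff + done, size - done);

    if (ret == 0) return -ENODEV;                           /* no progress possible */
    if (ret == -1 && errno == EINTR) { ret = 0; continue; } /* interrupted: retry */
    if (ret == -1) return -errno;                           /* real error */
}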
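
For comparison, the same "loop until everything is written" idea on a Python socket might look like the sketch below (send_all is a made-up name). In practice the built-in socket.sendall already does this for you, and Python retries interrupted system calls on its own, so there is no EINTR branch here.

```python
import socket


def send_all(sock, data):
    """Call send repeatedly until every byte of data has been accepted."""
    view = memoryview(data)  # lets us slice without copying
    sent_total = 0
    while sent_total < len(view):
        # send may accept fewer bytes than offered; it returns the count.
        sent = sock.send(view[sent_total:])
        if sent == 0:
            raise ConnectionError("socket connection broken")
        sent_total += sent
    return sent_total
```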

