Broken Tcp Messages

Broken TCP messages

You need message framing. The protocol must specify how large each message is, usually through a constant well-known size, a length prefix, or a message delimiter.

Proper way of breaking a TCP stream into messages (in C)

You have 2 problems in your code:

the sender (client prog) exits as soon as the last send call returns. Because you are trying to send 20*3000 bytes, you can expect (on a real network rather than the loopback interface of a single machine) that some bytes have only been queued for transfer and have not yet been received at that moment. The end of the client program then abruptly closes the socket, and the queued bytes may never be delivered.
the receiver expects all 60000 bytes to arrive and never tests for an early peer closure of the socket. If that happens (and because of the sender-side problem, it is to be expected), the receiver falls into an endless loop, reading 0 bytes from a socket that the sender has already closed.

What to do:

the receiver should test for a 0-byte read. If it happens, it means that nothing more will ever come from the socket, and the receiver should immediately abort with an error message. At least the problem will be easier to diagnose.
the sender should not abruptly close its socket but instead use a graceful shutdown: when everything has been sent, it should call shutdown on the socket to notify the peer that nothing more will be sent, and then wait for the receiver to close the socket once everything has been correctly transmitted. A blocking read of a few bytes is enough: the read will block until the peer closes its socket and will then return 0 bytes.
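The two fixes above can be sketched as follows. This is a minimal Python illustration rather than the C code from the question: `recv_exact`, `send_and_wait`, and `demo` are hypothetical names, and `socket.socketpair()` stands in for a real client/server connection.

```python
import socket
import threading

TOTAL = 20 * 3000  # same volume as in the question


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, failing loudly if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:  # 0 bytes read: the peer closed, nothing more will come
            raise ConnectionError(f"peer closed with {remaining} bytes missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_and_wait(sock: socket.socket, payload: bytes) -> None:
    """Send everything, signal end of data, then wait for the peer to close."""
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)  # tell the peer nothing more will be sent
    while sock.recv(16):           # blocks; returns b"" once the peer closes
        pass
    sock.close()


def demo() -> int:
    """Run sender and receiver over a connected socket pair."""
    a, b = socket.socketpair()
    sender = threading.Thread(target=send_and_wait, args=(a, b"x" * TOTAL))
    sender.start()
    data = recv_exact(b, TOTAL)
    b.close()  # lets the sender's blocking recv return b""
    sender.join()
    return len(data)
```

With this shape, an early close on the sending side shows up as a `ConnectionError` in the receiver instead of an endless loop.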

Possible for a TCP socket to break so that it's still receiving but can no longer send?

I think it's likely a routing problem. The other comments above regarding unidirectional traffic only apply in the face of a shutdown(2) which presumably you would be aware of, since your application has to do that explicitly.

The routing could have been different in the two directions (as @RonMaupin noted). Or it could be that there was simply a large amount of congestion in one direction at some intermediate router. Either situation can result in packet drops.

In the face of dropped packets like this, the two sides will continue to retry their transmissions due to not receiving ACKs (which I think you've correctly described). The initial retransmission time is based on the approximate round-trip time calculated by each endpoint machine. Then there is an exponential backoff for subsequent retransmits -- see for example http://www.pcvr.nl/tcpip/tcp_time.htm#21_2 for explanation. The result is an eventual timeout.

Given the exponential backoff and some number of retransmits (that number is platform-specific and often configurable), it typically takes longer than 18 seconds before your local network stack declares a session dead. But it sounds like your application may have short-circuited this process with its own timeout (which seems reasonable for a game server).
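To get a feel for how the backoff adds up, here is a small arithmetic sketch. The function name `total_wait` and every number passed to it are illustrative assumptions, not the defaults of any particular platform; real stacks derive the initial timeout from the measured round-trip time and use their own retry limits and caps.

```python
def total_wait(initial_rto: float, timeouts: int, max_rto: float) -> float:
    """Sum `timeouts` consecutive retransmission waits, doubling each time up to max_rto."""
    total = 0.0
    rto = initial_rto
    for _ in range(timeouts):
        total += rto
        rto = min(rto * 2, max_rto)  # exponential backoff with an upper bound
    return total


# Assumed values: 1 s initial timeout, waits capped at 60 s.
print(total_wait(1.0, 5, 60.0))  # 1 + 2 + 4 + 8 + 16 = 31.0
print(total_wait(1.0, 8, 60.0))  # 1 + 2 + 4 + 8 + 16 + 32 + 60 + 60 = 183.0
```

Even a handful of doubled waits pushes the total well past a short application-level timeout.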

I suspect you've never seen this before because in general the route is the same in both directions, and when a router is "down", it's down in both directions.

C TCP cannot detect broken connection

If the peer of a TCP connection closes the connection, a recv call on your end will return 0. That's the way to detect closed (but not broken) connections.

If you don't currently receive anything from the peer, you need to make up a protocol on top of TCP which includes receiving data.

Furthermore, sending might not detect broken connections (like missing cables etc.) directly, as there are a lot of retransmissions and timeouts. The best way is again to implement some kind of protocol overlaying TCP, one that for example contains a kind of "are you there" message which expects a reply. If a reply to the "are you there" message isn't received within some specific timeout, then consider the connection broken and disconnect.
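A minimal sketch of such an "are you there" check, written in Python for brevity. The `PING`/`PONG` messages and the function names are hypothetical, and the sketch assumes the connection carries nothing but probes and replies; a real protocol would have to interleave them with ordinary traffic.

```python
import socket
import threading

PING = b"PING\n"  # hypothetical "are you there" message
PONG = b"PONG\n"  # hypothetical reply


def peer_is_alive(sock: socket.socket, timeout: float) -> bool:
    """Send one probe and report whether the expected reply arrived in time."""
    sock.settimeout(timeout)  # applies to each blocking call below
    try:
        sock.sendall(PING)
        reply = b""
        while not reply.endswith(b"\n"):
            chunk = sock.recv(64)
            if not chunk:      # orderly close by the peer
                return False
            reply += chunk
        return reply == PONG
    except OSError:            # includes the timeout raised by recv/sendall
        return False


def answer_one_probe(sock: socket.socket) -> None:
    """Peer side: reply to a single probe."""
    if sock.recv(64) == PING:
        sock.sendall(PONG)
```

When `peer_is_alive` returns `False`, the caller would treat the connection as broken and disconnect, without waiting for the stack's own retransmission timeout.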

How to parse broken messages received through TCP

There is no such thing in TCP as "message". It is a stream-oriented protocol. Of course, at lower levels it is transmitted in separate packets, but you have no way to control it and what you are seeing can be different from those packets. You just read as much as available in the receiving buffer at any particular moment. You may perceive your messages as broken down, but you may as well encounter a situation where several messages arrive as combined into one piece.

So when reading a message you should either use some sort of delimiter to figure out where your message ends, or use a header with the message length. If you are sending simple strings, encoding them as UTF-8 and terminating them with null bytes should work fine. For more complicated things you'll need a more complicated approach, obviously.
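The null-terminated variant can be sketched like this in Python. `NullDelimitedReader` is a hypothetical helper: you feed it whatever each read returned, and it hands back only the messages that are complete so far.

```python
from typing import List


class NullDelimitedReader:
    """Accumulates stream chunks and yields complete null-terminated UTF-8 messages."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[str]:
        self._buffer.extend(chunk)
        # Everything before the last delimiter is complete; the tail is kept for later.
        *complete, rest = bytes(self._buffer).split(b"\x00")
        self._buffer = bytearray(rest)
        return [message.decode("utf-8") for message in complete]


reader = NullDelimitedReader()
print(reader.feed(b"hel"))            # []  (message not finished yet)
print(reader.feed(b"lo\x00wor"))      # ['hello']
print(reader.feed(b"ld\x00a\x00"))    # ['world', 'a']
```

The same reader handles both cases described above: a message split across reads and several messages arriving in one read.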

About the issue that the character string is broken in TCP/IP communication between different machines

There is no line break added by transmission of the data. This line break is instead added by the server code:

            print(data.decode('utf-8')+"\n")

The print itself emits a line break, and then you add another one with "\n".

In general you are assuming that each send has a matching recv. This assumption is wrong. TCP is a byte stream and not a message stream: the payloads of multiple send calls might be merged together to reduce overhead, and a single "message" might also be split across several recv calls.

This is especially true when sending traffic between machines, since the bandwidth between the machines is lower than the local bandwidth and the MTU of the link layer is also much smaller.

Given that, you first have to collect your "messages" on the server side. Only after you've got a complete "message" (whatever this is in your case) should you call decode('utf-8'). Otherwise your code might crash when trying to decode a character that has a multi-byte UTF-8 encoding but whose bytes have not all been received yet.
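The decoding problem is easy to reproduce without a network. In this sketch the sample string and the split point are made up for illustration; the two chunks stand for what two consecutive recv calls might return.

```python
from typing import List

text = "naïve café"
encoded = text.encode("utf-8")
# Split inside the two-byte encoding of "ï", as a recv boundary might.
chunks = [encoded[:3], encoded[3:]]


def decode_per_chunk(parts: List[bytes]) -> str:
    """Decodes each chunk on arrival; fails if a character is split across chunks."""
    return "".join(part.decode("utf-8") for part in parts)


def decode_after_collecting(parts: List[bytes]) -> str:
    """Joins all chunks of the complete message first, then decodes once."""
    return b"".join(parts).decode("utf-8")


print(decode_after_collecting(chunks))  # naïve café

try:
    decode_per_chunk(chunks)
except UnicodeDecodeError as exc:
    print("per-chunk decoding failed:", exc.reason)
```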

TCP: Improving reliability with a broken connection

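If neither end explicitly disconnects, an idle TCP connection will stay open indefinitely even if you unplug the cable. TCP has no idle timeout unless keepalive is enabled; a failure is only noticed once one side sends data and its retransmissions eventually give up.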

However, I would use (or design) an application protocol on top of tcp, making it possible to resume data transmission after re-connects. You may use HTTP for example.

That would be much more robust: relying on buffers means, as you say, that they will be exhausted at some point, and the buffered data would also be lost on, let's say, a power outage.

java TCP socket message breaks

Try replacing your receive with a Scanner and let it do the work for you.

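// in your setup (needs java.util.Scanner and java.util.NoSuchElementException)
Scanner sc = new Scanner(_in).useDelimiter(Connection.DELIMETER);

public String receive() {
    try {
        // blocks until a complete token (up to the next delimiter or end of stream) is available
        return sc.next();
    } catch (NoSuchElementException e) {
        // Scanner.next() does not throw IOException; this means the stream has ended
        return "";
    }
}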

How do I account for messages being broken up when using sockets?

Yes, this is something you need to handle in your protocol.

The two most typical approaches here are:

Make your protocol line-oriented. Terminate every message with a newline, and don't treat a line as complete until you see that newline character. This, of course, depends on newlines not naturally appearing in messages.
Some protocols which use this approach include SMTP, IMAP, and IRC.
Include the length of the message in its header, so that you know how much data to read.
Some protocols which use this approach include HTTP (in the Content-Length header) and TLS, as well as many low-level protocols such as IP.

If you aren't sure which approach to take, the second one is considerably easier to implement, and doesn't place any restrictions on what data you use it with. A simple implementation might simply store the count of bytes as a packed integer, and could look like the following pseudocode:

send_data(dat):
    send(length of dat as packed integer)
    send(dat)

recv_data():
    size = recv(size of packed integer)
    return recv(size)

(This code assumes that the abstract send() and recv() methods will block until the entire message is sent or received. Your code will, of course, have to make this work appropriately.)
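A runnable Python version of the pseudocode might look like the sketch below. The 4-byte big-endian length header is an assumed format chosen for illustration, and `recv_exact` is a hypothetical helper that supplies the "block until everything has arrived" behaviour the pseudocode takes for granted.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed header: 4-byte unsigned length, network byte order


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Keep calling recv until exactly n bytes have arrived."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:  # peer closed before the full message arrived
            raise ConnectionError("connection closed mid-message")
        data.extend(chunk)
    return bytes(data)


def send_data(sock: socket.socket, data: bytes) -> None:
    """Send the length header followed by the payload."""
    sock.sendall(HEADER.pack(len(data)) + data)


def recv_data(sock: socket.socket) -> bytes:
    """Read the length header, then exactly that many payload bytes."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size)
```

Usage is symmetric: every `send_data(sock, payload)` on one side is matched by one `recv_data(sock)` on the other, regardless of how the stream was chunked in transit.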
