Python Socket Receive Large Amount of Data

TCP/IP is a stream-based protocol, not a message-based protocol. There's no guarantee that every send() call by one peer results in a single recv() call by the other peer receiving the exact data sent. The receiver might get the data piecemeal, split across multiple recv() calls, or get several sends coalesced into one, because TCP is free to segment and buffer the byte stream however it sees fit.

You need to define your own message-based protocol on top of TCP in order to differentiate message boundaries. Then, to read a message, you continue to call recv() until you've read an entire message or an error occurs.
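To make the stream behaviour concrete, here is a minimal sketch. It uses socket.socketpair() as a stand-in for a real TCP connection (an assumption, chosen so the example runs in a single process; a socket pair is also a byte stream). The helper name read_in_chunks and the 4-byte read size are illustrative only.

```python
import socket


def read_in_chunks(sock, chunk_size):
    """Collect whatever each recv() call returns until the peer closes."""
    chunks = []
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:              # b'' means the peer closed its side
            break
        chunks.append(chunk)
    return chunks


a, b = socket.socketpair()
a.sendall(b'hello')                # first "message"
a.sendall(b'world!')               # second "message"
a.close()                          # signals end of stream to the reader

chunks = read_in_chunks(b, 4)
b.close()

print(chunks)                      # chunk edges do not line up with the two sendall() calls
print(b''.join(chunks))            # the bytes and their order are preserved
```

The receiver gets the same bytes in the same order, but nothing in the chunks says where one send ended and the next began. That missing information is what a message protocol has to add.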

One simple way of sending a message is to prefix each message with its length. Then to read a message, you first read the length, then you read that many bytes. Here's how you might do that:

import struct

def send_msg(sock, msg):
    # Prefix each message with a 4-byte length (network byte order)
    msg = struct.pack('>I', len(msg)) + msg
    sock.sendall(msg)

def recv_msg(sock):
    # Read message length and unpack it into an integer
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    # Read the message data
    return recvall(sock, msglen)

def recvall(sock, n):
    # Helper function to recv n bytes or return None if EOF is hit
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data

Then you can use the send_msg and recv_msg functions to send and receive whole messages, and they won't have any problems with packets being split or coalesced on the network level.
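As a rough usage sketch, the block below repeats compact copies of the three functions so that it runs on its own, then pushes a few messages (including a 1 MB one and an empty one) through a local socket pair. The socketpair(), the sender thread, and the demo() name are assumptions made for illustration; a real program would use a connected TCP socket instead.

```python
import socket
import struct
import threading


def send_msg(sock, msg):
    sock.sendall(struct.pack('>I', len(msg)) + msg)


def recvall(sock, n):
    data = bytearray()
    while len(data) < n:
        packet = sock.recv(n - len(data))
        if not packet:
            return None
        data.extend(packet)
    return data


def recv_msg(sock):
    raw_msglen = recvall(sock, 4)
    if not raw_msglen:
        return None
    msglen = struct.unpack('>I', raw_msglen)[0]
    return recvall(sock, msglen)


def demo():
    left, right = socket.socketpair()
    messages = [b'short', b'x' * 1_000_000, b'']

    def sender():
        # Runs in its own thread so a large sendall() cannot deadlock the reader
        for m in messages:
            send_msg(left, m)
        left.close()

    t = threading.Thread(target=sender)
    t.start()
    received = []
    while True:
        msg = recv_msg(right)
        if msg is None:            # None means EOF; an empty message is b''
            break
        received.append(bytes(msg))
    t.join()
    right.close()
    return received


if __name__ == '__main__':
    print([len(m) for m in demo()])
```

Note the `is None` test in the receive loop: a zero-length message comes back as an empty bytearray, which is falsy, so `if not msg` would wrongly treat it as end of stream.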

Python socket receive - incoming packets always have a different size

Note: As people have pointed out in the comments, calling recv() with no parameters is not allowed in Python, and so this answer should be disregarded.

Original answer:


The network is always unpredictable. TCP makes a lot of this random behavior go away for you. One wonderful thing TCP does: it guarantees that the bytes will arrive in the same order. But! It does not guarantee that they will arrive chopped up in the same way. You simply cannot assume that every send() from one end of the connection will result in exactly one recv() on the far end with exactly the same number of bytes.

When you call socket.recv(x), you're saying 'give me at most x bytes from the socket'. On a blocking socket, which is the default, the call waits until some data is available and then returns whatever has arrived, which may be fewer than x bytes. This is called "blocking I/O": you block (wait) until there is something to return. So even if every message in your protocol were exactly 1024 bytes, a single socket.recv(1024) would not be guaranteed to return a whole message. You would have to keep calling recv() and accumulating the results until you had all 1024 bytes.

But what if your messages can be of different lengths? The first thing to understand is that the number you pass to recv() is only an upper bound, not a message size, and that it is required. A call like this:

data = self.request.recv(1024)

already returns whenever it gets new data, with anywhere from 1 to 1024 bytes of it (or an empty result once the peer has closed the connection).

But now you have a new problem: how do you know when the sender has sent you a complete message? The answer is: you don't. You're going to have to make the length of the message an explicit part of your protocol. Here's the best way: prefix every message with a length, either as a fixed-size integer (converted to network byte order using socket.htons() or socket.htonl(), or with struct.pack(), please!) or as a string followed by some delimiter (like '123:'). This second approach is often less efficient, but it's easier in Python.
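The two framing styles can be sketched as follows. The function names frame_binary and frame_text are made up for this illustration; struct.pack() with the '!' prefix produces network byte order directly, so no separate byte-swapping call is needed.

```python
import struct


def frame_binary(payload):
    # '!I' = network byte order (big-endian), unsigned 32-bit length prefix
    return struct.pack('!I', len(payload)) + payload


def frame_text(payload):
    # Decimal length, a colon as delimiter, then the payload itself
    return str(len(payload)).encode('ascii') + b':' + payload


print(frame_binary(b'hello'))   # b'\x00\x00\x00\x05hello'
print(frame_text(b'hello'))     # b'5:hello'
```

The binary prefix always costs 4 bytes and caps a message at 2**32 - 1 bytes. The text prefix has no fixed cap but has to be scanned for the delimiter on the receiving side.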

Once you've added that to your protocol, you need to change your code to handle recv() returning arbitrary amounts of data at any time. Here's an example of how to do this. I tried writing it as pseudo-code, or with comments to tell you what to do, but it wasn't very clear. So I've written it explicitly using the length prefix as a string of digits terminated by a colon. Here you go:

length = None
buffer = b""
while True:
    # recv() requires a buffer size; 4096 is an arbitrary choice
    data = self.request.recv(4096)
    if not data:
        break
    buffer += data
    while True:
        if length is None:
            if b':' not in buffer:
                break
            # remove the length bytes from the front of buffer
            # leave any remaining bytes in the buffer!
            length_str, ignored, buffer = buffer.partition(b':')
            length = int(length_str)

        if len(buffer) < length:
            break
        # split off the full message from the remaining bytes
        # leave any remaining bytes in the buffer!
        message = buffer[:length]
        buffer = buffer[length:]
        length = None
        # PROCESS MESSAGE HERE
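The same buffering logic can be packaged so it is testable without a socket. The sketch below is one possible arrangement, not part of the original answer: LengthPrefixParser and feed() are hypothetical names, and the parser is fed a byte string three bytes at a time to imitate recv() returning arbitrary pieces.

```python
class LengthPrefixParser:
    """Incrementally parses messages framed as b'<decimal length>:<payload>'."""

    def __init__(self):
        self.buffer = b''
        self.length = None      # length of the message currently being read

    def feed(self, data):
        """Add newly received bytes; return the messages they completed."""
        self.buffer += data
        messages = []
        while True:
            if self.length is None:
                if b':' not in self.buffer:
                    break       # length prefix not complete yet
                length_str, _, self.buffer = self.buffer.partition(b':')
                self.length = int(length_str)
            if len(self.buffer) < self.length:
                break           # payload not complete yet
            messages.append(self.buffer[:self.length])
            self.buffer = self.buffer[self.length:]
            self.length = None
        return messages


parser = LengthPrefixParser()
stream = b'5:hello11:hello world0:'
received = []
for i in range(0, len(stream), 3):      # deliver 3 bytes at a time
    received.extend(parser.feed(stream[i:i + 3]))
print(received)                         # [b'hello', b'hello world', b'']
```

In a socket handler you would call feed() with the result of each recv() and process whatever list comes back, which may be empty.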

What is the proper way of sending a large amount of data over sockets in Python?

I'm assuming that you have a particular reason for doing this with naked sockets, such as self-edification, which means that I won't answer by saying "You accidentally forgot to just use HTTP and Twisted", which perhaps you've heard before :-P. But really you should look at higher-level libraries at some point as they're a lot easier!

Define a protocol

If all you want is to send an image, then it can be simple:

  1. Client -> server: 8 bytes: big endian, length of image.
  2. Client -> server: length bytes: all image data.
  3. (Client <- server: 1 byte, value 0: acknowledges that the transmission was received. This step is optional; you may not care about it if you're using TCP and just assume that it's reliable.)

Code it

server.py

import os
from socket import *
from struct import unpack


def recv_exact(connection, n):
    # recv() may return fewer bytes than requested, so read in batches
    # of at most 4096 bytes until exactly n bytes have arrived
    data = b''
    while len(data) < n:
        chunk = connection.recv(min(4096, n - len(data)))
        if not chunk:
            raise ConnectionError('connection closed before all data arrived')
        data += chunk
    return data


class ServerProtocol:

    def __init__(self):
        self.socket = None
        self.output_dir = '.'
        self.file_num = 1

    def listen(self, server_ip, server_port):
        self.socket = socket(AF_INET, SOCK_STREAM)
        self.socket.bind((server_ip, server_port))
        self.socket.listen(1)

    def handle_images(self):

        try:
            while True:
                (connection, addr) = self.socket.accept()
                try:
                    # 8-byte big-endian length, then that many bytes of image
                    bs = recv_exact(connection, 8)
                    (length,) = unpack('>Q', bs)
                    data = recv_exact(connection, length)

                    # send our 0 ack
                    connection.sendall(b'\x00')
                finally:
                    connection.shutdown(SHUT_WR)
                    connection.close()

                # binary mode: the image data is bytes, not text
                with open(os.path.join(
                        self.output_dir, '%06d.jpg' % self.file_num), 'wb'
                ) as fp:
                    fp.write(data)

                self.file_num += 1
        finally:
            self.close()

    def close(self):
        self.socket.close()
        self.socket = None


if __name__ == '__main__':
    sp = ServerProtocol()
    sp.listen('127.0.0.1', 55555)
    sp.handle_images()

client.py

from socket import *
from struct import pack


class ClientProtocol:

    def __init__(self):
        self.socket = None

    def connect(self, server_ip, server_port):
        self.socket = socket(AF_INET, SOCK_STREAM)
        self.socket.connect((server_ip, server_port))

    def close(self):
        self.socket.shutdown(SHUT_WR)
        self.socket.close()
        self.socket = None

    def send_image(self, image_data):

        # use struct to make sure we have a consistent endianness on the length
        length = pack('>Q', len(image_data))

        # sendall to make sure it blocks if there's back-pressure on the socket
        self.socket.sendall(length)
        self.socket.sendall(image_data)

        ack = self.socket.recv(1)

        # could handle a bad ack here, but we'll assume it's fine.


if __name__ == '__main__':
    cp = ClientProtocol()

    image_data = None
    # binary mode: the image data is bytes, not text
    with open('IMG_0077.jpg', 'rb') as fp:
        image_data = fp.read()

    assert len(image_data)
    cp.connect('127.0.0.1', 55555)
    cp.send_image(image_data)
    cp.close()
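For data too large to hold comfortably in memory, the same 8-byte-length protocol can be streamed chunk by chunk instead of buffered whole. The following is a sketch under assumptions: send_file, recv_file, recv_exact, demo, and the 64 KiB chunk size are illustrative names and values, and the demo uses socket.socketpair() with in-memory files rather than a real client and server.

```python
import io
import socket
import struct
import threading

CHUNK_SIZE = 64 * 1024   # arbitrary; any reasonable size works


def send_file(sock, fp, length):
    """Send an 8-byte big-endian length, then the file contents in chunks."""
    sock.sendall(struct.pack('>Q', length))
    while True:
        chunk = fp.read(CHUNK_SIZE)
        if not chunk:
            break
        sock.sendall(chunk)


def recv_exact(sock, n):
    """Read exactly n bytes (used only for the small header)."""
    data = b''
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError('connection closed early')
        data += chunk
    return data


def recv_file(sock, fp):
    """Read the length header, then copy exactly that many bytes into fp."""
    (length,) = struct.unpack('>Q', recv_exact(sock, 8))
    remaining = length
    while remaining:
        chunk = sock.recv(min(CHUNK_SIZE, remaining))
        if not chunk:
            raise ConnectionError('connection closed early')
        fp.write(chunk)
        remaining -= len(chunk)
    return length


def demo(payload):
    client, server = socket.socketpair()
    sender = threading.Thread(
        target=send_file, args=(client, io.BytesIO(payload), len(payload)))
    sender.start()
    out = io.BytesIO()
    n = recv_file(server, out)
    sender.join()
    client.close()
    server.close()
    return n, out.getvalue()


if __name__ == '__main__':
    n, data = demo(b'\xff\xd8' + b'\x00' * 500_000)
    print(n, len(data))
```

With real files you would pass objects opened in 'rb' and 'wb' mode and take the length from os.path.getsize(); memory use then stays at roughly one chunk regardless of file size.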

Socket. How to receive all data with socket.recv()?

Your interpretation is wrong. Your code reads all the data that it gets from the server. It just doesn't know that it should stop listening for incoming data, because it doesn't know that the server has sent everything it had.

First of all, note that these lines

if part <buff_size:
    break;

are very wrong. To begin with, you are comparing a string to an int (in Python 3.x that raises an exception). But even if you meant if len(part) < buff_size: it would still be wrong, because there might be a lag in the middle of streaming, in which case recv() returns a piece smaller than buff_size and your code stops there.

Also, if your server sends content whose size is a multiple of buff_size, the if condition will never be satisfied and the code will hang on .recv() forever.

Side note: don't use semicolons ;. It's Python.


There are several solutions to your problem, but none of them can be used correctly without modifying the server side.

As a client you have to know when to stop reading. The only way to know that is for the server to do something special that the client understands. This is called a communication protocol: you have to give meaning to the data you send and receive.

For example, if you use HTTP, the server sends the header Content-Length: 12345 before the body, so as a client you know that you need to read exactly 12345 bytes of body (your buffer doesn't have to be that big, but with that information you know how long to keep looping before you have read it all).
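A rough sketch of the Content-Length idea follows. It is not a real HTTP client: it assumes a response with a Content-Length header and ignores chunked transfer encoding, redirects, and everything else. The function name read_response and the 4096-byte read size are assumptions for illustration.

```python
import socket


def read_response(sock):
    """Read headers up to the blank line, then exactly Content-Length body bytes."""
    buffer = b''
    while b'\r\n\r\n' not in buffer:
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError('connection closed before headers ended')
        buffer += chunk

    # Anything after the blank line already belongs to the body
    head, _, body = buffer.partition(b'\r\n\r\n')

    length = None
    for line in head.split(b'\r\n')[1:]:        # skip the status line
        name, _, value = line.partition(b':')
        if name.strip().lower() == b'content-length':
            length = int(value.strip())
    if length is None:
        raise ValueError('no Content-Length header')

    while len(body) < length:
        chunk = sock.recv(min(4096, length - len(body)))
        if not chunk:
            raise ConnectionError('connection closed before body ended')
        body += chunk
    return body[:length]


if __name__ == '__main__':
    a, b = socket.socketpair()
    a.sendall(b'HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello')
    print(read_response(b))
    a.close()
    b.close()
```

The client stops reading after the announced number of bytes even though the connection stays open, which is exactly what the length information makes possible.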

Some binary protocols send the size of the content in the first 2 or 4 bytes, for example. This can be easily interpreted on the client side as well.

An easier solution is this: simply make the server close the connection after it sends all the data. Then you only need to add the check if not part: break to your code.
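The close-to-signal-end approach might look like the sketch below. The names recv_until_closed and buff_size are illustrative, and a local socket pair stands in for the real server connection. The payload is deliberately an exact multiple of buff_size, the case that makes a length comparison hang.

```python
import socket


def recv_until_closed(sock, buff_size=4096):
    """Read until recv() returns b'', which means the peer closed the connection."""
    parts = []
    while True:
        part = sock.recv(buff_size)
        if not part:
            break
        parts.append(part)
    return b''.join(parts)


if __name__ == '__main__':
    server, client = socket.socketpair()
    server.sendall(b'a' * 64)      # 64 is an exact multiple of the 16-byte buffer below
    server.close()                 # closing is the "I'm done" signal
    print(len(recv_until_closed(client, buff_size=16)))
    client.close()
```

The cost of this design is one connection per transfer, since the connection cannot be reused once it has been closed.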

how to receive large data from a broadcast server using python sockets

The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested. If you want to make sure you receive exactly 16384000 bytes of data, you have to implement it yourself. TCP doesn't preserve message boundaries. So you actually want something like this:

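def recvall(sock, length):
    data = b''
    while length:
        recved = sock.recv(length)
        if not recved:
            # peer closed the connection before sending everything
            raise EOFError('socket closed with %d bytes still expected' % length)
        data += recved
        length -= len(recved)
    return data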

Please note that this code hasn't been tested; it is also not an option for nonblocking sockets.
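For a payload in the range of many megabytes, repeatedly concatenating bytes objects copies the data over and over. One alternative, sketched here under assumptions, is to preallocate the buffer and fill it with socket.recv_into(). The names recvall_into and demo are hypothetical, and like the function above this is meant for blocking sockets only.

```python
import socket
import threading


def recvall_into(sock, length):
    """Receive exactly `length` bytes into one preallocated buffer."""
    buf = bytearray(length)
    view = memoryview(buf)          # slicing a memoryview does not copy
    received = 0
    while received < length:
        n = sock.recv_into(view[received:], length - received)
        if n == 0:
            raise ConnectionError(
                'socket closed with %d bytes still expected' % (length - received))
        received += n
    return buf


def demo(payload):
    a, b = socket.socketpair()
    # Send from a thread so a large sendall() cannot block the reader
    t = threading.Thread(target=a.sendall, args=(payload,))
    t.start()
    data = recvall_into(b, len(payload))
    t.join()
    a.close()
    b.close()
    return bytes(data)


if __name__ == '__main__':
    print(len(demo(b'\x01' * 1_000_000)))
```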

Python socket sends faster than receiver

TCP is a stream-oriented protocol and does not deliver messages one by one. An easy way to split the messages is to end each message with a delimiter string such as \r\n.

Example:

Client:

#!/usr/bin/env python

import socket

TCP_IP = '127.0.0.1'
TCP_PORT = 13005

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
# sendall() keeps sending until every byte is written; sockets carry bytes, not str
s.sendall(b'Message one\r\n')
s.sendall(b'Message two\r\n')
s.close()

Server:

#!/usr/bin/env python

import socket

TCP_IP = '127.0.0.1'
TCP_PORT = 13005
BUFFER_SIZE = 20  # Normally 1024, but we want to test with a small buffer

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((TCP_IP, TCP_PORT))
s.listen(1)

conn, addr = s.accept()
data = b''
while True:
    chunk = conn.recv(BUFFER_SIZE)
    if not chunk:
        break  # client closed the connection
    data += chunk
    # print every complete line; keep any trailing partial line in data
    while b'\r\n' in data:
        line, _, data = data.partition(b'\r\n')
        print(line.decode())
conn.close()
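For line-delimited messages, another option is to let Python do the buffering by wrapping the socket with socket.makefile() and iterating over it. This is a sketch under assumptions: read_lines is a made-up name, a socket pair stands in for the accepted connection, and note that a binary file object splits on b'\n', so a bare line feed would also end a line.

```python
import socket


def read_lines(conn):
    """Return every line received until the peer closes the connection."""
    lines = []
    with conn.makefile('rb') as reader:
        for raw in reader:                       # one iteration per b'\n'-terminated line
            lines.append(raw.rstrip(b'\r\n').decode())
    return lines


if __name__ == '__main__':
    a, b = socket.socketpair()
    a.sendall(b'Message one\r\nMess')            # second message arrives in two pieces
    a.sendall(b'age two\r\n')
    a.close()
    print(read_lines(b))                         # ['Message one', 'Message two']
    b.close()
```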

If your message is complicated and long, you can see: Python Socket Receive Large Amount of Data


