How to (De)Construct Data Frames in Websockets Hybi 08+


(See also: How can I send and receive WebSocket messages on the server side?)


It's fairly easy, but it's important to understand the format.

The first byte is almost always 1000 0001, where the 1 means "last frame", the three 0s are reserved bits without any meaning so far and the 0001 means that it's a text frame (which Chrome sends with the ws.send() method).

(Update: Chrome can now also send binary frames with an ArrayBuffer. The last four bits of the first byte will then be 0010 (opcode 2), so you can distinguish between text and binary data. The decoding of the data works exactly the same way.)
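As a rough sketch of how the first byte can be picked apart, the snippet below splits it into the "last frame" bit, the reserved bits and the opcode. The function name `parseFirstByte` and the constant names are made up for illustration; they are not part of any library.

```javascript
// Split the first byte of a frame into its fields (hypothetical helper).
function parseFirstByte(firstByte) {
  return {
    fin: (firstByte & 0x80) !== 0, // highest bit: 1 means "last frame"
    rsv: (firstByte >> 4) & 0x07,  // the three reserved bits, normally 0
    opcode: firstByte & 0x0f       // lowest four bits: the frame type
  };
}

const OPCODE_TEXT = 0x1;   // 0001
const OPCODE_BINARY = 0x2; // 0010

const info = parseFirstByte(0x81); // 1000 0001
console.log(info.fin, info.opcode === OPCODE_TEXT); // true true
```

Checking `opcode` against `OPCODE_BINARY` is then enough to tell a binary frame from a text frame.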

The second byte consists of a 1 (meaning that the data is "masked", i.e. encoded) followed by seven bits which represent the frame size. If those seven bits are between 000 0000 and 111 1101, that's the size. If they are 111 1110, the following 2 bytes are the length (because it wouldn't fit in seven bits), and if they are 111 1111, the following 8 bytes are the length (because it wouldn't fit in two bytes either).

Following that are four bytes which are the "masks" you need to decode the frame data. This is done using xor encoding, where the mask for each byte is picked by indexOfByteInData mod 4. Decoding simply works like encodedByte xor maskByte (where maskByte is masks[indexOfByteInData mod 4]).

Now I must say I'm not experienced with C# at all, but this is some pseudocode (some JavaScript accent I'm afraid):

var length_code = bytes[1] & 127, // remove the first 1 by doing '& 127'
    masks,
    data;

if (length_code === 126) {
    masks = bytes.slice(4, 8);    // 'slice' returns part of the byte array
    data = bytes.slice(8);        // and accepts 'start' (inclusively)
} else if (length_code === 127) { // and 'end' (exclusively) as arguments
    masks = bytes.slice(10, 14);  // Passing no 'end' makes 'end' the length
    data = bytes.slice(14);       // of the array
} else {
    masks = bytes.slice(2, 6);
    data = bytes.slice(6);
}

// 'map' replaces each element in the array as per a specified function
// (each element will be replaced with what is returned by the function)
// The passed function accepts the value and index of the element as its
// arguments
var decoded = data.map(function(byte, index) { // index === 0 for the first byte
    return byte ^ masks[ index % 4 ];          // of 'data', not of 'bytes'
    //          xor            mod
});

You can also download the specification which can be helpful (it of course contains everything you need to understand the format).

PHP Websocket Server hybi10

I just completed a class which makes the PHP WebSocket server by Nico Kaiser (https://github.com/nicokaiser/php-websocket) capable of handling hybi-10 frames and the hybi-10 handshake. You can download the new class here: http://lemmingzshadow.net/386/php-websocket-serverclient-nach-draft-hybi-10/ (Connection.php)

WebSocket Server using latest protocol (hybi 10)

So I solved my particular issue with the handshake, and it was quite noobish: I needed two sets of "\r\n" to close out the handshake. So to fix the handshake issue I described above (the JavaScript WebSocket not going to the OPEN state) I needed to make the following change to my server-side PHP (note the \r\n\r\n at the end, doh):

function dohandshake($user,$buffer){
    // getheaders and calcKey are confirmed working, can provide source if desired
    list($resource,$host,$origin,$key,$version) = $this->getheaders($buffer);
    $request = "HTTP/1.1 101 Switching Protocols\r\n" .
               "Upgrade: WebSocket\r\n" .
               "Connection: Upgrade\r\n" .
               "Sec-WebSocket-Accept: " . $this->calcKey($key) . "\r\n\r\n";
    socket_write($user->socket,$request);
    $user->handshake=true;
    return true;
}

Also for future PHP-WebSocket enthusiasts I just use regular expressions to parse the header in getheaders and this is calcKey:

function calcKey($key){
    $CRAZY = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
    $sha = sha1($key.$CRAZY,true);
    return base64_encode($sha);
}
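For comparison, here is a minimal sketch of the same key calculation in JavaScript, assuming a Node.js runtime with its built-in `crypto` module. The names `calcAcceptKey` and `buildHandshakeResponse` are made up for this example. The sample key in the usage line is the one from the RFC 6455 handshake example.

```javascript
const crypto = require('crypto'); // Node.js built-in module (assumed runtime)

// Fixed GUID that the protocol appends to the client's key
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// SHA-1 over key + GUID, then base64 of the raw digest (hypothetical helper)
function calcAcceptKey(key) {
  return crypto.createHash('sha1').update(key + WS_GUID).digest('base64');
}

// Build the response headers; note the empty line (\r\n\r\n) at the end
function buildHandshakeResponse(key) {
  return 'HTTP/1.1 101 Switching Protocols\r\n' +
         'Upgrade: websocket\r\n' +
         'Connection: Upgrade\r\n' +
         'Sec-WebSocket-Accept: ' + calcAcceptKey(key) + '\r\n\r\n';
}

console.log(calcAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
```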

Hope this helps someone else! Now to work on the message framing...

Data encoding in new protocol handshake hybi-10

The way data is framed in HyBi (HyBi-00 is really Hixie-76) is significantly changed. The new frame format is described in this diagram.

Also, for data that is sent from the client to the server, the data is masked. The mask key is the 4 bytes that come right after the length field, just before the payload, and the payload is decoded (and encoded, actually) in place using this simple algorithm, where i counts from the first payload byte:

data[i] = data[i] XOR mask[i MOD 4]

The mask key is different with every frame which is why you are getting a different payload every time even though you are sending the same data.
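To make the "decoded and encoded" point concrete, here is a small sketch showing that the same XOR loop both masks and unmasks. The function name `applyMask` is made up for illustration; the byte values are the masked "Hello" example from RFC 6455.

```javascript
// XOR every payload byte with mask[i % 4]; running it twice restores the input
function applyMask(data, mask) {
  const out = new Uint8Array(data.length);
  for (let i = 0; i < data.length; i++) {
    out[i] = data[i] ^ mask[i % 4];
  }
  return out;
}

const maskKey = [0x37, 0xfa, 0x21, 0x3d];
const plain = [0x48, 0x65, 0x6c, 0x6c, 0x6f]; // "Hello"

const masked = applyMask(plain, maskKey);     // 7f 9f 4d 51 58
const unmasked = applyMask(masked, maskKey);  // back to 48 65 6c 6c 6f
console.log(String.fromCharCode(...unmasked)); // prints "Hello"
```

With a different mask key the masked bytes come out different, which is why the same message never looks the same twice on the wire.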

If the client isn't receiving data that you sent, chances are that you aren't framing the data right. Also note that Chrome 14 and Firefox 6/7 do not yet support binary data so the opcode needs to be 1 to denote a text (UTF-8) frame.

How can I send and receive WebSocket messages on the server side?

Note: This is some explanation and pseudocode as to how to implement a very trivial server that can handle incoming and outgoing WebSocket messages as per the definitive framing format. It does not include the handshaking process. Furthermore, this answer has been made for educational purposes; it is not a full-featured implementation.

Specification (RFC 6455)



Sending messages

(In other words, server → browser)

The frames you're sending need to be formatted according to the WebSocket framing format. For sending messages, this format is as follows:

  • one byte which contains the type of data (and some additional info which is out of scope for a trivial server)
  • one byte which contains the length
  • either two or eight bytes if the length does not fit in the second byte (the second byte is then a code saying how many bytes are used for the length)
  • the actual (raw) data

The first byte will be 1000 0001 (or 129) for a text frame.

The second byte has its first bit set to 0 because we're not encoding the data (RFC 6455 says a server must not mask the frames it sends to the client).

It is necessary to determine the length of the raw data so as to send the length bytes correctly:

  • if 0 <= length <= 125, you don't need additional bytes
  • if 126 <= length <= 65535, you need two additional bytes and the second byte is 126
  • if length >= 65536, you need eight additional bytes, and the second byte is 127

The length has to be sliced into separate bytes, which means you'll need to bit-shift to the right (with an amount of eight bits), and then only retain the last eight bits by doing AND 1111 1111 (which is 255).

After the length byte(s) comes the raw data.

This leads to the following pseudocode:

bytesFormatted[0] = 129

indexStartRawData = -1 // it doesn't matter what value is
                       // set here - it will be set now:

if bytesRaw.length <= 125
    bytesFormatted[1] = bytesRaw.length

    indexStartRawData = 2

else if bytesRaw.length >= 126 and bytesRaw.length <= 65535
    bytesFormatted[1] = 126
    bytesFormatted[2] = ( bytesRaw.length >> 8 ) AND 255
    bytesFormatted[3] = ( bytesRaw.length      ) AND 255

    indexStartRawData = 4

else
    bytesFormatted[1] = 127
    bytesFormatted[2] = ( bytesRaw.length >> 56 ) AND 255
    bytesFormatted[3] = ( bytesRaw.length >> 48 ) AND 255
    bytesFormatted[4] = ( bytesRaw.length >> 40 ) AND 255
    bytesFormatted[5] = ( bytesRaw.length >> 32 ) AND 255
    bytesFormatted[6] = ( bytesRaw.length >> 24 ) AND 255
    bytesFormatted[7] = ( bytesRaw.length >> 16 ) AND 255
    bytesFormatted[8] = ( bytesRaw.length >>  8 ) AND 255
    bytesFormatted[9] = ( bytesRaw.length       ) AND 255

    indexStartRawData = 10

// put raw data at the correct index
bytesFormatted.put(bytesRaw, indexStartRawData)

// now send bytesFormatted (e.g. write it to the socket stream)
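One way the pseudocode above could look in real JavaScript is sketched below. The function name `encodeTextFrame` is made up, and the sketch assumes a runtime with `TextEncoder`, `Uint8Array` and `DataView` (modern browsers and Node.js). Because JavaScript's `>>` only works on 32-bit integers, the eight-byte length is written as two 32-bit halves instead of shifting by 56, 48, and so on.

```javascript
// Build an unmasked text frame (server -> browser) around a string
function encodeTextFrame(text) {
  const bytesRaw = new TextEncoder().encode(text); // UTF-8 bytes
  const length = bytesRaw.length;

  // 2, 4 or 10 header bytes depending on how large the length is
  let indexStartRawData;
  if (length <= 125) indexStartRawData = 2;
  else if (length <= 65535) indexStartRawData = 4;
  else indexStartRawData = 10;

  const bytesFormatted = new Uint8Array(indexStartRawData + length);
  const view = new DataView(bytesFormatted.buffer);

  bytesFormatted[0] = 129; // 1000 0001: last frame, text
  if (indexStartRawData === 2) {
    bytesFormatted[1] = length;
  } else if (indexStartRawData === 4) {
    bytesFormatted[1] = 126;
    view.setUint16(2, length); // DataView writes big-endian by default
  } else {
    bytesFormatted[1] = 127;
    view.setUint32(2, Math.floor(length / 4294967296)); // high 32 bits
    view.setUint32(6, length % 4294967296);             // low 32 bits
  }

  bytesFormatted.set(bytesRaw, indexStartRawData); // raw data after the header
  return bytesFormatted;
}

console.log(Array.from(encodeTextFrame('Hello')));
// prints [ 129, 5, 72, 101, 108, 108, 111 ]
```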


Receiving messages

(In other words, browser → server)

The frames you obtain are in the following format:

  • one byte which contains the type of data
  • one byte which contains the length
  • either two or eight additional bytes if the length did not fit in the second byte
  • four bytes which are the masks (= decoding keys)
  • the actual data

The first byte usually does not matter - if you're just sending text you are only using the text type. It will be 1000 0001 (or 129) in that case.

The second byte and the additional two or eight bytes need some parsing, because you need to know how many bytes are used for the length (you need to know where the real data starts). In this trivial example the length itself is not used, because it assumes the buffer holds exactly one complete frame; a real server needs the length to know where the frame ends.

The first bit of the second byte is always 1 which means the data is masked (= encoded). Messages from the client to the server are always masked. You need to remove that first bit by doing secondByte AND 0111 1111. There are two cases in which the resulting byte does not represent the length because it did not fit in the second byte:

  • a second byte of 0111 1110, or 126, means the following two bytes are used for the length
  • a second byte of 0111 1111, or 127, means the following eight bytes are used for the length

The four mask bytes are used for decoding the actual data that has been sent. The algorithm for decoding is as follows:

decodedByte = encodedByte XOR masks[encodedByteIndex MOD 4]

where encodedByte is the original byte in the data, and encodedByteIndex is the index (offset) of the byte counting from the first byte of the real data, which has index 0. masks is an array containing the four mask bytes.

This leads to the following pseudocode for decoding:

secondByte = bytes[1]

length = secondByte AND 127 // may not be the actual length in the two special cases

indexFirstMask = 2          // if not a special case

if length == 126            // if a special case, change indexFirstMask
    indexFirstMask = 4

else if length == 127       // ditto
    indexFirstMask = 10

masks = bytes.slice(indexFirstMask, indexFirstMask + 4) // four bytes starting from indexFirstMask

indexFirstDataByte = indexFirstMask + 4 // four bytes further

decoded = new array

decoded.length = bytes.length - indexFirstDataByte // length of real data

for i = indexFirstDataByte, j = 0; i < bytes.length; i++, j++
    decoded[j] = bytes[i] XOR masks[j MOD 4]

// now use "decoded" to interpret the received data
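A possible JavaScript version of the decoding pseudocode is sketched below. The name `decodeFrame` is made up for this example. Unlike the pseudocode, it reads the declared payload length and decodes only that many bytes, so trailing bytes in the buffer are ignored; it still assumes the buffer starts with one complete, masked frame. The sample bytes are the masked "Hello" frame from RFC 6455.

```javascript
// Decode the payload of a single masked frame (browser -> server)
function decodeFrame(bytes) {
  if ((bytes[1] & 128) === 0) {
    throw new Error('expected a masked frame');
  }

  let length = bytes[1] & 127; // may only be a code for the real length
  let indexFirstMask = 2;

  if (length === 126) {
    length = (bytes[2] << 8) | bytes[3]; // two length bytes
    indexFirstMask = 4;
  } else if (length === 127) {
    length = 0;                          // eight length bytes
    for (let i = 2; i < 10; i++) {
      length = length * 256 + bytes[i];  // multiply, since << is 32-bit only
    }
    indexFirstMask = 10;
  }

  const masks = bytes.slice(indexFirstMask, indexFirstMask + 4);
  const indexFirstDataByte = indexFirstMask + 4;

  const decoded = new Uint8Array(length);
  for (let j = 0; j < length; j++) {
    decoded[j] = bytes[indexFirstDataByte + j] ^ masks[j % 4];
  }
  return decoded;
}

const sample = new Uint8Array(
  [0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(String.fromCharCode(...decodeFrame(sample))); // prints "Hello"
```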

WebSocket Server in Java (hybi 10) sending and receiving

The problem is that the length does not always fit in 7 bits (you can only express the numbers 0 to 127 with 7 bits), and in that case either the following 2 or 8 bytes will be used to make the length fit:

  • 126 means the following 2 bytes are used for the length
  • 127 means the following 8 bytes are used for the length

So the payload starts at either index 2, 4 or 10, if not encoded. When encoded, it starts at either 6, 8 or 14 (because there are 4 mask bytes).

I previously posted some pseudocode about decoding the payload data.


To actually get the length as a "real number" (instead of separate bytes), you can use bitwise shift operators as follows (in case there are two bytes for the length):

var length = (bytes[2] << 8) | (bytes[3] << 0);

This will calculate it like this:

Suppose:

  • bytes[2] is 01101001 (105 in base 10)
  • bytes[3] is 10100101 (165 in base 10)

Then << will be doing:

01101001 00000000   // moved 8 places to the left, filled with zeroes
         10100101   // moved 0 places (nothing really happening, you can eliminate '<< 0')

| is basically adding them:

01101001 00000000
         10100101
----------------- |
01101001 10100101   (in base 10 that's 27045)

So if you have the bytes 105 and 165, then they represent a length of 27045.

WebSocket, decoding data frame (c++)

Your data looks OK: the first byte is -127 as a signed decimal, which is 0x81 or 1000 0001.

When reading with GetBit you use 4 bits per byte instead of 8.

biIdx currently starts at the rightmost bit and goes to the leftmost bit. This should be the other way around:

int GetBit(const char * data, unsigned int idx)
{
    unsigned int arIdx = idx / 8;
    unsigned int biIdx = idx % 8;
    return (data[arIdx] >> (7 - biIdx)) & 1;
}

That should get you the correct bits.

For the MASK, you should read bit 8 (counting from 0, so the first bit of the second byte). As specified: https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers
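For readers following along in JavaScript, the same most-significant-bit-first indexing could be sketched like this. The function name `getBit` mirrors the C++ one above, and the two sample bytes are just the start of a masked text frame.

```javascript
// Read bit number idx, counting from the leftmost bit of the first byte
function getBit(data, idx) {
  const arIdx = Math.floor(idx / 8); // which byte
  const biIdx = idx % 8;             // which bit inside that byte, 0 = leftmost
  return (data[arIdx] >> (7 - biIdx)) & 1;
}

const header = [0x81, 0x85]; // 1000 0001  1000 0101
console.log(getBit(header, 0)); // FIN bit, prints 1
console.log(getBit(header, 8)); // MASK bit, prints 1
```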

WebSocketServers in C# and JS - Gibberish in return

WebSocket data is framed, so you have to read frame by frame and extract the data from it.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
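Reading "frame by frame" could look roughly like the sketch below, which walks a buffer that may contain several frames back to back and cuts it at the frame boundaries given by the header fields in the diagram. The name `splitFrames` and the `{ frames, rest }` return shape are assumptions made for this example; unmasking and opcode handling are left out.

```javascript
// Cut a received buffer into complete frames; leftover bytes go into 'rest'
function splitFrames(buffer) {
  const frames = [];
  let offset = 0;

  while (offset + 2 <= buffer.length) {
    const masked = (buffer[offset + 1] & 0x80) !== 0; // MASK bit
    let payloadLen = buffer[offset + 1] & 0x7f;       // 7-bit payload len
    let headerLen = 2;

    if (payloadLen === 126) {          // 16-bit extended payload length
      if (offset + 4 > buffer.length) break;
      payloadLen = (buffer[offset + 2] << 8) | buffer[offset + 3];
      headerLen = 4;
    } else if (payloadLen === 127) {   // 64-bit extended payload length
      if (offset + 10 > buffer.length) break;
      payloadLen = 0;
      for (let i = 2; i < 10; i++) {
        payloadLen = payloadLen * 256 + buffer[offset + i];
      }
      headerLen = 10;
    }
    if (masked) headerLen += 4;        // masking key follows the length

    const end = offset + headerLen + payloadLen;
    if (end > buffer.length) break;    // frame not fully received yet
    frames.push(buffer.slice(offset, end));
    offset = end;
  }

  return { frames: frames, rest: buffer.slice(offset) };
}
```

The `rest` bytes would be kept and prepended to whatever arrives next on the socket, since a frame can be split across reads.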

Writing a WebSocket server in C#

Writing WebSocket servers


