An Efficient Way to Base64 Encode a Byte Array

An efficient way to Base64 encode a byte array?

The following code Base64-encodes directly into a byte array. In testing it ran within about ±10% of the speed of the .NET implementation while allocating roughly half the memory:

    // Requires: using System; using System.Linq;

    // Standard Base64 alphabet (RFC 4648) as ASCII bytes. The listing references this
    // table without showing it, so this definition is an assumption.
    static readonly byte[] base64EncodingTable = System.Text.Encoding.ASCII.GetBytes(
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/");

    static public void testBase64EncodeToBuffer()
    {
        for (int i = 1; i < 200; ++i)
        {
            // prep test data
            byte[] testData = new byte[i];
            for (int j = 0; j < i; ++j)
                testData[j] = (byte)(j ^ i);

            // test
            testBase64(testData);
        }
    }

    static void testBase64(byte[] data)
    {
        // compare against the framework encoder
        byte[] expected = System.Text.Encoding.ASCII.GetBytes(Convert.ToBase64String(data));
        if (!appendBase64(data, 0, data.Length, false).SequenceEqual(expected))
            throw new Exception("Base 64 encoding failed");
    }

    static public byte[] appendBase64(byte[] data
        , int offset
        , int size
        , bool addLineBreaks = false)
    {
        int bufferPos = 0;
        int requiredSize = (4 * ((size + 2) / 3));
        // 2 line break characters (CR LF) between 76-character lines, none after the last line
        if (addLineBreaks && requiredSize > 0) requiredSize += ((requiredSize - 1) / 76) * 2;

        byte[] buffer = new byte[requiredSize];

        UInt32 octet_a;
        UInt32 octet_b;
        UInt32 octet_c;
        UInt32 triple;
        int lineCount = 0;
        int end = offset + size;        // one past the last input byte
        int sizeMod = end - (size % 3); // end of the last complete triplet
        // adding all data triplets
        while (offset < sizeMod)
        {
            // start a new line once 19 groups (76 characters) have been written
            if (addLineBreaks && lineCount == 19)
            {
                buffer[bufferPos++] = 13;
                buffer[bufferPos++] = 10;
                lineCount = 0;
            }

            octet_a = data[offset++];
            octet_b = data[offset++];
            octet_c = data[offset++];

            triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

            buffer[bufferPos++] = base64EncodingTable[(triple >> 3 * 6) & 0x3F];
            buffer[bufferPos++] = base64EncodingTable[(triple >> 2 * 6) & 0x3F];
            buffer[bufferPos++] = base64EncodingTable[(triple >> 1 * 6) & 0x3F];
            buffer[bufferPos++] = base64EncodingTable[(triple >> 0 * 6) & 0x3F];
            ++lineCount;
        }

        // last bytes (1 or 2 left over)
        if (offset < end)
        {
            if (addLineBreaks && lineCount == 19)
            {
                buffer[bufferPos++] = 13;
                buffer[bufferPos++] = 10;
            }

            int remaining = size % 3;
            octet_a = data[offset++];
            octet_b = offset < end ? data[offset++] : (UInt32)0;

            triple = (octet_a << 0x10) + (octet_b << 0x08);

            buffer[bufferPos++] = base64EncodingTable[(triple >> 3 * 6) & 0x3F];
            buffer[bufferPos++] = base64EncodingTable[(triple >> 2 * 6) & 0x3F];
            buffer[bufferPos++] = base64EncodingTable[(triple >> 1 * 6) & 0x3F];
            buffer[bufferPos++] = (byte)'='; // last character is definitely padded

            // with a single leftover byte the third character is padding too
            if (remaining == 1) buffer[bufferPos - 2] = (byte)'=';
        }
        return buffer;
    }
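
For readers who want to experiment with the bit-packing outside .NET, here is a minimal Python sketch of the same idea: pack up to three bytes into a 24-bit value, emit four 6-bit indices, and overwrite the filler positions with '='. The names `ALPHABET` and `encode_base64` are made up for this illustration, and the sketch omits line breaks; it is checked against Python's own `base64.b64encode` rather than against the C# code.

```python
import base64

# Standard Base64 alphabet (RFC 4648)
ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def encode_base64(data: bytes) -> bytes:
    """Illustrative encoder (hypothetical helper, no line breaks)."""
    out = bytearray()
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Pack up to three bytes into a 24-bit value, zero-filling missing bytes
        triple = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        # Emit four 6-bit groups, most significant first
        for shift in (18, 12, 6, 0):
            out.append(ALPHABET[(triple >> shift) & 0x3F])
        # Characters that carry only filler bits become '='
        pad = 3 - len(chunk)
        if pad:
            out[-pad:] = b"=" * pad
    return bytes(out)


if __name__ == "__main__":
    sample = b"test string"
    print(encode_base64(sample) == base64.b64encode(sample))
```

The output length is always 4 * ((n + 2) / 3) characters using integer division, which is the same `requiredSize` formula the C# code uses before line breaks are considered.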

What is the advantage of converting to Base64 encoding?

You generally convert to Base64 when you need to pass binary data (where each byte can take any value from 0 to 255) over a text-based protocol or format, for example JSON, XML, or email.
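
As a small illustration of that use case, the following Python sketch puts arbitrary bytes into a JSON document and recovers them. The field name `data` and the sample bytes are invented for the example.

```python
import base64
import json

# Arbitrary binary data; these bytes are not valid UTF-8 text
payload = bytes([0, 255, 16, 128])

# Base64 turns the bytes into plain ASCII that JSON can carry as a string
message = json.dumps({"data": base64.b64encode(payload).decode("ascii")})

# The receiver reverses the two steps
restored = base64.b64decode(json.loads(message)["data"])
```

The cost is size: every 3 input bytes become 4 output characters.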

Why does Base64.decode produce the same byte array for different strings?

The behavior you are seeing is caused by the fact that the number of bytes in the "result" (11 bytes) doesn't completely "fill" the last character of the Base64-encoded string.

Remember that Base64 splits the stream of 8-bit bytes into 6-bit groups, one character per group. The resulting string therefore needs 11 * 8 / 6 = 14 2/3 characters, but you can't write a partial character, so it is written as 15. Only the first 4 bits (2/3) of the last character are significant; its last two bits are not decoded. As a result, the following strings differ only in those two ignored bits:

dGVzdCBzdHJpbmo
dGVzdCBzdHJpbmp
dGVzdCBzdHJpbmq
dGVzdCBzdHJpbmr

All four decode to the same 11 bytes (116, 101, 115, 116, 32, 115, 116, 114, 105, 110, 106).
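
This can be checked with a short Python sketch. It assumes Python's standard `base64` module in its default, lenient mode, and it appends the one '=' needed to bring each 15-character string up to a multiple of 4; the variable names are made up for the example.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

variants = [
    "dGVzdCBzdHJpbmo",
    "dGVzdCBzdHJpbmp",
    "dGVzdCBzdHJpbmq",
    "dGVzdCBzdHJpbmr",
]

# Decode each variant after adding the missing padding character
decoded = [base64.b64decode(v + "=") for v in variants]

# 6-bit value of the last character of each variant, as a bit string
last_bits = [format(ALPHABET.index(v[-1]), "06b") for v in variants]

if __name__ == "__main__":
    for v, bits, d in zip(variants, last_bits, decoded):
        # The first 4 bits are shared; only the final 2 bits differ
        print(v, bits, list(d))
```

The last characters o, p, q and r map to the 6-bit values 40 to 43, which share their top four bits and differ only in the two bits the decoder drops.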

PS: Without padding, some decoders will try to decode the "last" byte as well, and you'll get a 12-byte result (with a different last byte for each string). This is the reason for my comment asking whether the withoutPadding() option is a good idea. Your decoder seems to handle this, though.

How to convert unsigned byte array to base64 string in ctypes

Try this:

    import ctypes
    import base64

    image = (ctypes.c_ubyte * s.dwDataLen)()
    ctypes.memmove(image, s.pBuffer, s.dwDataLen)
    # Convert the image to an array of bytes
    buffer = bytearray(image)
    encoded = base64.encodebytes(buffer)

If you use base64.b64encode instead, you should be able to pass image to it directly, without the bytearray copy:

    import ctypes
    import base64

    image = (ctypes.c_ubyte * s.dwDataLen)()
    ctypes.memmove(image, s.pBuffer, s.dwDataLen)
    encoded = base64.b64encode(image)
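
The snippets above depend on a structure `s` from the original question, so they cannot be run on their own. The following self-contained sketch simulates that structure with a locally created buffer; `raw`, `src`, `p_buffer` and `data_len` are hypothetical stand-ins for the real data, `s.pBuffer` and `s.dwDataLen`. It also shows `ctypes.string_at` as a possible alternative that copies the memory straight into a `bytes` object.

```python
import base64
import ctypes

# Hypothetical stand-ins for s.pBuffer and s.dwDataLen
raw = bytes(range(10))
src = ctypes.create_string_buffer(raw, len(raw))
p_buffer = ctypes.cast(src, ctypes.POINTER(ctypes.c_ubyte))
data_len = len(raw)

# Same steps as above: copy into a c_ubyte array, then encode it
image = (ctypes.c_ubyte * data_len)()
ctypes.memmove(image, p_buffer, data_len)
encoded = base64.b64encode(image)

# Alternative: read data_len bytes from the pointer directly as bytes
encoded_alt = base64.b64encode(ctypes.string_at(p_buffer, data_len))
```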

