How to Encode a String to Base64 Using Only Boost

How do I encode a string to base64 using only boost?

I improved the example in the link you provided a little:

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/ostream_iterator.hpp>
#include <algorithm> // std::copy
#include <sstream>
#include <string>
#include <iostream>

int main()
{
    using namespace boost::archive::iterators;

    std::string test = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce ornare ullamcorper ipsum ac gravida.";

    std::stringstream os;
    typedef
        insert_linebreaks<         // insert line breaks every 72 characters
            base64_from_binary<    // convert binary values to base64 characters
                transform_width<   // retrieve 6 bit integers from a sequence of 8 bit bytes
                    const char *,
                    6,
                    8
                >
            >
            ,72
        >
        base64_text; // compose all the above operations into a new iterator

    // Encode the whole string and write the result into the stream
    std::copy(
        base64_text(test.c_str()),
        base64_text(test.c_str() + test.size()),
        ostream_iterator<char>(os)
    );

    std::cout << os.str();
}

This prints the base64-encoded string to the console, nicely formatted with a line break every 72 characters, ready to be put into an email. If you don't like the line breaks, just stick with this:

typedef
    base64_from_binary<
        transform_width<
            const char *,
            6,
            8
        >
    >
    base64_text;

Base64 encode using boost throw exception

Unfortunately the combination of the two iterator_adaptors binary_from_base64 and transform_width is not a complete base64 encoder/decoder. Base64 represents groups of 24 bits (3 bytes) as 4 characters, each of which encodes 6 bits. If the input data is not an integer multiple of such 3 byte groups it has to be padded with one or two zero bytes. To indicate how many padding bytes were added, one or two = characters are appended to the encoded string.
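The padding rule described above comes down to a little arithmetic. The following is a minimal sketch using only the standard library; the helper names base64_pad_count and base64_encoded_length are made up for illustration and are not part of Boost.

```cpp
#include <cstddef>

// Hypothetical helper (not a Boost function): number of '=' characters
// that must follow the encoded text for an input of n bytes.
std::size_t base64_pad_count(std::size_t n) {
    return (3 - n % 3) % 3;
}

// Hypothetical helper: length of the padded base64 text for n input bytes,
// not counting any line breaks. Every started 3-byte group yields 4 characters.
std::size_t base64_encoded_length(std::size_t n) {
    return 4 * ((n + 2) / 3);
}
```

For a 13-byte input this gives 2 padding characters and 20 characters in total; for a 14-byte input it gives 1 padding character and again 20 characters.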

transform_width, which is responsible for the 8-bit binary to 6-bit integer conversion, does not apply this padding automatically; it has to be done by the user. A simple example:

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/remove_whitespace.hpp>
#include <algorithm> // count, replace
#include <iostream>
#include <string>

using namespace boost::archive::iterators;
using namespace std;

int main(int argc, char **argv) {
    typedef transform_width< binary_from_base64<remove_whitespace<string::const_iterator> >, 8, 6 > it_binary_t;
    typedef insert_linebreaks<base64_from_binary<transform_width<string::const_iterator,6,8> >, 72 > it_base64_t;
    string s;
    getline(cin, s, '\n');
    cout << "Your string is: '"<<s<<"'"<<endl;

    // Encode
    unsigned int writePaddChars = (3-s.length()%3)%3; // number of '=' characters to append
    string base64(it_base64_t(s.begin()),it_base64_t(s.end()));
    base64.append(writePaddChars,'=');

    cout << "Base64 representation: " << base64 << endl;

    // Decode
    unsigned int paddChars = count(base64.begin(), base64.end(), '=');
    std::replace(base64.begin(),base64.end(),'=','A'); // replace '=' by base64 encoding of '\0'
    string result(it_binary_t(base64.begin()), it_binary_t(base64.end())); // decode
    result.erase(result.end()-paddChars,result.end()); // erase padding '\0' characters
    cout << "Decoded: " << result << endl;
    return 0;
}

Note that I added the insert_linebreaks and remove_whitespace iterators, so that the base64 output is nicely formatted and base64 input with line breaks can be decoded. These are optional though.

Run with different input strings which require different padding:

$ ./base64example
Hello World!
Your string is: 'Hello World!'
Base64 representation: SGVsbG8gV29ybGQh
Decoded: Hello World!
$ ./base64example
Hello World!!
Your string is: 'Hello World!!'
Base64 representation: SGVsbG8gV29ybGQhIQ==
Decoded: Hello World!!
$ ./base64example
Hello World!!!
Your string is: 'Hello World!!!'
Base64 representation: SGVsbG8gV29ybGQhISE=
Decoded: Hello World!!!

You can check the base64 strings with any online base64 encoder/decoder.
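The output can also be checked locally, without an online tool. The following is a sketch of a small hand-written encoder that uses only the standard library; it is not the Boost approach, and the name base64_encode_reference is made up for illustration. It emits the standard alphabet with = padding and no line breaks.

```cpp
#include <cstddef>
#include <string>

// Illustrative reference encoder (no Boost). Produces standard base64
// with '=' padding and without line breaks.
std::string base64_encode_reference(const std::string& in) {
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(4 * ((in.size() + 2) / 3));

    std::size_t i = 0;
    // Each full group of 3 bytes (24 bits) becomes 4 characters of 6 bits.
    for (; i + 3 <= in.size(); i += 3) {
        const unsigned b0 = static_cast<unsigned char>(in[i]);
        const unsigned b1 = static_cast<unsigned char>(in[i + 1]);
        const unsigned b2 = static_cast<unsigned char>(in[i + 2]);
        out += alphabet[b0 >> 2];
        out += alphabet[((b0 & 0x03) << 4) | (b1 >> 4)];
        out += alphabet[((b1 & 0x0F) << 2) | (b2 >> 6)];
        out += alphabet[b2 & 0x3F];
    }

    // One or two leftover bytes: fill the missing bits with zeros
    // and mark the missing bytes with '='.
    const std::size_t rest = in.size() - i;
    if (rest == 1) {
        const unsigned b0 = static_cast<unsigned char>(in[i]);
        out += alphabet[b0 >> 2];
        out += alphabet[(b0 & 0x03) << 4];
        out += "==";
    } else if (rest == 2) {
        const unsigned b0 = static_cast<unsigned char>(in[i]);
        const unsigned b1 = static_cast<unsigned char>(in[i + 1]);
        out += alphabet[b0 >> 2];
        out += alphabet[((b0 & 0x03) << 4) | (b1 >> 4)];
        out += alphabet[(b1 & 0x0F) << 2];
        out += '=';
    }
    return out;
}
```

Calling it with the three test inputs from the session above should reproduce the same base64 strings, which makes it usable as a cross-check for the iterator-based version.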

Decode Base64 String Using Boost

base64 requires both input and output to be padded into multiples of 3 and 4 respectively.

Here's a function for decoding base64 using boost:

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/remove_whitespace.hpp>
#include <algorithm>
#include <exception>
#include <string>

std::string decode(std::string input)
{
    using namespace boost::archive::iterators;
    typedef transform_width<binary_from_base64<remove_whitespace
        <std::string::const_iterator> >, 8, 6> ItBinaryT;

    try
    {
        // If the input isn't a multiple of 4, pad with =
        size_t num_pad_chars((4 - input.size() % 4) % 4);
        input.append(num_pad_chars, '=');

        // Count the padding, then replace '=' by 'A' (the base64 encoding of zero bits)
        size_t pad_chars(std::count(input.begin(), input.end(), '='));
        std::replace(input.begin(), input.end(), '=', 'A');
        std::string output(ItBinaryT(input.begin()), ItBinaryT(input.end()));
        // Drop the '\0' bytes that were decoded from the padding
        output.erase(output.end() - pad_chars, output.end());
        return output;
    }
    catch (std::exception const&)
    {
        // Invalid input: return an empty string
        return std::string("");
    }
}

It was taken from here, where an encoding function with padding using boost can also be found.
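To cross-check what the Boost-based decode function returns, a hand-written decoder can be handy. The following is a sketch that uses only the standard library; it is not the Boost approach, and the name base64_decode_reference is made up for illustration. It assumes the standard alphabet, skips every character outside it (such as line breaks) and stops at the first =.

```cpp
#include <cstddef>
#include <string>

// Illustrative reference decoder (no Boost). Characters outside the
// base64 alphabet are skipped; decoding stops at the first '='.
std::string base64_decode_reference(const std::string& in) {
    static const std::string alphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    unsigned buffer = 0; // collects decoded bits, newest bits at the low end
    int bits = 0;        // number of bits in buffer not yet written out

    for (char c : in) {
        if (c == '=') break;                    // padding marks the end of the data
        const std::size_t pos = alphabet.find(c);
        if (pos == std::string::npos) continue; // skip line breaks and other noise
        buffer = (buffer << 6) | static_cast<unsigned>(pos);
        bits += 6;
        if (bits >= 8) {                        // a full byte is available
            bits -= 8;
            out += static_cast<char>((buffer >> bits) & 0xFF);
        }
    }
    return out;
}
```

Because leftover bits at the end are simply dropped, the sketch accepts input with or without trailing = characters. It does no validation beyond that, so it is meant for comparison, not for handling untrusted data.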

Base64 encode with Stream_StringToBinary inserts a newline, breaking the string?

This is because of how the Base64 encoding deals with long strings.

From RFC 2045, section 6.8 (Base64 Content-Transfer-Encoding):

The encoded output stream must be represented in lines of no more
than 76 characters each. All line breaks or other characters not
found in Table 1 must be ignored by decoding software. In base64
data, characters other than those in Table 1, line breaks, and other
white space probably indicate a transmission error, about which a
warning message or even a message rejection might be appropriate
under some circumstances.

Because the encoder only adds a vbLf (Chr(10)) line feed, and decoders have to ignore line breaks anyway, it should be safe to just remove it after encoding using

strEnc = Replace(strEnc, vbLf, "")

Some languages have a "no wrapping" argument that can be passed to stop the line feed being added after the 76th character, but I don't know of one in the Microsoft XMLDOM implementation. The discussion "Base64 -- do we really want/need line breaks every 76 characters?" suggests that such an option was proposed, but there is no evidence it was ever implemented.
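The same clean-up can be done in C++ when the base64 text arrives with line breaks in it. The following is a sketch using only the standard library; the helper names strip_line_breaks and wrap_lines are made up for illustration, and wrap_lines uses a bare '\n' as the separator, which is an assumption (MIME messages use CRLF).

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Illustrative helper: remove all CR and LF characters from base64 text.
std::string strip_line_breaks(std::string text) {
    text.erase(std::remove_if(text.begin(), text.end(),
                              [](char c) { return c == '\r' || c == '\n'; }),
               text.end());
    return text;
}

// Illustrative helper: split text into lines of at most `width` characters.
// 76 is the maximum line length that RFC 2045 allows for base64.
std::string wrap_lines(const std::string& text, std::size_t width = 76) {
    if (width == 0) return text; // nothing sensible to do, avoid an endless loop
    std::string out;
    for (std::size_t i = 0; i < text.size(); i += width) {
        if (i != 0) out += '\n';
        out.append(text, i, width); // append() clips the count at the end of text
    }
    return out;
}
```

Stripping the breaks from wrapped text gives back the original string, so the two helpers can be used to convert between the wrapped and the single-line form.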

How can you encode a string to Base64 in JavaScript?

You can use btoa() and atob() to convert to and from base64 encoding.

There appears to be some confusion in the comments regarding what these functions accept/return, so…

  • btoa() accepts a “string” where each character represents an 8-bit byte – if you pass a string containing characters that can’t be represented in 8 bits, it will probably break. This isn’t a problem if you’re actually treating the string as a byte array, but if you’re trying to do something else then you’ll have to encode it first.

  • atob() returns a “string” where each character represents an 8-bit byte – that is, its value will be between 0 and 0xff. This does not mean it’s ASCII – presumably if you’re using this function at all, you expect to be working with binary data and not text.

See also:

  • How do I load binary image data using Javascript and XMLHttpRequest?

Most comments here are outdated. You can probably use both btoa() and atob(), unless you support really outdated browsers.

Check here:

  • https://caniuse.com/?search=atob
  • https://caniuse.com/?search=btoa

Why do I need 'b' to encode a string with Base64?

base64 encoding takes 8-bit binary byte data and encodes it using only the characters A-Z, a-z, 0-9, + and /* so that it can be transmitted over channels that do not preserve all 8 bits of data, such as email.

Hence, it wants a string of 8-bit bytes. You create those in Python 3 with the b'' syntax.

If you remove the b, it becomes a string. A string is a sequence of Unicode characters. base64 has no idea what to do with Unicode data, it's not 8-bit. It's not really any bits, in fact. :-)

In your second example:

>>> encoded = base64.b64encode('data to be encoded')

All the characters fit neatly into the ASCII character set, so base64 encoding is actually a bit pointless here. You can convert the string to ASCII bytes instead, with

>>> encoded = 'data to be encoded'.encode('ascii')

Or simpler:

>>> encoded = b'data to be encoded'

Which would be the same thing in this case.


* Most base64 flavours may also include a = at the end as padding. In addition, some base64 variants may use characters other than + and /. See the Variants summary table at Wikipedia for an overview.


