Outputting More Things Than a Polymorphic Text Archive

First Off: Streams Are Not Archives.

My first reaction would be "have you tried?". But I was intrigued and couldn't find anything about this in the documentation, so I ran a few tests myself:

  • the answer seems to be "No", it's not supported
  • it seems to work for binary archives
  • it seems to break down because the XML/text archives leave trailing 0xa (newline) characters in the input buffer. These pose no problem if the next archive to be read is also text, but they break binary archives.

Here's my tester:

#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/xml_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/xml_oarchive.hpp>
#include <boost/serialization/nvp.hpp> // BOOST_SERIALIZATION_NVP
#include <cassert>                     // assert
#include <iostream>                    // std::cout

int data = 42;

template <typename Ar>
void some_output(std::ostream& os)
{
std::cout << "Writing archive at " << os.tellp() << "\n";
Ar ar(os);
ar << BOOST_SERIALIZATION_NVP(data);
}

template <typename Ar>
void roundtrip(std::istream& is)
{
data = -1;
std::cout << "Reading archive at " << is.tellg() << "\n";
Ar ar(is);
ar >> BOOST_SERIALIZATION_NVP(data);
assert(data == 42);
}

#include <sstream>

int main()
{
std::stringstream ss;

//some_output<boost::archive::text_oarchive>(ss); // this derails the binary archive that follows
some_output<boost::archive::binary_oarchive>(ss);
some_output<boost::archive::xml_oarchive>(ss);
some_output<boost::archive::text_oarchive>(ss);

//roundtrip<boost::archive::text_iarchive>(ss);
roundtrip<boost::archive::binary_iarchive>(ss);
roundtrip<boost::archive::xml_iarchive>(ss);
roundtrip<boost::archive::text_iarchive>(ss);

// just to prove that there's remaining whitespace
std::cout << "remaining: ";
char ch;
while (ss>>std::noskipws>>ch)
std::cout << " " << std::showbase << std::hex << ((int)(ch));
std::cout << "\n";

// of course, anything else will fail:
try {
roundtrip<boost::archive::text_iarchive>(ss);
} catch(boost::archive::archive_exception const& e)
{
std::cout << "Can't deserialize from a stream a EOF: " << e.what();
}
}

Prints:

Writing archive at 0
Writing archive at 44
Writing archive at 242
Reading archive at 0
Reading archive at 44
Reading archive at 240
remaining: 0xa
Reading archive at 0xffffffffffffffff
Can't deserialize from a stream a EOF: input stream error
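
The leftover newline is not specific to Boost. As a rough analogy using only standard iostreams (this is not how Boost implements its archives, and both helper names are made up for illustration), formatted extraction stops at the delimiter and leaves it in the buffer, so a raw read that follows sees the leftover byte first:

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one integer with formatted extraction (the way
// a text-based reader would), then returns the next raw byte in the stream.
inline int next_raw_byte_after_int(std::istream& is)
{
    int value = 0;
    is >> value;          // stops *before* the trailing '\n'
    assert(!is.fail());   // the input is expected to start with an integer
    return is.get();      // unformatted read: sees the leftover newline
}

// Same, but discards the leftover whitespace before the raw read.
inline int next_raw_byte_after_int_skipping_ws(std::istream& is)
{
    int value = 0;
    is >> value;
    assert(!is.fail());
    is >> std::ws;        // consume the leftover newline
    return is.get();
}
```

Whether discarding whitespace is safe depends on what follows: a binary payload may legitimately begin with a byte that std::ws treats as whitespace, so this is a diagnostic aid rather than a general fix.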

Boost: Re-using/clearing text_iarchive for de-serializing data from Asio:receive()

No, there is no such way.

The comparison to MemoryStream is broken though, because the archive is a layer above the stream.

You can re-use the stream. So if you do the exact parallel of a MemoryStream, e.g. boost::iostreams::array_sink and/or boost::iostreams::array_source on a fixed buffer, you can easily reuse the buffer in your next (de)serialization.

See this proof of concept:

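#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/serialization.hpp>

#include <boost/array.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/type_traits/is_integral.hpp>
#include <boost/type_traits/is_pod.hpp>
#include <cassert>  // assert
#include <cstddef>  // size_t
#include <iostream> // std::cout
#include <sstream>
#include <vector>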

namespace bar = boost::archive;
namespace bio = boost::iostreams;

struct Packet {
int i;
template <typename Ar> void serialize(Ar& ar, unsigned) { ar & i; }
};

namespace Reader {
template <typename T>
Packet deserialize(T const* data, size_t size) {
static_assert(boost::is_pod<T>::value , "T must be POD");
static_assert(boost::is_integral<T>::value, "T must be integral");
static_assert(sizeof(T) == sizeof(char) , "T must be byte-sized");

bio::stream<bio::array_source> stream(bio::array_source(data, size));
bar::text_iarchive ia(stream);
Packet result;
ia >> result;

return result;
}

template <typename T, size_t N>
Packet deserialize(T (&arr)[N]) {
return deserialize(arr, N);
}

template <typename T>
Packet deserialize(std::vector<T> const& v) {
return deserialize(v.data(), v.size());
}

template <typename T, size_t N>
Packet deserialize(boost::array<T, N> const& a) {
return deserialize(a.data(), a.size());
}
}

template <typename MutableBuffer>
void serialize(Packet const& data, MutableBuffer& buf)
{
bio::stream<bio::array_sink> s(buf.data(), buf.size());
bar::text_oarchive ar(s);

ar << data;
}

int main() {
boost::array<char, 1024> arr;

for (int i = 0; i < 100; ++i) {
serialize(Packet { i }, arr);

Packet roundtrip = Reader::deserialize(arr);
assert(roundtrip.i == i);
}
std::cout << "Done\n";
}

For general optimization of boost serialization see:

  • how to do performance test using the boost library for a custom library
  • Boost C++ Serialization overhead
  • Boost Serialization Binary Archive giving incorrect output
  • Tune things (boost::archive::no_codecvt, boost::archive::no_header, disable tracking etc.)
  • Outputting more things than a Polymorphic Text Archive and Streams Are Not Archives

Serialization of Boost c++

The clean answer for Boost Serialization would be

std::ofstream readFile("Categories.txt");
{
boost::archive::text_oarchive ar(readFile);

ar << vec;
ar << another_object;
ar << yet_another_object;
} // note destructs `ar`
readFile.close();

Note that you should make sure the archive is complete before you close the file. Sadly, that is handled by the archive destructor, so you always get this clumsy lifetime dance.

Of course, the file will close automatically when it leaves scope, so you could probably leave the readFile.close() out.
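
The reason for the extra scope can be illustrated without Boost. The sketch below uses a hypothetical TrailerWriter class (not a Boost type) that, like an output archive, only finishes its output in its destructor; the stream content is incomplete until the writer goes out of scope:

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an output archive: writes a header on
// construction and a trailer on destruction.
class TrailerWriter {
public:
    explicit TrailerWriter(std::ostream& os) : os_(os) { os_ << "[begin]"; }
    ~TrailerWriter() { os_ << "[end]"; }

    TrailerWriter(TrailerWriter const&) = delete;
    TrailerWriter& operator=(TrailerWriter const&) = delete;

    void write(std::string const& item) { os_ << item; }

private:
    std::ostream& os_;
};

// Stream content as seen while the writer is alive and after it is destroyed.
struct Snapshots {
    std::string during;
    std::string after;
};

inline Snapshots write_and_snapshot()
{
    std::ostringstream out;
    Snapshots s;
    {
        TrailerWriter w(out);
        w.write("data");
        s.during = out.str(); // trailer not written yet
    }                         // ~TrailerWriter runs here
    s.after = out.str();
    assert(s.during != s.after);
    return s;
}
```

Anything that inspects or closes the stream while the writer is still alive sees the shorter, incomplete content, which is why the archive gets its own block above.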

On combining multiple archives in a single stream:

  • Outputting more things than a Polymorphic Text Archive

Low bandwidth performance using boost::asio::ip::tcp::iostream

Your time likely is wasted in the serialization to/from text.

Dropping in a binary archive increases the speed from 80 Mbit/s to 872 Mbit/s for me:

Client start to send message
Client send message
Client shutdown
Received: Hello World!
3

The total time in seconds is reduced to 3s, which happens to be the initial sleep :)

Proof of concept:

#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/asio.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/shared_ptr.hpp>
#include <boost/serialization/vector.hpp>
#include <chrono>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

using namespace std;

class Message {
public:
Message() {}

virtual ~Message() {}

string text;
std::vector<int> bigLoad;

private:
friend class boost::serialization::access;

template <class Archive> void serialize(Archive &ar, const unsigned int /*version*/) {
ar & text & bigLoad;
}
};

BOOST_CLASS_EXPORT(Message)

void runClient() {
// Give server time to startup
this_thread::sleep_for(chrono::seconds(1));

boost::asio::ip::tcp::iostream stream("127.0.0.1", "3000");
const boost::asio::ip::tcp::no_delay option(false);
stream.rdbuf()->set_option(option);

Message message;

stringstream ss;
ss << "Hello World!";
message.text = ss.str();

int items = 8 << 20;

for (int i = 0; i < items; i++)
message.bigLoad.push_back(i);

boost::archive::binary_oarchive archive(stream);

cout << "Client start to send message" << endl;
try {
archive << message;
} catch (std::exception &ex) {
cout << ex.what() << endl;
}
cout << "Client send message" << endl;

stream.close();
cout << "Client shutdown" << endl;
}

void handleIncommingClientConnection(boost::asio::ip::tcp::acceptor &acceptor) {
boost::asio::ip::tcp::iostream stream;
// const boost::asio::ip::tcp::no_delay option(false);
// stream.rdbuf()->set_option(option);

acceptor.accept(*stream.rdbuf());

boost::archive::binary_iarchive archive(stream);

{
try {
Message message;
archive >> message;
cout << "Received: " << message.text << endl;
} catch (std::exception &ex) {
cout << ex.what() << endl;

if (stream.eof()) {
cout << "eof" << endl;
stream.close();
cout << "Server: shutdown client handling..." << endl;
return;
} else
throw;
}
}
}

void runServer() {
using namespace boost::asio;
using ip::tcp;

io_service ios;
tcp::endpoint endpoint = tcp::endpoint(tcp::v4(), 3000);
tcp::acceptor acceptor(ios, endpoint);

handleIncommingClientConnection(acceptor);
}

template <typename TimeT = std::chrono::milliseconds> struct measure {
template <typename F, typename... Args> static typename TimeT::rep execution(F &&func, Args &&... args) {
auto start = std::chrono::steady_clock::now();
std::forward<decltype(func)>(func)(std::forward<Args>(args)...);
auto duration = std::chrono::duration_cast<TimeT>(std::chrono::steady_clock::now() - start);
return duration.count();
}
};

void doIt() {
thread clientThread(runClient);
thread serverThread(runServer);

clientThread.join();
serverThread.join();
}

int main() { std::cout << measure<std::chrono::seconds>::execution(doIt) << std::endl; }

Caution:

One thing is "lost" here, that wasn't really supported with the old version of the code, either: receiving multiple archives directly head to head.

You might want to devise some kind of framing protocol. See e.g.

  • Boost Serialization Binary Archive giving incorrect output
  • Outputting more things than a Polymorphic Text Archive
  • Streams Are Not Archives (http://www.boost.org/doc/libs/1_51_0/libs/serialization/doc/)
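
One simple framing scheme is a length prefix. The following is a minimal sketch using only the standard library; write_frame and read_frame are hypothetical helper names, and the 4-byte big-endian header is an assumption made for illustration, not something the posts above prescribe:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes a 4-byte big-endian length followed by the payload bytes.
inline void write_frame(std::ostream& os, std::string const& payload)
{
    assert(payload.size() <= 0xFFFFFFFFu); // must fit in the 4-byte header
    std::uint32_t const n = static_cast<std::uint32_t>(payload.size());
    char const header[4] = {
        static_cast<char>((n >> 24) & 0xFF),
        static_cast<char>((n >> 16) & 0xFF),
        static_cast<char>((n >> 8) & 0xFF),
        static_cast<char>(n & 0xFF),
    };
    os.write(header, 4);
    os.write(payload.data(), static_cast<std::streamsize>(payload.size()));
}

// Reads one frame into payload; returns false on EOF or a truncated frame.
inline bool read_frame(std::istream& is, std::string& payload)
{
    unsigned char header[4];
    if (!is.read(reinterpret_cast<char*>(header), 4))
        return false;
    std::uint32_t const n = (std::uint32_t(header[0]) << 24) |
                            (std::uint32_t(header[1]) << 16) |
                            (std::uint32_t(header[2]) << 8) |
                            std::uint32_t(header[3]);
    payload.resize(n);
    if (n != 0 && !is.read(&payload[0], static_cast<std::streamsize>(n)))
        return false;
    return true;
}
```

With something like this, each archive would be written into its own std::ostringstream and the resulting string sent as one frame; the receiver would wrap each payload in a std::istringstream and construct a fresh input archive over it, so no archive ever sees another archive's leftover bytes.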

I've done a number of "overhead of Boost Serialization" posts on here:

  • how to do performance test using the boost library for a custom library
  • Boost C++ Serialization overhead

Deriving custom archive classes from boost::archive::text_oarchive_impl and boost::archive::text_iarchive_impl

As best I can tell, it's a bug in Boost Serialization. We'll see here.

A)
1. Adding BOOST_SERIALIZATION_REGISTER_ARCHIVE with your new archive does not work because the default text archives have already been registered; only one registration seems to be allowed.

2. Removing them makes it work because only your custom classes are registered.

3. By removing them you've broken the ability to use the default text archive - your classes will be registered.

B)

I'm fairly certain that the "text_?archive.hpp" files should have been split up like the "binary_?archive.hpp" files are. Patch boost anyone?

C)

The best solution is to submit a patch to boost that splits the files up. For a temporary solution probably the best way is to put the patched files locally in your project until the patch makes it into boost.

boost serialization and register_type

I figured out how to avoid calling register_type on the archive. For those who might be interested: you need to explicitly instantiate the serialize() template for each archive type, as well as use the export key and implement macros.

So here is what your .hpp should look like :

  • class declaration (mynamespace::myclass)
  • class export : BOOST_CLASS_EXPORT_KEY(mynamespace::myclass)

And in the cpp:

  • class definition
  • class export : BOOST_CLASS_EXPORT_IMPLEMENT(mynamespace::myclass)
  • AND : an explicit instantiation of the serialize() member template for each archive type you need to use, for each class :

template void mynamespace::mypacket::serialize(boost::archive::text_iarchive& arch, const unsigned int version);

template void mynamespace::mypacket::serialize(boost::archive::text_oarchive& arch, const unsigned int version);

Where boost::archive::text_(i/o)archive should be replaced with whatever kind of boost archive you are using.
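
The header/source split described above can be sketched in a single file without Boost. CountingArchive is a hypothetical stand-in for a real archive type, and the two members of mypacket are invented for the example; the point is only the declaration in the "header" part and the definition plus explicit instantiation in the "source" part:

```cpp
#include <cassert>
#include <string>

// Hypothetical archive type, standing in for e.g. boost::archive::text_oarchive.
struct CountingArchive {
    int fields = 0;
    template <typename T> CountingArchive& operator&(T&) { ++fields; return *this; }
};

// --- what would live in the .hpp: declaration only ---
namespace mynamespace {
class mypacket {
public:
    int id = 0;
    std::string name;
    template <class Archive> void serialize(Archive& ar, const unsigned int version);
};
} // namespace mynamespace

// --- what would live in the .cpp: definition plus explicit instantiation ---
template <class Archive>
void mynamespace::mypacket::serialize(Archive& ar, const unsigned int /*version*/)
{
    ar & id & name;
}

// Explicit instantiation: the code for this archive type is emitted in this
// translation unit, so other files need only the declaration.
template void mynamespace::mypacket::serialize(CountingArchive&, const unsigned int);

// Small helper to exercise the instantiated member.
inline int count_serialized_fields(mynamespace::mypacket& p)
{
    CountingArchive ar;
    p.serialize(ar, 0);
    assert(ar.fields > 0);
    return ar.fields;
}
```

Strictly speaking, the two `template void ...` lines above are explicit instantiations rather than specializations: they do not provide a different body, they only force the generic one to be compiled for the named archive type.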

I hope this helps someone someday (it is clearly written in the Boost documentation, but I must have missed it until today...)

How essential is polymorphism for writing a text editor?

The other points about polymorphism as being just a tool are spot on.

However if "the guy" did have some experience with writing text editors he may well have been talking about using polymorphism in the implementation of a document composition hierarchy.

Basically this is just a tree of objects that represent the structure of your document, including details such as formatting (bold, italic, etc.), coloring and so on.

(Most web browsers implement something similar in the form of the browser Document Object Model (DOM), although there is certainly no requirement that they use polymorphism.)

Each of these objects inherits from a common base class (often abstract) that defines a method such as Compose().

Then, when it is time to display or update the structure of the document, the code simply traverses the tree calling the concrete Compose() on each object. Each object is then responsible for composing and rendering itself at the appropriate location in the document.

This is a classic use of polymorphism because it allows new document "components" to be added (or changed) without any (or minimal) change to the main application code.
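
A minimal C++ sketch of such a hierarchy might look like this. Only the Compose() name comes from the description above; the Component, Text and Container classes and the markup strings are assumptions made for illustration:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common (abstract) base class for every node in the document tree.
struct Component {
    virtual ~Component() = default;
    virtual std::string Compose() const = 0;
};

// Leaf node: plain text.
class Text : public Component {
public:
    explicit Text(std::string value) : value_(std::move(value)) {}
    std::string Compose() const override { return value_; }

private:
    std::string value_;
};

// Inner node: composes its children in order, wrapped in a pair of markers.
class Container : public Component {
public:
    Container(std::string open, std::string close)
        : open_(std::move(open)), close_(std::move(close)) {}

    Container& Add(std::unique_ptr<Component> child)
    {
        assert(child != nullptr);
        children_.push_back(std::move(child));
        return *this;
    }

    std::string Compose() const override
    {
        std::string out = open_;
        for (auto const& child : children_)
            out += child->Compose(); // virtual dispatch picks the concrete type
        return out + close_;
    }

private:
    std::string open_, close_;
    std::vector<std::unique_ptr<Component>> children_;
};

// Builds a paragraph containing plain text and a bold span, then composes it.
inline std::string compose_sample()
{
    std::unique_ptr<Container> bold(new Container("<b>", "</b>"));
    bold->Add(std::unique_ptr<Component>(new Text("world")));

    Container paragraph("<p>", "</p>");
    paragraph.Add(std::unique_ptr<Component>(new Text("Hello, ")));
    paragraph.Add(std::move(bold));
    return paragraph.Compose();
}
```

Adding a new kind of component (say, an image or a table) means adding one more class that overrides Compose(); the traversal code in Container does not change.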

Once again though, there are many ways to build a text manipulation program; polymorphism is definitely not required to build one.


