How to Do Performance Test Using the Boost Library for a Custom Library

Boost: Re-using/clearing text_iarchive for de-serializing data from Asio:receive()

No, there is no such way.

The comparison to MemoryStream is broken though, because the archive is a layer above the stream.

You can re-use the stream, though. So if you do the exact parallel of a MemoryStream, e.g. boost::iostreams::array_sink and/or boost::iostreams::array_source on a fixed buffer, you can easily reuse the buffer in your next (de)serialization.

See this proof of concept:

Live On Coliru

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/serialization.hpp>

#include <boost/array.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/type_traits.hpp>

#include <cassert>
#include <iostream>
#include <vector>

namespace bar = boost::archive;
namespace bio = boost::iostreams;

struct Packet {
    int i;
    template <typename Ar> void serialize(Ar& ar, unsigned) { ar & i; }
};

namespace Reader {
    template <typename T>
    Packet deserialize(T const* data, size_t size) {
        static_assert(boost::is_pod<T>::value     , "T must be POD");
        static_assert(boost::is_integral<T>::value, "T must be integral");
        static_assert(sizeof(T) == sizeof(char)   , "T must be byte-sized");

        bio::stream<bio::array_source> stream(bio::array_source(data, size));
        bar::text_iarchive ia(stream);
        Packet result;
        ia >> result;

        return result;
    }

    template <typename T, size_t N>
    Packet deserialize(T (&arr)[N]) {
        return deserialize(arr, N);
    }

    template <typename T>
    Packet deserialize(std::vector<T> const& v) {
        return deserialize(v.data(), v.size());
    }

    template <typename T, size_t N>
    Packet deserialize(boost::array<T, N> const& a) {
        return deserialize(a.data(), a.size());
    }
}

template <typename MutableBuffer>
void serialize(Packet const& data, MutableBuffer& buf)
{
    bio::stream<bio::array_sink> s(buf.data(), buf.size());
    bar::text_oarchive ar(s);

    ar << data;
}

int main() {
    boost::array<char, 1024> arr;

    for (int i = 0; i < 100; ++i) {
        serialize(Packet { i }, arr);

        Packet roundtrip = Reader::deserialize(arr);
        assert(roundtrip.i == i);
    }
    std::cout << "Done\n";
}

For general optimization of boost serialization see:

  • how to do performance test using the boost library for a custom library
  • Boost C++ Serialization overhead
  • Boost Serialization Binary Archive giving incorrect output
  • Tune things (boost::archive::no_codecvt, boost::archive::no_header, disable tracking etc.)
  • Outputting more things than a Polymorphic Text Archive
  • Streams Are Not Archives

Low bandwidth performance using boost::asio::ip::tcp::iostream

Your time is most likely spent in the serialization to/from text.

Dropping in a binary archive increases the throughput from 80 Mbit/s to 872 Mbit/s for me:

Client start to send message
Client send message
Client shutdown
Received: Hello World!
3

The total time in seconds is reduced to 3s, which happens to be the initial sleep :)

Proof Of Concept Live On Coliru

#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/asio.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/shared_ptr.hpp>
#include <boost/serialization/vector.hpp>
#include <chrono>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>

using namespace std;

class Message {
  public:
    Message() {}

    virtual ~Message() {}

    string text;
    std::vector<int> bigLoad;

  private:
    friend class boost::serialization::access;

    template <class Archive> void serialize(Archive &ar, const unsigned int /*version*/) {
        ar & text & bigLoad;
    }
};

BOOST_CLASS_EXPORT(Message)

void runClient() {
    // Give the server time to start up
    this_thread::sleep_for(chrono::seconds(1));

    boost::asio::ip::tcp::iostream stream("127.0.0.1", "3000");
    const boost::asio::ip::tcp::no_delay option(false);
    stream.rdbuf()->set_option(option);

    Message message;

    stringstream ss;
    ss << "Hello World!";
    message.text = ss.str();

    int items = 8 << 20;

    for (int i = 0; i < items; i++)
        message.bigLoad.push_back(i);

    boost::archive::binary_oarchive archive(stream);

    cout << "Client start to send message" << endl;
    try {
        archive << message;
    } catch (std::exception &ex) {
        cout << ex.what() << endl;
    }
    cout << "Client send message" << endl;

    stream.close();
    cout << "Client shutdown" << endl;
}

void handleIncomingClientConnection(boost::asio::ip::tcp::acceptor &acceptor) {
    boost::asio::ip::tcp::iostream stream;
    // const boost::asio::ip::tcp::no_delay option(false);
    // stream.rdbuf()->set_option(option);

    acceptor.accept(*stream.rdbuf());

    boost::archive::binary_iarchive archive(stream);

    {
        try {
            Message message;
            archive >> message;
            cout << "Received: " << message.text << endl;
        } catch (std::exception &ex) {
            cout << ex.what() << endl;

            if (stream.eof()) {
                cout << "eof" << endl;
                stream.close();
                cout << "Server: shutdown client handling..." << endl;
                return;
            } else
                throw;
        }
    }
}

void runServer() {
    using namespace boost::asio;
    using ip::tcp;

    io_service ios;
    tcp::endpoint endpoint = tcp::endpoint(tcp::v4(), 3000);
    tcp::acceptor acceptor(ios, endpoint);

    handleIncomingClientConnection(acceptor);
}

template <typename TimeT = std::chrono::milliseconds> struct measure {
    template <typename F, typename... Args> static typename TimeT::rep execution(F &&func, Args &&... args) {
        auto start = std::chrono::steady_clock::now();
        std::forward<F>(func)(std::forward<Args>(args)...);
        auto duration = std::chrono::duration_cast<TimeT>(std::chrono::steady_clock::now() - start);
        return duration.count();
    }
};

void doIt() {
    thread clientThread(runClient);
    thread serverThread(runServer);

    clientThread.join();
    serverThread.join();
}

int main() { std::cout << measure<std::chrono::seconds>::execution(doIt) << std::endl; }

Caution:

One thing is "lost" here that wasn't really supported by the old version of the code either: receiving multiple archives back to back.

You might want to devise some kind of framing protocol. See e.g.

  • Boost Serialization Binary Archive giving incorrect output
  • Outputting more things than a Polymorphic Text Archive
  • Streams Are Not Archives (http://www.boost.org/doc/libs/1_51_0/libs/serialization/doc/)

I've done a number of "overhead of Boost Serialization" posts on here:

  • how to do performance test using the boost library for a custom library
  • Boost C++ Serialization overhead

the fastest way to load data in C++

  1. speed comparisons here (how to do performance test using the boost library for a custom library)
  2. size trade-offs Boost C++ Serialization overhead (also with compression)
  3. EOS Portable Archive (EPA) for portable binary archives

That said, deserialization can be slow, depending on the types deserialized.
Speed depends on a lot of factors, quite possibly unrelated to the serialization library used.

  • Some data structures have costly insertion performance characteristics (see if you can reserve capacity/load with hints etc.)
  • You might have a lot of dynamic allocation (consider trying e.g. Boost's flat_map for contiguous storage, or load unsorted and sort the data once loading completes)
  • You might have non-inlined (virtual) dispatching - prefer loading/storing POD types in simple containers

You will have to profile your code to find out where the performance bottleneck is.

Using C++ Boost's Graph Library

Here's a simple example, using an adjacency list and executing a topological sort:

#include <iostream>
#include <deque>
#include <iterator>

#include "boost/graph/adjacency_list.hpp"
#include "boost/graph/topological_sort.hpp"

int main()
{
    // Create an adjacency list with seven vertices (0 through 6).
    const int num_tasks = 7;
    boost::adjacency_list<> g(num_tasks);

    // Add edges between vertices.
    boost::add_edge(0, 3, g);
    boost::add_edge(1, 3, g);
    boost::add_edge(1, 4, g);
    boost::add_edge(2, 1, g);
    boost::add_edge(3, 5, g);
    boost::add_edge(4, 6, g);
    boost::add_edge(5, 6, g);

    // Perform a topological sort.
    std::deque<int> topo_order;
    boost::topological_sort(g, std::front_inserter(topo_order));

    // Print the results.
    for (std::deque<int>::const_iterator i = topo_order.begin();
         i != topo_order.end(); ++i)
    {
        std::cout << "task " << *i << std::endl;
    }

    return 0;
}

I agree that the boost::graph documentation can be intimidating, but it's worth having a look.

I can't recall if the contents of the printed book are the same; I suspect it's a bit easier on the eyes. I actually learnt to use boost::graph from the book. The learning curve can feel pretty steep, though. The book I refer to, and reviews of it, can be found here.

Boost Graph Library: Adding vertices with same identification

Sure. As long as the name is not the identifier (an identity implies uniqueness).

The whole idea of filesystem paths is that paths are unique. So what you probably want is to make the unique name the full path to the node, and, when displaying, choose which part of the path to show.

For an elegant demonstration using internal vertex names¹:

namespace fs = std::filesystem;

using G = boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, fs::path>;
using V = G::vertex_descriptor;

Now you can add any path to the graph:

void add_path(fs::path p, G& g) {
    if (p.empty()) return;
    if (!p.has_root_path()) p = fs::absolute(p);

    std::optional<V> prev;
    fs::path curr;
    for (auto const& el : p) {
        curr /= el;
        auto v = add_vertex(curr, g);
        if (prev)
            add_edge(*prev, v, g);
        prev = v;
    }
}

We'll have to tell BGL to use std::identity to get the internal name from fs::path:

template <> struct boost::graph::internal_vertex_name<fs::path> {
    struct type : std::identity {
        using result_type = fs::path;
    };
};

Now, demonstrating:

G g;

add_path("/root/a/a/a/a/a", g);
add_path("test.cpp", g);

To print using the vertex ids:

print_graph(g);

To print using the unique node paths:

auto paths = get(boost::vertex_bundle, g);
print_graph(g, paths);

To print using only the local names:

auto filename = std::mem_fn(&fs::path::filename);
auto local = make_transform_value_property_map(filename, paths);
print_graph(g, local);

Live Demo

Live On Compiler Explorer

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <boost/property_map/transform_value_property_map.hpp>
#include <filesystem>
#include <functional>
#include <iostream>
#include <optional>
using std::filesystem::path;

template <>
struct boost::graph::internal_vertex_name<path> {
    struct type : std::identity {
        using result_type = path;
    };
};

using G =
    boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, path>;
using V = G::vertex_descriptor;

void add_path(path p, G& g) {
    if (p.empty()) return;
    if (!p.has_root_path()) p = absolute(p);

    std::optional<V> prev;
    path curr;
    for (auto const& el : p) {
        curr /= el;
        auto v = add_vertex(curr, g);
        if (prev) add_edge(*prev, v, g);
        prev = v;
    }
}

int main() {
    G g;

    add_path("/root/a/a/a/a/a", g);
    add_path("test.cpp", g);

    // To print using the vertex ids:
    print_graph(g, std::cout << " ---- vertex index\n");

    // To print using the unique node paths:
    auto paths = get(boost::vertex_bundle, g);
    print_graph(g, paths, std::cout << " --- node path\n");

    // To print using only the local names:
    auto filename = std::mem_fn(&path::filename);
    auto local = make_transform_value_property_map(filename, paths);
    print_graph(g, local, std::cout << " --- local name\n");
}

Prints (on my machine, where test.cpp exists in /home/sehe/Projects/stackoverflow):

 ---- vertex index
0 --> 1 7
1 --> 2
2 --> 3
3 --> 4
4 --> 5
5 --> 6
6 -->
7 --> 8
8 --> 9
9 --> 10
10 --> 11
11 -->
 --- node path
"/" --> "/root" "/home"
"/root" --> "/root/a"
"/root/a" --> "/root/a/a"
"/root/a/a" --> "/root/a/a/a"
"/root/a/a/a" --> "/root/a/a/a/a"
"/root/a/a/a/a" --> "/root/a/a/a/a/a"
"/root/a/a/a/a/a" -->
"/home" --> "/home/sehe"
"/home/sehe" --> "/home/sehe/Projects"
"/home/sehe/Projects" --> "/home/sehe/Projects/stackoverflow"
"/home/sehe/Projects/stackoverflow" --> "/home/sehe/Projects/stackoverflow/test.cpp"
"/home/sehe/Projects/stackoverflow/test.cpp" -->
--- local name
"" --> "root" "home"
"root" --> "a"
"a" --> "a"
"a" --> "a"
"a" --> "a"
"a" --> "a"
"a" -->
"home" --> "sehe"
"sehe" --> "Projects"
"Projects" --> "stackoverflow"
"stackoverflow" --> "test.cpp"
"test.cpp" -->

BONUS

Graphviz output:

write_graphviz(std::cout, g, boost::label_writer{local});

This gives a Graphviz rendering of the two path trees (image omitted).

¹ see e.g. How to configure boost::graph to use my own (stable) index for vertices?

Boost Graph Library: edge insertion slow for large graph

adjacency_list is very general purpose; unfortunately it's never going to be as efficient as a solution exploiting the regularity of your particular use-case could be. BGL isn't magic.

Your best bet is probably to come up with the efficient graph representation you'd use in the absence of BGL (hint: for a graph of an image's neighbouring pixels, this will not explicitly allocate all those node and edge objects) and then fit BGL to it (example), or, equivalently, to directly implement a counterpart to the existing adjacency_list/adjacency_matrix templates (concept guidelines) tuned to the regularities of your system.

By an optimised representation I of course mean one in which you don't actually store all the nodes and edges explicitly, but just have some way of iterating over the implicit nodes and edges that arise from the image having a particular size. The only thing you should really need to store is an array of edge weights.


