Boost: Re-using/clearing text_iarchive for de-serializing data from Asio::receive()
No, there is no such way.
The comparison to a MemoryStream doesn't hold, though, because the archive is a layer above the stream.
You can re-use the stream. So if you build the exact parallel of a MemoryStream, e.g. boost::iostreams::array_sink and/or boost::iostreams::array_source on a fixed buffer, you can easily reuse the buffer in your next (de)serialization.
See this proof of concept:
Live On Coliru
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/stream.hpp>
#include <boost/array.hpp>
#include <boost/type_traits.hpp>
#include <cassert>
#include <iostream>
#include <vector>

namespace bar = boost::archive;
namespace bio = boost::iostreams;

struct Packet {
    int i;
    template <typename Ar> void serialize(Ar& ar, unsigned) { ar & i; }
};

namespace Reader {
    template <typename T>
    Packet deserialize(T const* data, size_t size) {
        static_assert(boost::is_pod<T>::value     , "T must be POD");
        static_assert(boost::is_integral<T>::value, "T must be integral");
        static_assert(sizeof(T) == sizeof(char)   , "T must be byte-sized");

        bio::stream<bio::array_source> stream(bio::array_source(data, size));
        bar::text_iarchive ia(stream);

        Packet result;
        ia >> result;
        return result;
    }

    template <typename T, size_t N>
    Packet deserialize(T (&arr)[N]) {
        return deserialize(arr, N);
    }

    template <typename T>
    Packet deserialize(std::vector<T> const& v) {
        return deserialize(v.data(), v.size());
    }

    template <typename T, size_t N>
    Packet deserialize(boost::array<T, N> const& a) {
        return deserialize(a.data(), a.size());
    }
}

template <typename MutableBuffer>
void serialize(Packet const& data, MutableBuffer& buf)
{
    bio::stream<bio::array_sink> s(buf.data(), buf.size());
    bar::text_oarchive ar(s);
    ar << data;
}

int main() {
    boost::array<char, 1024> arr;

    for (int i = 0; i < 100; ++i) {
        serialize(Packet { i }, arr);
        Packet roundtrip = Reader::deserialize(arr);
        assert(roundtrip.i == i);
    }
    std::cout << "Done\n";
}
For general optimization of boost serialization see:
- how to do performance test using the boost library for a custom library
- Boost C++ Serialization overhead
- Boost Serialization Binary Archive giving incorrect output
- Tune things (boost::archive::no_codecvt, boost::archive::no_header, disable tracking etc.)
- Outputting more things than a Polymorphic Text Archive and Streams Are Not Archives
Low bandwidth performance using boost::asio::ip::tcp::iostream
Your time is likely wasted in the serialization to/from text.
Dropping in a binary archive increases the speed from 80 Mbit/s to 872 Mbit/s for me:
Client start to send message
Client send message
Client shutdown
Received: Hello World!
3
The total time in seconds is reduced to 3s, which happens to be the initial sleep :)
Proof Of Concept Live On Coliru
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/asio.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/shared_ptr.hpp>
#include <boost/serialization/vector.hpp>
#include <chrono>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
using namespace std;

class Message {
  public:
    Message() {}
    virtual ~Message() {}
    string text;
    std::vector<int> bigLoad;

  private:
    friend class boost::serialization::access;
    template <class Archive> void serialize(Archive &ar, const unsigned int /*version*/) {
        ar & text & bigLoad;
    }
};

BOOST_CLASS_EXPORT(Message)

void runClient() {
    // Give the server time to start up
    this_thread::sleep_for(chrono::seconds(1));

    boost::asio::ip::tcp::iostream stream("127.0.0.1", "3000");
    const boost::asio::ip::tcp::no_delay option(false);
    stream.rdbuf()->set_option(option);

    Message message;
    stringstream ss;
    ss << "Hello World!";
    message.text = ss.str();

    int items = 8 << 20;
    for (int i = 0; i < items; i++)
        message.bigLoad.push_back(i);

    boost::archive::binary_oarchive archive(stream);
    cout << "Client start to send message" << endl;
    try {
        archive << message;
    } catch (std::exception &ex) {
        cout << ex.what() << endl;
    }
    cout << "Client send message" << endl;

    stream.close();
    cout << "Client shutdown" << endl;
}

void handleIncommingClientConnection(boost::asio::ip::tcp::acceptor &acceptor) {
    boost::asio::ip::tcp::iostream stream;
    // const boost::asio::ip::tcp::no_delay option(false);
    // stream.rdbuf()->set_option(option);
    acceptor.accept(*stream.rdbuf());

    boost::archive::binary_iarchive archive(stream);
    {
        try {
            Message message;
            archive >> message;
            cout << "Received: " << message.text << endl;
        } catch (std::exception &ex) {
            cout << ex.what() << endl;
            if (stream.eof()) {
                cout << "eof" << endl;
                stream.close();
                cout << "Server: shutdown client handling..." << endl;
                return;
            } else
                throw;
        }
    }
}

void runServer() {
    using namespace boost::asio;
    using ip::tcp;

    io_service ios;
    tcp::endpoint endpoint = tcp::endpoint(tcp::v4(), 3000);
    tcp::acceptor acceptor(ios, endpoint);

    handleIncommingClientConnection(acceptor);
}

template <typename TimeT = std::chrono::milliseconds> struct measure {
    template <typename F, typename... Args> static typename TimeT::rep execution(F &&func, Args &&... args) {
        auto start = std::chrono::steady_clock::now();
        std::forward<decltype(func)>(func)(std::forward<Args>(args)...);
        auto duration = std::chrono::duration_cast<TimeT>(std::chrono::steady_clock::now() - start);
        return duration.count();
    }
};

void doIt() {
    thread clientThread(runClient);
    thread serverThread(runServer);
    clientThread.join();
    serverThread.join();
}

int main() { std::cout << measure<std::chrono::seconds>::execution(doIt) << std::endl; }
Caution:
One thing is "lost" here that wasn't really supported with the old version of the code either: receiving multiple archives directly head to head.
You might want to devise some kind of framing protocol. See e.g.
- Boost Serialization Binary Archive giving incorrect output
- Outputting more things than a Polymorphic Text Archive
- Streams Are Not Archives (http://www.boost.org/doc/libs/1_51_0/libs/serialization/doc/)
I've done a number of "overhead of Boost Serialization" posts on here:
- how to do performance test using the boost library for a custom library
- Boost C++ Serialization overhead
- the fastest way to load data in C++
- speed comparisons (how to do performance test using the boost library for a custom library)
- size trade-offs: Boost C++ Serialization overhead (also with compression)
- EOS Portable Archive (EPA) for portable binary archives
That said, deserialization can be slow, depending on the types deserialized.
Speed depends on a lot of factors, quite possibly unrelated to the serialization library used.
- Some data structures have costly insertion performance characteristics (see if you can reserve capacity, load with hints, etc.)
- You might have a lot of dynamic allocation (consider trying e.g. Boost's flat_map for contiguous storage, or load unsorted and sort the data when the load is completed, etc.)
- You might have non-inlined (virtual) dispatching; prefer loading/storing POD types in simple containers
You will have to profile your code to find out what is the performance bottleneck.
Using C++ Boost's Graph Library
Here's a simple example, using an adjacency list and executing a topological sort:
#include <iostream>
#include <deque>
#include <iterator>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/topological_sort.hpp>

int main()
{
    // Create an adjacency list; the constructor adds the seven vertices.
    const int num_tasks = 7;
    boost::adjacency_list<> g(num_tasks);

    // Add edges between vertices.
    boost::add_edge(0, 3, g);
    boost::add_edge(1, 3, g);
    boost::add_edge(1, 4, g);
    boost::add_edge(2, 1, g);
    boost::add_edge(3, 5, g);
    boost::add_edge(4, 6, g);
    boost::add_edge(5, 6, g);

    // Perform a topological sort.
    std::deque<int> topo_order;
    boost::topological_sort(g, std::front_inserter(topo_order));

    // Print the results.
    for (std::deque<int>::const_iterator i = topo_order.begin();
         i != topo_order.end();
         ++i)
    {
        std::cout << *i << std::endl;
    }
    return 0;
}
I agree that the boost::graph documentation can be intimidating, but it's worth having a look.
I can't recall whether the contents of the printed book are the same; I suspect it's a bit easier on the eyes. I actually learnt to use boost::graph from the book, though the learning curve can feel pretty steep. The book I refer to, and reviews of it, can be found here.
Boost Graph Library: Adding vertices with same identification
Sure, as long as the name is not the identifier (identity implies uniqueness).
The whole idea of filesystem paths is that the paths are unique. So, what you would probably want is to have the unique name be the path to the node, and when displaying, choose what part of the path you want to display.
For an elegant demonstration using internal vertex names¹:
using G = boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, fs::path>;
using V = G::vertex_descriptor;
Now you can add any path to the graph:
void add_path(fs::path p, G& g) {
if (p.empty()) return;
if (!p.has_root_path()) p = fs::absolute(p);
std::optional<V> prev;
fs::path curr;
for (auto const& el : p) {
curr /= el;
auto v = add_vertex(curr, g);
if (prev)
add_edge(*prev, v, g);
prev = v;
}
}
We'll have to tell BGL to use std::identity to get the internal name from fs::path:
template <> struct boost::graph::internal_vertex_name<fs::path> {
struct type : std::identity {
using result_type = fs::path;
};
};
Now, demonstrating:
G g;
add_path("/root/a/a/a/a/a", g);
add_path("test.cpp", g);
To print using the vertex ids:
print_graph(g);
To print using the unique node paths:
auto paths = get(boost::vertex_bundle, g);
print_graph(g, paths);
To print using only the local names:
auto filename = std::mem_fn(&fs::path::filename);
auto local = make_transform_value_property_map(filename, paths);
print_graph(g, local);
Live Demo
Live On Compiler Explorer
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/graph_utility.hpp>
#include <boost/property_map/transform_value_property_map.hpp>
#include <filesystem>
#include <functional>
#include <iostream>
#include <optional>

using std::filesystem::path;

template <> struct boost::graph::internal_vertex_name<path> {
    struct type : std::identity {
        using result_type = path;
    };
};

using G = boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS, path>;
using V = G::vertex_descriptor;

void add_path(path p, G& g) {
    if (p.empty()) return;
    if (!p.has_root_path()) p = absolute(p);

    std::optional<V> prev;
    path curr;
    for (auto const& el : p) {
        curr /= el;
        auto v = add_vertex(curr, g);
        if (prev) add_edge(*prev, v, g);
        prev = v;
    }
}

int main() {
    G g;
    add_path("/root/a/a/a/a/a", g);
    add_path("test.cpp", g);

    // To print using the vertex ids:
    print_graph(g, std::cout << " ---- vertex index\n");

    // To print using the unique node paths:
    auto paths = get(boost::vertex_bundle, g);
    print_graph(g, paths, std::cout << " --- node path\n");

    // To print using only the local names:
    auto filename = std::mem_fn(&path::filename);
    auto local = make_transform_value_property_map(filename, paths);
    print_graph(g, local, std::cout << " --- local name\n");
}
Prints (on my machine, where test.cpp exists in /home/sehe/Projects/stackoverflow):
---- vertex index
0 --> 1 7
1 --> 2
2 --> 3
3 --> 4
4 --> 5
5 --> 6
6 -->
7 --> 8
8 --> 9
9 --> 10
10 --> 11
11 -->
--- node path
"/" --> "/root" "/home"
"/root" --> "/root/a"
"/root/a" --> "/root/a/a"
"/root/a/a" --> "/root/a/a/a"
"/root/a/a/a" --> "/root/a/a/a/a"
"/root/a/a/a/a" --> "/root/a/a/a/a/a"
"/root/a/a/a/a/a" -->
"/home" --> "/home/sehe"
"/home/sehe" --> "/home/sehe/Projects"
"/home/sehe/Projects" --> "/home/sehe/Projects/stackoverflow"
"/home/sehe/Projects/stackoverflow" --> "/home/sehe/Projects/stackoverflow/test.cpp"
"/home/sehe/Projects/stackoverflow/test.cpp" -->
--- local name
"" --> "root" "home"
"root" --> "a"
"a" --> "a"
"a" --> "a"
"a" --> "a"
"a" --> "a"
"a" -->
"home" --> "sehe"
"sehe" --> "Projects"
"Projects" --> "stackoverflow"
"stackoverflow" --> "test.cpp"
"test.cpp" -->
BONUS
Graphviz output:
write_graphviz(std::cout, g, boost::label_writer{local});
This gives the graph in Graphviz format.
¹ see e.g. How to configure boost::graph to use my own (stable) index for vertices?
Boost Graph Library: edge insertion slow for large graph
adjacency_list is very general purpose; unfortunately, it's never going to be as efficient as a solution exploiting the regularity of your particular use-case could be. BGL isn't magic.
Your best bet is probably to come up with the efficient graph representation you'd use in the absence of BGL (hint: for a graph of an image's neighbouring pixels, this is not going to explicitly allocate all those node and edge objects) and then fit BGL to it (example), or equivalently just directly implement a counterpart to the existing adjacency_list / adjacency_matrix templates (concept guidelines) tuned to the regularities of your system.
By an optimised representation, I of course mean one in which you don't actually store all the nodes and edges explicitly but just have some way of iterating over enumerations of the implicit nodes and edges arising from the fact that the image is a particular size. The only thing you should really need to store is an array of edge weights.