Serializing a class which contains a std::string

I'm serializing by casting the class to a char* and writing it to a
file with fstream. Reading of course is just the reverse.

Unfortunately, this only works as long as there are no pointers involved, and a std::string typically holds a pointer to its characters internally. You might want to give your classes std::ostream& MyClass::serialize(std::ostream&) const and std::istream& MyClass::deserialize(std::istream&), and call those. For this case, you'd want

std::ostream& MyClass::serialize(std::ostream &out) const {
    out << height;
    out << ','; //number separator
    out << width;
    out << ','; //number separator
    out << name.size(); //serialize size of string
    out << ','; //number separator
    out << name; //serialize characters of string
    return out;
}
std::istream& MyClass::deserialize(std::istream &in) {
    if (in) {
        std::size_t len=0;
        char comma;
        in >> height;
        in >> comma; //read in the separator
        in >> width;
        in >> comma; //read in the separator
        in >> len; //deserialize size of string
        in >> comma; //read in the separator
        if (in) {
            std::vector<char> tmp(len); //needs #include <vector>
            if (len)
                in.read(tmp.data(), len); //deserialize characters of string
            if (in)
                name.assign(tmp.begin(), tmp.end()); //also clears name when len is 0
        }
    }
    return in;
}

You may also want to overload the stream operators for easier use.

std::ostream &operator<<(std::ostream& out, const MyClass &obj)
{obj.serialize(out); return out;}
std::istream &operator>>(std::istream& in, MyClass &obj)
{obj.deserialize(in); return in;}

Serializing/deserializing generic objects to string

One problem I can think of is that sizeof(testStruct) is roughly the sum of the sizes of the member types it contains (plus any padding), i.e.:

sizeof(int) + sizeof(double) + sizeof(std::string) + sizeof(char)

The resulting size does not include the actual contents of the std::string object (here "this is a test string"). Instead, it is roughly the size of the primitive data types that testStruct contains plus the size of the primitive data types that std::string itself contains (which should be a pointer to the contents and not the contents themselves).

So, because you pass your testStruct by value into the serialize method, what gets copied is effectively the pointer to the "this is a test string" character sequence. That memory is freed when the contained std::string object is destroyed on exit from the serialize method, and the same pointer is freed again when the program finishes (because the sender's std::string object declared in main also holds it).

Wrapping your std::string in the testStruct struct (which has a destructor?) does not change this; the same would probably occur if you tested with just a plain std::string object as your sender value.

It seems that this is what is happening, but I am not entirely sure, so correct me if I'm wrong in any assumption/hypothesis.

Edit 1:

As an answer to your first comment: you cannot use a char* for the text, because again the contents of the text would not be copied. Only the value of the pointer itself would be encoded. You could probably use a char array of fixed size (for example char[100]) and store your text there, but that:

  1. Introduces the problem of having to know the maximum number of characters you are going to store, and you have to allocate all of them even if an instance of your testStruct needs fewer.
  2. Does not change the fact that your binary representation is not going to be portable, because the size of each primitive data type differs across architectures.

I would suggest having a look at the following link:

https://isocpp.org/wiki/faq/serialization

and read it a little. It might help.

If you want a text representation (such as XML, or even plain custom-structured text), remember that it is still not automatically portable: if you print an int with the value 2^30, for example, then on another architecture where int is 2 bytes wide the value would overflow when read back. You may be able to get around this with the cstdint types (such as int_least32_t), but then you have to encode all your data as numbers. That is probably not too hard, but it includes text, which means committing to a specific text encoding, i.e. ASCII, UTF-8, etc.

You can also have a look at another post about serializing an object in C++.

If you want binary representation, portability, and easy implementation, I would suggest simply switching to Java if at all possible, and then having a look at the Serializable interface and its rules. The difference is that Java's primitive data types have a fixed size that does not depend on the architecture, as far as I know.

If you only want to copy complex objects within the same process, use copy constructors.

Can boost::container::strings be serialized using boost serialization?

Yes. Surprisingly, the necessary support is not baked into Boost. Though if you look inside the string serialization header you will find that it has support as "primitive", and it takes just one line to enable it:

BOOST_CLASS_IMPLEMENTATION(boost::container::string, boost::serialization::primitive_type)

Now it works the same as std::string:

#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/container/string.hpp>
#include <iostream>
#include <sstream>

// Treat boost::container::string as a primitive, like std::string
BOOST_CLASS_IMPLEMENTATION(boost::container::string, boost::serialization::primitive_type)

struct car {
    template<class Ar> void serialize(Ar& ar, unsigned) { ar & make; }
    boost::container::string make;
};

int main() {
    std::stringstream ss;
    {
        boost::archive::text_oarchive oa(ss);
        car my_car{"ford"};
        oa << my_car;
    } // close archive

    std::cout << ss.str() << "\n";

    boost::archive::text_iarchive ia(ss);
    car new_car;
    ia >> new_car;
}

Prints

22 serialization::archive 17 0 0 ford

How to serialize a self-defined class containing a STL container of another self-defined class using boost serialization easily?

The answer is yes. Here's a complete working example. I invented some simple types:

struct InstanceIdentity { int id; };
struct ClassLabel { std::string value; };
struct FeatureGroup { size_t count; };

With non-intrusive serialization:

namespace boost { namespace serialization {
template <typename Ar>
void serialize(Ar& ar, InstanceIdentity& ii, unsigned) { ar & ii.id; }
template <typename Ar>
void serialize(Ar& ar, ClassLabel& cl, unsigned) { ar & cl.value; }
template <typename Ar>
void serialize(Ar& ar, FeatureGroup& fg, unsigned) { ar & fg.count; }
} }

The Instance and InstanceManager classes have their serialize methods as members (due to the private members):

template <typename Ar>
void Instance::serialize(Ar& ar, unsigned) { ar & identity_ & class_label_ & features_; }

template <typename Ar>
void InstanceManager::serialize(Ar& ar, unsigned) { ar & instances_; }

#include <boost/archive/text_oarchive.hpp>
#include <boost/archive/text_iarchive.hpp>
#include <boost/serialization/set.hpp>
#include <boost/serialization/serialization.hpp>
#include <boost/serialization/access.hpp>
#include <cstdlib>
#include <iostream>
#include <set>
#include <string>

struct InstanceIdentity { int id; };
struct ClassLabel { std::string value; };
struct FeatureGroup { size_t count; };

class Instance
{
public:
    Instance(){}
    ~Instance() = default;
    //... a lot of functions here.

    // std::set orders the instances by their id
    bool operator<(Instance const& other) const {
        return identity_.id < other.identity_.id;
    }
private:
    InstanceIdentity identity_;
    mutable ClassLabel class_label_;
    mutable FeatureGroup features_;
    friend class InstanceManager;

    friend class boost::serialization::access;
    template <typename Ar>
    void serialize(Ar& ar, unsigned) { ar & identity_ & class_label_ & features_; }
};

class InstanceManager
{
public:
    InstanceManager()
    {
        Instance a, b, c;

        a.class_label_.value = "label a";
        b.class_label_.value = "label b";
        c.class_label_.value = "label c";

        a.features_.count = 42;
        b.features_.count = 24;
        c.features_.count = -9; // wraps around: count is an unsigned size_t

        a.identity_.id = rand();
        b.identity_.id = rand();
        c.identity_.id = rand();

        instances_.insert(instances_.end(), a);
        instances_.insert(instances_.end(), b);
        instances_.insert(instances_.end(), c);
    }

    ~InstanceManager() = default;
    // a lot of functions here.
private:
    std::set<Instance> instances_;

    friend class boost::serialization::access;
    template <typename Ar>
    void serialize(Ar& ar, unsigned) { ar & instances_; }
};

// Non-intrusive serialization for the simple public structs
namespace boost { namespace serialization {
    template <typename Ar>
    void serialize(Ar& ar, InstanceIdentity& ii, unsigned) { ar & ii.id; }
    template <typename Ar>
    void serialize(Ar& ar, ClassLabel& cl, unsigned) { ar & cl.value; }
    template <typename Ar>
    void serialize(Ar& ar, FeatureGroup& fg, unsigned) { ar & fg.count; }
} }

int main() {

    InstanceManager im;

    boost::archive::text_oarchive oa(std::cout);
    oa << im;

}

Prints

22 serialization::archive 12 0 0 0 0 3 0 0 0 0 0 846930886 0 0 7 label b 0 0 24 1681692777 7 label c 18446744073709551607 1804289383 7 label a 42


