C++ Serialization Performance

I would strongly suggest protocol buffers. They're incredibly simple to use, offer great performance, and take care of issues like endianness and backwards compatibility. To make it even more attractive, serialized data is language-independent thanks to numerous language implementations.
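One reason endianness stops being a concern is that the protocol buffers wire format writes integers as base-128 varints, a byte sequence defined independently of the host's byte order. The following is only a minimal sketch of that idea in plain C++; it does not use the protobuf library, and the function names `encode_varint` and `decode_varint` are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: append `value` to `out` as a base-128 varint.
// Each byte carries 7 payload bits, least-significant group first;
// the high bit is set on every byte except the last one.
void encode_varint(std::uint64_t value, std::vector<std::uint8_t>& out) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>(value | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Hypothetical helper: decode one varint starting at `pos`, advancing `pos`.
// Returns false if the input ends early or the varint is longer than 10 bytes.
bool decode_varint(const std::vector<std::uint8_t>& in,
                   std::size_t& pos, std::uint64_t& value) {
    value = 0;
    for (int shift = 0; shift < 64 && pos < in.size(); shift += 7) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            return true;  // last byte of this varint
        }
    }
    return false;
}
```

Because the bytes are produced by shifting and masking rather than by copying memory, the same value yields the same bytes on any machine, and small values take fewer bytes than a fixed-width int. In real code you would let the generated protobuf classes do this for you.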

Fastest way to serialize and deserialize .NET objects

Here's your model (with invented CT and TE) using protobuf-net (yet retaining the ability to use XmlSerializer, which can be useful - in particular for migration); I humbly submit (with lots of evidence if you need it) that this is the fastest (or certainly one of the fastest) general purpose serializer in .NET.

If you need strings, just base-64 encode the binary.

using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlType]
public class CT {
    [XmlElement(Order = 1)]
    public int Foo { get; set; }
}
[XmlType]
public class TE {
    [XmlElement(Order = 1)]
    public int Bar { get; set; }
}
[XmlType]
public class TD {
    [XmlElement(Order = 1)]
    public List<CT> CTs { get; set; }
    [XmlElement(Order = 2)]
    public List<TE> TEs { get; set; }
    [XmlElement(Order = 3)]
    public string Code { get; set; }
    [XmlElement(Order = 4)]
    public string Message { get; set; }
    [XmlElement(Order = 5)]
    public DateTime StartDate { get; set; }
    [XmlElement(Order = 6)]
    public DateTime EndDate { get; set; }

    // serialize the whole list to a protobuf byte array
    public static byte[] Serialize(List<TD> tData) {
        using (var ms = new MemoryStream()) {
            ProtoBuf.Serializer.Serialize(ms, tData);
            return ms.ToArray();
        }
    }

    // rebuild the list from a protobuf byte array
    public static List<TD> Deserialize(byte[] tData) {
        using (var ms = new MemoryStream(tData)) {
            return ProtoBuf.Serializer.Deserialize<List<TD>>(ms);
        }
    }
}

Improve Binary Serialization Performance for large List of structs

Binary serialisation using BinaryFormatter includes type information in the bytes it generates, which takes up additional space. That is useful when, for example, you don't know what structure of data to expect at the other end.

In your case, you know what format the data has at both ends, and it doesn't sound like that will change. So you can write a simple encode and decode method. Your CoOrd class no longer needs to be serializable either.

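I would use System.IO.BinaryReader and System.IO.BinaryWriter, then loop through each of your CoOrd instances and read/write the X, Y, Z property values to the stream. Note that BinaryWriter.Write(int) always writes four bytes per value, so this costs a fixed 12 bytes per point plus the count. Getting below 11MB when many of your numbers are smaller than 0x7F or 0x7FFF would need a variable-length integer encoding on top, which these classes do not apply to plain int writes.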

Something like this:

using (var writer = new BinaryWriter(stream)) {
    // write the number of items so we know how many to read out
    writer.Write(points.Count);
    // write three ints per point
    foreach (var point in points) {
        writer.Write(point.X);
        writer.Write(point.Y);
        writer.Write(point.Z);
    }
}

To read from the stream:

List<CoOrd> points;
using (var reader = new BinaryReader(stream)) {
    var count = reader.ReadInt32();
    points = new List<CoOrd>(count);
    for (int i = 0; i < count; i++) {
        var x = reader.ReadInt32();
        var y = reader.ReadInt32();
        var z = reader.ReadInt32();
        points.Add(new CoOrd(x, y, z));
    }
}

The fastest way to load data in C++

  1. Speed comparisons: "how to do performance test using the boost library for a custom library"
  2. Size trade-offs: "Boost C++ Serialization overhead" (also with compression)
  3. EOS Portable Archive (EPA) for portable binary archives

That said, deserialization can be slow, depending on the types deserialized.
Speed depends on a lot of factors, quite possibly unrelated to the serialization library used.

  • some data structures have costly insertion performance characteristics (see if you can reserve capacity, load with hints, etc.)
  • you might have a lot of dynamic allocation (consider e.g. Boost's flat_map for contiguous storage, or load unsorted and sort the data once the load has completed)
  • you might have non-inlined (virtual) dispatch; prefer loading/storing POD types in simple containers

You will have to profile your code to find out where the performance bottleneck is.

Performance Tests of Serializations used by WCF Bindings

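OK; I'll bite... here are some raw serializer metrics (emphasis: you may need to consider base-64/MTOM to get overall bandwidth requirements, plus whatever fixed overheads (both space and CPU) WCF adds). Results first; Length is the serialized size (bytes, or characters for JavaScriptSerializer), and the Serialize/Deserialize figures are elapsed milliseconds for 50,000 iterations: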

BinaryFormatter
Length: 1314
Serialize: 6746
Deserialize: 6268

XmlSerializer
Length: 1049
Serialize: 3282
Deserialize: 5132

DataContractSerializer
Length: 911
Serialize: 1411
Deserialize: 4380

NetDataContractSerializer
Length: 1139
Serialize: 2014
Deserialize: 5645

JavaScriptSerializer
Length: 528
Serialize: 12050
Deserialize: 30558

(protobuf-net v2)
Length: 112
Serialize: 217
Deserialize: 250

(so I conclude protobuf-net v2 is the winner...)

Numbers updated with .NET 4.5 and current library builds, on a newer machine:

BinaryFormatter
Length: 1313
Serialize: 2786
Deserialize: 2407

XmlSerializer
Length: 1049
Serialize: 1265
Deserialize: 2165

DataContractSerializer
Length: 911
Serialize: 574
Deserialize: 2011

NetDataContractSerializer
Length: 1138
Serialize: 850
Deserialize: 2535

JavaScriptSerializer
Length: 528
Serialize: 8660
Deserialize: 8468

(protobuf-net v2)
Length: 112
Serialize: 78
Deserialize: 134

with test rig (compiled with optimizations, run at command line):

(and note I had to invent the Player class and some sample data):

using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Web.Script.Serialization;
using System.Xml.Serialization;
using ProtoBuf.Meta;

static class Program
{
    static void Main()
    {
        var orig = new Game {
            Finished = true, GameGUID = Guid.NewGuid(), GameID = 12345, GameSetup = false, MaximumCardsInDeck = 20,
            Player = new Player { Name = "Fred"}, Player1 = new Player { Name = "Barney"}, Player1Connected = true,
            Player1EnvironmentSetup = true, Player1ID = 12345, Player1Won = 3, Player2Connected = true, Player2EnvironmentSetup = true,
            Player2ID = 23456, Player2Won = 0, Round = 4, RoundsToWin = 5, Started = true, StateXML = "not really xml",
            TimeEnded = null, TimeLimitPerTurn = 500, TimeStamp = new byte[] {1,2,3,4,5,6}, TimeStarted = DateTime.Today};
        const int LOOP = 50000;

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();
        using (var ms = new MemoryStream())
        {
            var ser = new BinaryFormatter();
            Console.WriteLine();
            Console.WriteLine(ser.GetType().Name);
            ser.Serialize(ms, orig);
            Console.WriteLine("Length: " + ms.Length);
            ms.Position = 0;
            ser.Deserialize(ms);

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ms.SetLength(0);
                ser.Serialize(ms, orig);
            }
            watch.Stop();
            Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
            watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ser.Deserialize(ms);
            }
            watch.Stop();
            Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
        }

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();
        using (var ms = new MemoryStream())
        {
            var ser = new XmlSerializer(typeof(Game));
            Console.WriteLine();
            Console.WriteLine(ser.GetType().Name);
            ser.Serialize(ms, orig);
            Console.WriteLine("Length: " + ms.Length);
            ms.Position = 0;
            ser.Deserialize(ms);

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ms.SetLength(0);
                ser.Serialize(ms, orig);
            }
            watch.Stop();
            Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
            watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ser.Deserialize(ms);
            }
            watch.Stop();
            Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
        }

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();
        using (var ms = new MemoryStream())
        {
            var ser = new DataContractSerializer(typeof(Game));
            Console.WriteLine();
            Console.WriteLine(ser.GetType().Name);
            ser.WriteObject(ms, orig);
            Console.WriteLine("Length: " + ms.Length);
            ms.Position = 0;
            ser.ReadObject(ms);

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ms.SetLength(0);
                ser.WriteObject(ms, orig);
            }
            watch.Stop();
            Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
            watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ser.ReadObject(ms);
            }
            watch.Stop();
            Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
        }

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();
        using (var ms = new MemoryStream())
        {
            var ser = new NetDataContractSerializer();
            Console.WriteLine();
            Console.WriteLine(ser.GetType().Name);
            ser.Serialize(ms, orig);
            Console.WriteLine("Length: " + ms.Length);
            ms.Position = 0;
            ser.Deserialize(ms);

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ms.SetLength(0);
                ser.Serialize(ms, orig);
            }
            watch.Stop();
            Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
            watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ser.Deserialize(ms);
            }
            watch.Stop();
            Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
        }

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();
        {
            var sb = new StringBuilder();
            var ser = new JavaScriptSerializer();
            Console.WriteLine();
            Console.WriteLine(ser.GetType().Name);
            ser.Serialize(orig, sb);
            Console.WriteLine("Length: " + sb.Length);
            ser.Deserialize(sb.ToString(), typeof(Game));

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                sb.Length = 0;
                ser.Serialize(orig, sb);
            }
            watch.Stop();
            string s = sb.ToString();
            Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
            watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ser.Deserialize(s, typeof(Game));
            }
            watch.Stop();
            Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
        }

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
        GC.WaitForPendingFinalizers();
        using (var ms = new MemoryStream())
        {
            var ser = CreateProto();
            Console.WriteLine();
            Console.WriteLine("(protobuf-net v2)");
            ser.Serialize(ms, orig);
            Console.WriteLine("Length: " + ms.Length);
            ms.Position = 0;
            ser.Deserialize(ms, null, typeof(Game));

            var watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ms.SetLength(0);
                ser.Serialize(ms, orig);
            }
            watch.Stop();
            Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
            watch = Stopwatch.StartNew();
            for (int i = 0; i < LOOP; i++)
            {
                ms.Position = 0;
                ser.Deserialize(ms, null, typeof(Game));
            }
            watch.Stop();
            Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
        }

        Console.WriteLine();
        Console.WriteLine("All done; any key to exit");
        Console.ReadKey();
    }
    static TypeModel CreateProto()
    {
        var meta = TypeModel.Create();
        meta.Add(typeof(Game), false).Add(Array.ConvertAll(typeof(Game).GetProperties(),prop=>prop.Name));
        meta.Add(typeof(Player), false).Add(Array.ConvertAll(typeof(Player).GetProperties(),prop=>prop.Name));
        return meta.Compile();
    }
}

[Serializable, DataContract]
public partial class Game
{
    [DataMember]
    public bool Finished { get; set; }
    [DataMember]
    public Guid GameGUID { get; set; }
    [DataMember]
    public long GameID { get; set; }
    [DataMember]
    public bool GameSetup { get; set; }
    [DataMember]
    public Nullable<int> MaximumCardsInDeck { get; set; }
    [DataMember]
    public Player Player { get; set; }
    [DataMember]
    public Player Player1 { get; set; }
    [DataMember]
    public bool Player1Connected { get; set; }
    [DataMember]
    public bool Player1EnvironmentSetup { get; set; }
    [DataMember]
    public long Player1ID { get; set; }
    [DataMember]
    public int Player1Won { get; set; }
    [DataMember]
    public bool Player2Connected { get; set; }
    [DataMember]
    public bool Player2EnvironmentSetup { get; set; }
    [DataMember]
    public long Player2ID { get; set; }
    [DataMember]
    public int Player2Won { get; set; }
    [DataMember]
    public int Round { get; set; }
    [DataMember]
    public Nullable<int> RoundsToWin { get; set; }
    [DataMember]
    public bool Started { get; set; }
    [DataMember]
    public string StateXML { get; set; }
    [DataMember]
    public Nullable<DateTime> TimeEnded { get; set; }
    [DataMember]
    public Nullable<int> TimeLimitPerTurn { get; set; }
    [DataMember]
    public byte[] TimeStamp { get; set; }
    [DataMember]
    public Nullable<DateTime> TimeStarted { get; set; }
}
[Serializable, DataContract]
public class Player
{
    [DataMember]
    public string Name { get; set; }
}

