C++ Serialization Performance
I would strongly suggest protocol buffers. They're incredibly simple to use, offer great performance, and take care of issues like endianness and backwards compatibility. To make it even more attractive, serialized data is language-independent thanks to numerous language implementations.
Fastest way to serialize and deserialize .NET objects
Here's your model (with invented CT
and TE
) using protobuf-net (yet retaining the ability to use XmlSerializer
, which can be useful - in particular for migration); I humbly submit (with lots of evidence if you need it) that this is the fastest (or certainly one of the fastest) general purpose serializer in .NET.
If you need strings, just base-64 encode the binary.
[XmlType]
public class CT {
[XmlElement(Order = 1)]
public int Foo { get; set; }
}
[XmlType]
public class TE {
[XmlElement(Order = 1)]
public int Bar { get; set; }
}
[XmlType]
public class TD {
[XmlElement(Order=1)]
public List<CT> CTs { get; set; }
[XmlElement(Order=2)]
public List<TE> TEs { get; set; }
[XmlElement(Order = 3)]
public string Code { get; set; }
[XmlElement(Order = 4)]
public string Message { get; set; }
[XmlElement(Order = 5)]
public DateTime StartDate { get; set; }
[XmlElement(Order = 6)]
public DateTime EndDate { get; set; }
public static byte[] Serialize(List<TD> tData) {
using (var ms = new MemoryStream()) {
ProtoBuf.Serializer.Serialize(ms, tData);
return ms.ToArray();
}
}
public static List<TD> Deserialize(byte[] tData) {
using (var ms = new MemoryStream(tData)) {
return ProtoBuf.Serializer.Deserialize<List<TD>>(ms);
}
}
}
Improve Binary Serialization Performance for large List of structs
Binary serialisation using BinaryFormatter
includes type information in the bytes it generates. This takes up additional space. It's useful in cases where you don't know what structure of data to expect at the other end, for example.
In your case, you know what format the data has at both ends, and that doesn't sound like it'd change. So you can write a simple encode and decode method. Your CoOrd class no longer needs to be serializable too.
I would use System.IO.BinaryReader and System.IO.BinaryWriter, then loop through each of your CoOrd instances and read/write the X,Y,Z propery values to the stream. Those classes will even pack your ints into less than 11MB, assuming many of your numbers are smaller than 0x7F and 0x7FFF.
Something like this:
using (var writer = new BinaryWriter(stream)) {
// write the number of items so we know how many to read out
writer.Write(points.Count);
// write three ints per point
foreach (var point in points) {
writer.Write(point.X);
writer.Write(point.Y);
writer.Write(point.Z);
}
}
To read from the stream:
List<CoOrd> points;
using (var reader = new BinaryReader(stream)) {
var count = reader.ReadInt32();
points = new List<CoOrd>(count);
for (int i = 0; i < count; i++) {
var x = reader.ReadInt32();
var y = reader.ReadInt32();
var z = reader.ReadInt32();
points.Add(new CoOrd(x, y, z));
}
}
the fastest way to load data in C++
- speed comparisons here (how to do performance test using the boost library for a custom library)
- size trade-offs Boost C++ Serialization overhead (also with compression)
- EOS Portable Archive (EPA) for portable binary archives
That said, deserialization can be slow, depending on the types deserialized.
Speed depends on a lot of factors, quite possibly unrelated to the serialization library used.
- Some data structures have costly insertion performance characteristics (see if you can reserve capacity/load with hints etc)
- you might have a lot of dynamic allocation (consider trying e.g. Boost's
flat_map
for contiguous storage, or load unsorted and sort data when load is completed etc.) - you might have non-inlined (virtual) dispatching - prefer loading/store POD types in simple containers
You will have to profile your code to find out what is the performance bottleneck.
Performance Tests of Serializations used by WCF Bindings
OK; I'll bite... here's some raw serializer metrics (emph: you may need to consider base-64/MTOM to get overall bandwidth requirements, plus whatever fixed overheads (both space and CPU) that WCF adds), however; results first:
BinaryFormatter
Length: 1314
Serialize: 6746
Deserialize: 6268
XmlSerializer
Length: 1049
Serialize: 3282
Deserialize: 5132
DataContractSerializer
Length: 911
Serialize: 1411
Deserialize: 4380
NetDataContractSerializer
Length: 1139
Serialize: 2014
Deserialize: 5645
JavaScriptSerializer
Length: 528
Serialize: 12050
Deserialize: 30558
(protobuf-net v2)
Length: 112
Serialize: 217
Deserialize: 250
(so I conclude protobuf-net v2 the winner...)
Numbers updated with .NET 4.5 and current library builds, on a newer machine:
BinaryFormatter
Length: 1313
Serialize: 2786
Deserialize: 2407
XmlSerializer
Length: 1049
Serialize: 1265
Deserialize: 2165
DataContractSerializer
Length: 911
Serialize: 574
Deserialize: 2011
NetDataContractSerializer
Length: 1138
Serialize: 850
Deserialize: 2535
JavaScriptSerializer
Length: 528
Serialize: 8660
Deserialize: 8468
(protobuf-net v2)
Length: 112
Serialize: 78
Deserialize: 134
with test rig (compiled with optimizations, run at command line):
(and note I had to invent the Player
class and some sample data):
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;
using System.Text;
using System.Web.Script.Serialization;
using System.Xml.Serialization;
using ProtoBuf.Meta;
static class Program
{
static void Main()
{
var orig = new Game {
Finished = true, GameGUID = Guid.NewGuid(), GameID = 12345, GameSetup = false, MaximumCardsInDeck = 20,
Player = new Player { Name = "Fred"}, Player1 = new Player { Name = "Barney"}, Player1Connected = true,
Player1EnvironmentSetup = true, Player1ID = 12345, Player1Won = 3, Player2Connected = true, Player2EnvironmentSetup = true,
Player2ID = 23456, Player2Won = 0, Round = 4, RoundsToWin = 5, Started = true, StateXML = "not really xml",
TimeEnded = null, TimeLimitPerTurn = 500, TimeStamp = new byte[] {1,2,3,4,5,6}, TimeStarted = DateTime.Today};
const int LOOP = 50000;
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
using (var ms = new MemoryStream())
{
var ser = new BinaryFormatter();
Console.WriteLine();
Console.WriteLine(ser.GetType().Name);
ser.Serialize(ms, orig);
Console.WriteLine("Length: " + ms.Length);
ms.Position = 0;
ser.Deserialize(ms);
var watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ms.SetLength(0);
ser.Serialize(ms, orig);
}
watch.Stop();
Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ser.Deserialize(ms);
}
watch.Stop();
Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
}
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
using (var ms = new MemoryStream())
{
var ser = new XmlSerializer(typeof(Game));
Console.WriteLine();
Console.WriteLine(ser.GetType().Name);
ser.Serialize(ms, orig);
Console.WriteLine("Length: " + ms.Length);
ms.Position = 0;
ser.Deserialize(ms);
var watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ms.SetLength(0);
ser.Serialize(ms, orig);
}
watch.Stop();
Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ser.Deserialize(ms);
}
watch.Stop();
Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
}
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
using (var ms = new MemoryStream())
{
var ser = new DataContractSerializer(typeof(Game));
Console.WriteLine();
Console.WriteLine(ser.GetType().Name);
ser.WriteObject(ms, orig);
Console.WriteLine("Length: " + ms.Length);
ms.Position = 0;
ser.ReadObject(ms);
var watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ms.SetLength(0);
ser.WriteObject(ms, orig);
}
watch.Stop();
Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ser.ReadObject(ms);
}
watch.Stop();
Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
}
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
using (var ms = new MemoryStream())
{
var ser = new NetDataContractSerializer();
Console.WriteLine();
Console.WriteLine(ser.GetType().Name);
ser.Serialize(ms, orig);
Console.WriteLine("Length: " + ms.Length);
ms.Position = 0;
ser.Deserialize(ms);
var watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ms.SetLength(0);
ser.Serialize(ms, orig);
}
watch.Stop();
Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ser.Deserialize(ms);
}
watch.Stop();
Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
}
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
{
var sb = new StringBuilder();
var ser = new JavaScriptSerializer();
Console.WriteLine();
Console.WriteLine(ser.GetType().Name);
ser.Serialize(orig, sb);
Console.WriteLine("Length: " + sb.Length);
ser.Deserialize(sb.ToString(), typeof(Game));
var watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
sb.Length = 0;
ser.Serialize(orig, sb);
}
watch.Stop();
string s = sb.ToString();
Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ser.Deserialize(s, typeof(Game));
}
watch.Stop();
Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
}
GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced);
GC.WaitForPendingFinalizers();
using (var ms = new MemoryStream())
{
var ser = CreateProto();
Console.WriteLine();
Console.WriteLine("(protobuf-net v2)");
ser.Serialize(ms, orig);
Console.WriteLine("Length: " + ms.Length);
ms.Position = 0;
ser.Deserialize(ms, null, typeof(Game));
var watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ms.SetLength(0);
ser.Serialize(ms, orig);
}
watch.Stop();
Console.WriteLine("Serialize: " + watch.ElapsedMilliseconds);
watch = Stopwatch.StartNew();
for (int i = 0; i < LOOP; i++)
{
ms.Position = 0;
ser.Deserialize(ms, null, typeof(Game));
}
watch.Stop();
Console.WriteLine("Deserialize: " + watch.ElapsedMilliseconds);
}
Console.WriteLine();
Console.WriteLine("All done; any key to exit");
Console.ReadKey();
}
static TypeModel CreateProto()
{
var meta = TypeModel.Create();
meta.Add(typeof(Game), false).Add(Array.ConvertAll(typeof(Game).GetProperties(),prop=>prop.Name));
meta.Add(typeof(Player), false).Add(Array.ConvertAll(typeof(Player).GetProperties(),prop=>prop.Name));
return meta.Compile();
}
}
[Serializable, DataContract]
public partial class Game
{
[DataMember]
public bool Finished { get; set; }
[DataMember]
public Guid GameGUID { get; set; }
[DataMember]
public long GameID { get; set; }
[DataMember]
public bool GameSetup { get; set; }
[DataMember]
public Nullable<int> MaximumCardsInDeck { get; set; }
[DataMember]
public Player Player { get; set; }
[DataMember]
public Player Player1 { get; set; }
[DataMember]
public bool Player1Connected { get; set; }
[DataMember]
public bool Player1EnvironmentSetup { get; set; }
[DataMember]
public long Player1ID { get; set; }
[DataMember]
public int Player1Won { get; set; }
[DataMember]
public bool Player2Connected { get; set; }
[DataMember]
public bool Player2EnvironmentSetup { get; set; }
[DataMember]
public long Player2ID { get; set; }
[DataMember]
public int Player2Won { get; set; }
[DataMember]
public int Round { get; set; }
[DataMember]
public Nullable<int> RoundsToWin { get; set; }
[DataMember]
public bool Started { get; set; }
[DataMember]
public string StateXML { get; set; }
[DataMember]
public Nullable<DateTime> TimeEnded { get; set; }
[DataMember]
public Nullable<int> TimeLimitPerTurn { get; set; }
[DataMember]
public byte[] TimeStamp { get; set; }
[DataMember]
public Nullable<DateTime> TimeStarted { get; set; }
}
[Serializable, DataContract]
public class Player
{
[DataMember]
public string Name { get; set; }
}
Related Topics
Differences Between Running an Executable with Visual Studio Debugger VS Without Debugger
How to Get Cmake to Recognize Pthread on Ubuntu
How Do Static Variables in Lambda Function Objects Work
Disabling G++'s Return-Value Optimisation
Passing Reference to Stl Vector Over Dll Boundary
How to Have Polymorphic Containers with Value Semantics in C++
Incompatible with Parameter of Type "Lpcwstr"
In the Standard, What Is "Derived-Declarator-Type"
When Pass-By-Pointer Is Preferred to Pass-By-Reference in C++
C++11 Memory_Order_Acquire and Memory_Order_Release Semantics
What Does the "L" Mean at the End of an Integer Literal
Type Erasure in C++: How Boost::Shared_Ptr and Boost::Function Work