How to make boost::serialization deserialization faster?

I use boost::serialization to save an object that contains this data :

struct Container
{
    struct SmallObject
    {
        struct CustomData
        {
            unsigned first;
            float second;
        };

        std::vector<CustomData> customData; // <- I can have 1 to 4 of these in the std::vector
        float data1[3];
        float data2[3];
        float data3[2];
        float data4[4];
    };

    std::vector<SmallObject> mySmallerObjects;  // <- I can have 8000 to 13000 of these in the std::vector
};

The serialization code looks like this (this is the intrusive version; I omitted the function declarations above for readability):

template<class Archive> void Container::SmallObject::CustomData::serialize(Archive& ar, unsigned /*version*/)
{
    ar & first;
    ar & second;
}

template<class Archive> void Container::SmallObject::serialize(Archive& ar, unsigned /*version*/)
{
    ar & customData;
    ar & data1;
    ar & data2;
    ar & data3;
    ar & data4;
}

template<class Archive> void Container::serialize(Archive& ar, unsigned /*version*/)
{
    ar & mySmallerObjects;
}

I use binary_archives. In release mode, loading my container (with 12000 small objects) takes about 400 milliseconds. I am told this is too long. Are there any settings or different memory layouts that would speed up the loading process? Should I give up on boost::serialization?
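For completeness, saving and loading are done roughly like this (a simplified sketch; the actual file handling may differ slightly, but the archive usage is the same):

    #include <fstream>
    #include <string>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/archive/binary_iarchive.hpp>

    // Save the container to a binary archive file.
    void saveContainer(const std::string& fileName, const Container& data)
    {
        std::ofstream ofs(fileName.c_str(), std::ios_base::binary);
        boost::archive::binary_oarchive oa(ofs);
        oa << data;
    }

    // Load the container back; this is the call that takes ~400 ms.
    void loadContainer(const std::string& fileName, Container& data)
    {
        std::ifstream ifs(fileName.c_str(), std::ios_base::binary);
        boost::archive::binary_iarchive ia(ifs);
        ia >> data;
    }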


If I had to pick the single biggest drawback of Boost.Serialization, it would be poor performance. If 400ms is truly too slow, either get faster hardware or switch to a different serialization library.

That said, just in case you're doing something blatantly "wrong", you should post the serialization code for Container, Container::SmallObject, and Container::SmallObject::CustomData. You should also make sure that it's actually deserialization that's taking 400 ms, and not deserialization plus reading the data from disk; i.e., load the data into a memory stream of some sort and deserialize from that, rather than deserializing from an std::fstream.


EDIT (in response to comments):

This code works for me using VC++ 2010 SP1 and Boost 1.47 beta:

double loadArchive(std::string const& archiveFileName, Container& data)
{
    std::ifstream fileStream(
        archiveFileName.c_str(),
        std::ios_base::binary | std::ios_base::in
    );
    // Copy the whole file into an in-memory stream so that disk I/O is
    // excluded from the timed section below.
    std::stringstream buf(
        std::ios_base::binary | std::ios_base::in | std::ios_base::out
    );
    buf << fileStream.rdbuf();
    fileStream.close();

    StartCounter();                                // start the high-resolution timer
    boost::archive::binary_iarchive(buf) >> data;  // time only the deserialization
    return GetCounter();                           // elapsed time
}

If this doesn't work for you, it must be specific to the compiler and/or version of Boost you're using (which are what?).

On my machine, for an x86 release build (with link-time code generation enabled), loading the data from disk is ~9% of the overall time taken to deserialize a 1.28 MB file (one Container containing 13000 SmallObject instances, each containing 4 CustomData instances). For an x64 release build, loading the data from disk is ~17% of the overall time taken to deserialize a 1.53 MB file (same object counts).


I'd suggest writing the number of items into the serialization stream and then using std::vector::reserve to allocate all the memory you will need. That way, you will be doing the minimum number of allocations.
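A minimal sketch of that idea, assuming you split Container::serialize() into save()/load() member functions (the member names match the struct in the question; BOOST_SERIALIZATION_SPLIT_MEMBER() comes from boost/serialization/split_member.hpp):

    // Intrusive split save/load for Container: write the element count first,
    // then reserve before reading the elements back, so the vector allocates once.
    template<class Archive>
    void Container::save(Archive& ar, unsigned /*version*/) const
    {
        std::size_t count = mySmallerObjects.size();
        ar & count;
        for (std::size_t i = 0; i < count; ++i)
            ar & mySmallerObjects[i];
    }

    template<class Archive>
    void Container::load(Archive& ar, unsigned /*version*/)
    {
        std::size_t count = 0;
        ar & count;
        mySmallerObjects.clear();
        mySmallerObjects.reserve(count);   // single allocation up front
        for (std::size_t i = 0; i < count; ++i)
        {
            SmallObject obj;
            ar & obj;
            mySmallerObjects.push_back(obj);
        }
    }

    // In the class definition, add BOOST_SERIALIZATION_SPLIT_MEMBER() so that
    // serialize() dispatches to save()/load().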
