Why do std::string operations perform poorly?

I made a test to compare string operations in several languages for choosing a language for the server-side application. The results seemed normal until I finally tried C++, which surprised me a lot. So I wonder if I had missed any optimization and come here for help.

The test are mainly intensive string operations, including concatenate and searching. The test is performed on Ubuntu 11.10 amd64, with GCC's version 4.6.1. The machine is Dell Optiplex 960, with 4G RAM, and Quad-core CPU.

in Python (2.7.2):

def test():
    x = ""
    limit = 102 * 1024
    while len(x) < limit:
        x += "X"
        if x.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) > 0:
            print("Oh my god, this is impossible!")
    print("x's length is : %d" % len(x))

test()

which gives result:

x's length is : 104448

real    0m8.799s
user    0m8.769s
sys     0m0.008s

in Java (OpenJDK-7):

public class test {
    public static void main(String[] args) {
        int x = 0;
        int limit = 102 * 1024;
        String s="";
        for (; s.length() < limit;) {
            s += "X";
            if (s.indexOf("ABCDEFGHIJKLMNOPQRSTUVWXYZ") > 0)
            System.out.printf("Find!n");
        }
        System.out.printf("x's length = %dn", s.length());
    }
}

which gives result:

x's length = 104448

real    0m50.436s
user    0m50.431s
sys     0m0.488s

in Javascript (Nodejs 0.6.3)

function test()
{
    var x = "";
    var limit = 102 * 1024;
    while (x.length < limit) {
        x += "X";
        if (x.indexOf("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) > 0)
            console.log("OK");
    }
    console.log("x's length = " + x.length);
}();

which gives result:

x's length = 104448

real    0m3.115s
user    0m3.084s
sys     0m0.048s

in C++ (g++ -Ofast)

It's not surprising that Nodejs performas better than Python or Java. But I expected libstdc++ would give much better performance than Nodejs, whose result really suprised me.

#include <iostream>
#include <string>
using namespace std;
void test()
{
    int x = 0;
    int limit = 102 * 1024;
    string s("");
    for (; s.size() < limit;) {
        s += "X";
        if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != string::npos)
            cout << "Find!" << endl;
    }
    cout << "x's length = " << s.size() << endl;
}

int main()
{
    test();
}

which gives result:

x length = 104448

real    0m5.905s
user    0m5.900s
sys     0m0.000s

Brief Summary

OK, now let's see the summary:

  • javascript on Nodejs(V8): 3.1s
  • Python on CPython 2.7.2 : 8.8s
  • C++ with libstdc++: 5.9s
  • Java on OpenJDK 7: 50.4s
  • Surprisingly! I tried "-O2, -O3" in C++ but noting helped. C++ seems about only 50% performance of javascript in V8, and even poor than CPython. Could anyone explain to me if I had missed some optimization in GCC or is this just the case? Thank you a lot.


    It's not that std::string performs poorly (as much as I dislike C++), it's that string handling is so heavily optimized for those other languages.

    Your comparisons of string performance are misleading, and presumptuous if they are intended to represent more than just that.

    I know for a fact that Python string objects are completely implemented in C, and indeed on Python 2.7, numerous optimizations exist due to the lack of separation between unicode strings and bytes. If you ran this test on Python 3.x you will find it considerably slower.

    Javascript has numerous heavily optimized implementations. It's to be expected that string handling is excellent here.

    Your Java result may be due to improper string handling, or some other poor case. I expect that a Java expert could step in and fix this test with a few changes.

    As for your C++ example, I'd expect performance to slightly exceed the Python version. It does the same operations, with less interpreter overhead. This is reflected in your results. Preceding the test with s.reserve(limit); would remove reallocation overhead.

    I'll repeat that you're only testing a single facet of the languages' implementations. The results for this test do not reflect the overall language speed.

    I've provided a C version to show how silly such pissing contests can be:

    #define _GNU_SOURCE
    #include <string.h>
    #include <stdio.h>
    
    void test()
    {
        int limit = 102 * 1024;
        char s[limit];
        size_t size = 0;
        while (size < limit) {
            s[size++] = 'X';
            if (memmem(s, size, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 26)) {
                fprintf(stderr, "zomgn");
                return;
            }
        }
        printf("x's length = %zun", size);
    }
    
    int main()
    {
        test();
        return 0;
    }
    

    Timing:

    matt@stanley:~/Desktop$ time ./smash 
    x's length = 104448
    
    real    0m0.681s
    user    0m0.680s
    sys     0m0.000s
    

    So I went and played a bit with this on ideone.org.

    Here a slightly modified version of your original C++ program, but with the appending in the loop eliminated, so it only measures the call to std::string::find() . Note that I had to cut the number of iterations to ~40%, otherwise ideone.org would kill the process.

    #include <iostream>
    #include <string>
    
    int main()
    {
        const std::string::size_type limit = 42 * 1024;
        unsigned int found = 0;
    
        //std::string s;
        std::string s(limit, 'X');
        for (std::string::size_type i = 0; i < limit; ++i) {
            //s += 'X';
            if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != std::string::npos)
                ++found;
        }
    
        if(found > 0)
            std::cout << "Found " << found << " times!n";
        std::cout << "x's length = " << s.size() << 'n';
    
        return 0;
    }
    

    My results at ideone.org are time: 3.37s . (Of course, this is highly questionably, but indulge me for a moment and wait for the other result.)

    Now we take this code and swap the commented lines, to test appending, rather than finding. Note that, this time, I had increased the number of iterations tenfold in trying to see any time result at all.

    #include <iostream>
    #include <string>
    
    int main()
    {
        const std::string::size_type limit = 1020 * 1024;
        unsigned int found = 0;
    
        std::string s;
        //std::string s(limit, 'X');
        for (std::string::size_type i = 0; i < limit; ++i) {
            s += 'X';
            //if (s.find("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 0) != std::string::npos)
            //    ++found;
        }
    
        if(found > 0)
            std::cout << "Found " << found << " times!n";
        std::cout << "x's length = " << s.size() << 'n';
    
        return 0;
    }
    

    My results at ideone.org, despite the tenfold increase in iterations, are time: 0s .

    My conclusion: This benchmark is, in C++, highly dominated by the searching operation , the appending of the character in the loop has no influence on the result at all. Was that really your intention?


    The idiomatic C++ solution would be:

    #include <iostream>
    #include <string>
    #include <algorithm>
    
    int main()
    {
        const int limit = 102 * 1024;
        std::string s;
        s.reserve(limit);
    
        const std::string pattern("ABCDEFGHIJKLMNOPQRSTUVWXYZ");
    
        for (int i = 0; i < limit; ++i) {
            s += 'X';
            if (std::search(s.begin(), s.end(), pattern.begin(), pattern.end()) != s.end())
                std::cout << "Omg Wtf found!";
        }
        std::cout << "X's length = " << s.size();
        return 0;
    }
    

    I could speed this up considerably by putting the string on the stack, and using memmem -- but there seems to be no need. Running on my machine, this is over 10x the speed of the python solution already..

    [On my laptop]

    time ./test X's length = 104448 real 0m2.055s user 0m2.049s sys 0m0.001s

    链接地址: http://www.djcxy.com/p/31514.html

    上一篇: 用C ++快速分离花车?

    下一篇: 为什么std :: string操作表现不佳?