Python faster than C++? How does this happen?

This question already has an answer here:

  • Why is reading lines from stdin much slower in C++ than Python? 10 answers

    There isn't anything obvious here. Since Python is written in C, it presumably uses something like printf to implement print . C++ I/O streams, such as cout , are usually implemented in a way that's much slower than printf . If you want to put C++ on a better footing, you can try changing to:

    #include <cstdio>
    int main()
    {
        int x=0;
        while(x!=1000000)
        {
            ++x;
            std::printf("%d\n", x);
        }
        return 0;
    }
    

    I did change to using ++x instead of x++ . Years ago people thought that this was a worthwhile "optimization." I will have a heart attack if that change makes any difference in your program's performance (OTOH, I am positive that using std::printf will make a huge difference in runtime performance). Instead, I made the change simply because you aren't paying attention to what the value of x was before you incremented it, so I think it's useful to say that in code.


    One of my colleagues at work told me that Python code is faster than C++ code and showed this topic as an example to prove his point. It is now obvious from the other answers what is wrong with the C++ code posted in the question. I would still like to summarize the benchmarks I ran to show him how fast good C++ code can be!

    There are two problems with the original C++ code:

  • It uses std::endl to print a newline in each iteration. That is a very bad idea because std::endl does more than print a newline: it also forces the stream to flush the buffer accumulated so far. Flushing is an expensive operation because it has to deal with the output device. So the first fix is this: if you want to print a newline, just use '\n' .

  • The second problem is less obvious because it is not visible in the code. It lies in the design of C++ streams. By default, C++ streams are synchronized with the C streams after each input and output operation so that your application can mix std::cout and std::printf , and std::cin and std::scanf , without any problem. This feature (yes, it is a feature) is not needed in this case, so we can disable it, since it has a small runtime overhead (that is not a problem; it doesn't make C++ bad; it is simply the price of the feature). So the second fix is this: std::ios_base::sync_with_stdio(false);

    And here is the final optimized code:

    #include <iostream>
    
    int main()
    {
        std::ios_base::sync_with_stdio(false); 
    
        int x = 0;
        while ( x != 1000000 )
        {
            ++x;
            std::cout << x << '\n';
        }
    }
    

    Compile this with the -O3 flag, then run and time it:

    $ g++ benchmark.cpp -O3    #compilation
    $ time ./a.out             #run
    
    //..
    
    real   0m32.175s
    user   0m0.088s
    sys    0m0.396s
    

    Then run and time the Python code (posted in the question):

    $ time ./benchmark.py
    
    //...
    
    real  0m35.714s
    user  0m3.048s
    sys   0m4.456s
    

    The user and sys times tell us which one is faster, and by roughly what order of magnitude.

    Hope that helps you to remove your doubts. :-)


    I think we need more information, but I suspect you're building an unoptimized build of the C++ code. Try building it with the -O3 flag (someone who knows GCC better will have more and better recommendations). However, here are some timings from a completely untrustworthy source: http://ideone.com. I ran each 5 times to get some measure of the variance in the timing, but only the original C++ varied, and not by much.

    Python: http://ideone.com/WBWB9 time: 0.07-0.07s
    Your C++: http://ideone.com/tzwQJ time: 0.05-0.06s
    Modified C++: http://ideone.com/pXJo3 time: 0.00-0.00s

    As for why my C++ was faster than yours: std::endl forces C++ to flush the buffer immediately. '\n' writes the newline without the forced buffer flush, which is much, much faster.

    (Note: I only ran to 12773, since ideone.com kills processes after they display a certain amount of output; that was the most the server would give me.)
