Speeding Up Python
This is really two questions, but they are similar enough that I figured I'd roll them together to keep things simple:
Firstly: Given an established python project, what are some decent ways to speed it up beyond just plain in-code optimization?
Secondly: When writing a program from scratch in python, what are some good ways to greatly improve performance?
For the first question, imagine you are handed a decently written project and you need to improve performance, but you can't seem to get much of a gain through refactoring/optimization. What would you do to speed it up in this case short of rewriting it in something like C?
Regarding "Secondly: When writing a program from scratch in python, what are some good ways to greatly improve performance?"
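Remember the Jackson rules of optimization:
Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it yet.
And the Knuth rule:
"Premature optimization is the root of all evil."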
The more useful rules are in the General Rules for Optimization.
1. Don't optimize as you go. First get it right. Then get it fast. Optimizing a wrong program is still wrong.
2. Remember the 80/20 rule.
3. Always run "before" and "after" benchmarks. Otherwise, you won't know if you've found the 80%.
4. Use the right algorithms and data structures. This rule should be first. Nothing matters as much as algorithm and data structure.
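To make the "before" and "after" benchmark rule concrete, here is a minimal sketch using the standard `timeit` module. The function names, the data sizes, and the list-versus-set scenario are made up for illustration; they are not from the project being discussed.

```python
import timeit


def count_hits_list(needles, haystack):
    """'Before': each membership test scans the list linearly."""
    return sum(1 for n in needles if n in haystack)


def count_hits_set(needles, haystack):
    """'After': build a set once, then each lookup is a hash probe."""
    lookup = set(haystack)
    return sum(1 for n in needles if n in lookup)


def benchmark(func, *args, repeat=3, number=3):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Illustrative data only; real benchmarks should use realistic inputs.
haystack = list(range(1000))
needles = list(range(0, 2000, 2))

# Get it right first: both versions must agree before timing them.
assert count_hits_list(needles, haystack) == count_hits_set(needles, haystack)

before = benchmark(count_hits_list, needles, haystack)
after = benchmark(count_hits_set, needles, haystack)
print(f"list lookup: {before:.6f}s per call")
print(f"set lookup:  {after:.6f}s per call")
```

Taking the minimum over several repeats is a common way to reduce noise from other processes, but the numbers still only mean something when the inputs resemble your real workload.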
Bottom Line
You can't prevent or avoid the "optimize this program" effort. It's part of the job. You have to plan for it and do it carefully, just like the design, code and test activities.
Rather than just punting to C, I'd suggest:
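Make your code count. Do more with fewer executions of lines. If you need to crunch numbers, check numpy out; if the program spends most of its time waiting on I/O, look at something like the Twisted framework.
If all of the above fails for profiled and measured code, then begin thinking about the C-rewrite path.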
The usual suspects -- profile it, find the most expensive line, figure out what it's doing, fix it. If you haven't done much profiling before, there could be some big fat quadratic loops or string duplication hiding behind otherwise innocuous-looking expressions.
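If you haven't profiled before, a rough sketch with the standard `cProfile` and `pstats` modules looks something like this. The `slow_unique` and `workload` functions are hypothetical stand-ins for your own code, written to hide a quadratic loop on purpose.

```python
import cProfile
import io
import pstats


def slow_unique(items):
    """Deliberately quadratic: 'not in' rescans the list every time."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def workload():
    """Hypothetical stand-in for the code you actually want to profile."""
    return slow_unique([i % 500 for i in range(5000)])


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(workload)
print(report)
```

The report is sorted by cumulative time, so the functions worth looking at first show up near the top. For a whole script, `python -m cProfile -s cumulative your_script.py` gives a similar listing without touching the code.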
In Python, two of the most common causes I've found for non-obvious slowdown are string concatenation and generators. Since Python's strings are immutable, doing something like this:
    result = ""
    for item in my_list:
        result += str(item)
can copy the entire accumulated string on every iteration, which makes the loop quadratic in the worst case. This has been well-covered, and the solution is to use "".join:
    result = "".join(str(item) for item in my_list)
Generators are another culprit. They're very easy to use and can simplify some tasks enormously, but a poorly-applied generator will be much slower than simply appending items to a list and returning the list.
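Whether a generator or a plain list wins depends on the workload, so the safest thing is to time both. The sketch below is only a measuring harness under made-up assumptions (the function names and the `limit` value are illustrative); it does not claim either version is faster in general.

```python
import timeit


def evens_as_list(limit):
    """Append matching items to a list and return the list."""
    result = []
    for n in range(limit):
        if n % 2 == 0:
            result.append(n)
    return result


def evens_as_generator(limit):
    """Yield matching items one at a time."""
    for n in range(limit):
        if n % 2 == 0:
            yield n


def best_time(func, number=50):
    """Best observed time per call, in seconds, over three repeats."""
    return min(timeit.repeat(func, repeat=3, number=number)) / number


limit = 10_000  # illustrative size only

# Both versions must produce the same items before timing is meaningful.
assert list(evens_as_generator(limit)) == evens_as_list(limit)

list_time = best_time(lambda: sum(evens_as_list(limit)))
gen_time = best_time(lambda: sum(evens_as_generator(limit)))
print(f"list:      {list_time:.6f}s per call")
print(f"generator: {gen_time:.6f}s per call")
```

If the results are consumed more than once, or indexed, the generator has to be materialized anyway, which is one of the cases where the list version tends to come out ahead.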
Finally, don't be afraid to rewrite bits in C! Python, as a dynamic high-level language, is simply not capable of matching C's speed. If there's one function that you can't optimize any more in Python, consider extracting it to an extension module.
My favorite technique for this is to maintain both Python and C versions of a module. The Python version is written to be as clear and obvious as possible -- any bugs should be easy to diagnose and fix. Write your tests against this module. Then write the C version, and test it. Its behavior should in all cases equal that of the Python implementation -- if they differ, it should be very easy to figure out which is wrong and correct the problem.
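Here is a rough sketch of how that two-version setup might be wired together. The extension module name `_checksum_c` and the checksum function itself are hypothetical, invented just for this example; if no compiled module is installed, the code falls back to the Python version.

```python
import random


def checksum_py(data):
    """Reference implementation: written to be clear, not fast."""
    total = 0
    for byte in data:
        total = (total + byte) % 65521
    return total


try:
    # Hypothetical compiled extension exposing the same function.
    from _checksum_c import checksum as checksum_fast
except ImportError:
    checksum_fast = None

# Use the fast version when it is available, else the reference one.
checksum = checksum_fast or checksum_py


def check_equivalence(candidate, reference, trials=200, seed=0):
    """Compare candidate against reference on random byte strings."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randrange(64)
        data = bytes(rng.randrange(256) for _ in range(length))
        expected = reference(data)
        actual = candidate(data)
        if actual != expected:
            raise AssertionError(
                f"mismatch for {data!r}: {actual} != {expected}"
            )
    return trials


checked = check_equivalence(checksum, checksum_py)
print(f"{checked} random inputs agree with the reference")
```

The fixed seed keeps failures reproducible, so a mismatch can be replayed against the Python version to work out which side is wrong.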