Optimization of importing modules in Python

I am reading David Beazley's Python Essential Reference, and he makes this point:

For example, if you were performing a lot of square root operations, it is faster to use 'from math import sqrt' and 'sqrt(x)' rather than typing 'math.sqrt(x)'.

and:

For calculations involving heavy use of methods or module lookups, it is almost always better to eliminate the attribute lookup by putting the operation you want to perform into a local variable first.

I decided to try it out:

first()

def first():
    from collections import defaultdict
    x = defaultdict(list)

second()

def second():
    import collections
    x = collections.defaultdict(list)

The results were:

2.15461492538
1.39850616455

Optimizations such as these probably don't matter to me, but I am curious why the opposite of what Beazley wrote turns out to be true. Note that the difference is about 0.75 seconds, which is significant given that the task is trivial.

Why is this happening?

UPDATE:

I am getting the timings like this:

from timeit import timeit

print(timeit('first()', 'from __main__ import first'))
print(timeit('second()', 'from __main__ import second'))

The from collections import defaultdict and import collections statements should be outside the timed loops, since you would not normally repeat them.

I guess that the from syntax has to do more work than the import syntax.
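One way to check that guess is to look at the bytecode. The following is only a sketch, written for Python 3 (an assumption, since the code in this thread is Python 2); the function names with_from_import, with_plain_import and opnames are made up for illustration. It lists the instruction names each form compiles to, so you can see that the from form performs the module import and then an extra step to fetch the name.

```python
import dis


def with_from_import():
    # Imports the module, then fetches one attribute from it.
    from collections import defaultdict
    return defaultdict


def with_plain_import():
    # Imports the module and binds it to a local name.
    import collections
    return collections


def opnames(func):
    """Return the bytecode instruction names of func, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    print("from import: ", opnames(with_from_import))
    print("plain import:", opnames(with_plain_import))
```

Both listings should contain IMPORT_NAME, and only the first should contain IMPORT_FROM. The exact surrounding instructions vary between interpreter versions.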

Using this test code:

#!/usr/bin/env python

import timeit

from collections import defaultdict
import collections

def first():
    from collections import defaultdict
    x = defaultdict(list)

def firstwithout():
    x = defaultdict(list)

def second():
    import collections
    x = collections.defaultdict(list)

def secondwithout():
    x = collections.defaultdict(list)

print("first with import %s" % timeit.timeit('first()', 'from __main__ import first'))
print("second with import %s" % timeit.timeit('second()', 'from __main__ import second'))

print("first without import %s" % timeit.timeit('firstwithout()', 'from __main__ import firstwithout'))
print("second without import %s" % timeit.timeit('secondwithout()', 'from __main__ import secondwithout'))

I get results:

first with import 1.61359190941
second with import 1.02904295921
first without import 0.344709157944
second without import 0.449721097946

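This shows how much the repeated imports cost: roughly 1.27 s of the 1.61 s for first() and 0.58 s of the 1.03 s for second(). Without them, the from-imported name is the faster of the two, as Beazley says.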


I also get similar ratios between first() and second(); the only difference is that my timings are at the microsecond level.

I don't think that your timings measure anything useful. Try to figure out better test cases!

Update:
FWIW, here are some tests that support David Beazley's point.

import math
from math import sqrt

# Python 2 code; on Python 3, replace xrange with range.
def first(n=1000):
    for k in xrange(n):
        x = math.sqrt(9)

def second(n=1000):
    for k in xrange(n):
        x = sqrt(9)

In []: %timeit first()
1000 loops, best of 3: 266 us per loop
In []: %timeit second()
1000 loops, best of 3: 221 us per loop
In []: 266./ 221
Out[]: 1.2036199095022624

So first() is some 20% slower than second().
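Beazley's second suggestion, putting the operation into a local variable first, can be tried the same way. This is a rough sketch for Python 3 rather than a definitive benchmark; the function names module_lookup, global_name and local_name are hypothetical, and the size of the gap depends on the interpreter version and machine.

```python
import math
import timeit
from math import sqrt


def module_lookup(n=1000):
    # Each iteration looks up the global `math`, then its attribute `sqrt`.
    total = 0.0
    for k in range(n):
        total += math.sqrt(k)
    return total


def global_name(n=1000):
    # Each iteration looks up only the global name `sqrt`.
    total = 0.0
    for k in range(n):
        total += sqrt(k)
    return total


def local_name(n=1000):
    # The attribute lookup happens once; the loop uses a local variable.
    local_sqrt = math.sqrt
    total = 0.0
    for k in range(n):
        total += local_sqrt(k)
    return total


if __name__ == "__main__":
    for func in (module_lookup, global_name, local_name):
        elapsed = timeit.timeit(func, number=200)
        print("%-14s %.4f s" % (func.__name__, elapsed))
```

All three functions compute the same sum, so any difference in the timings comes from how the name sqrt is resolved inside the loop.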


first() doesn't save anything, since the module must still be accessed in order to import the name.

Also, you don't give your timing methodology, but given the function names it seems that first() performs the initial import, which always takes longer than subsequent imports since the module must be compiled and executed.
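To see the difference between an initial import and a repeated one, something like the following could be used. It is a minimal sketch for Python 3; the helper timed_import is made up for illustration, and colorsys is just an arbitrary standard-library module that is unlikely to be loaded already. If it was already in sys.modules, both measurements are cached lookups.

```python
import importlib
import sys
import time


def timed_import(name):
    """Import the module called name and return (module, seconds taken)."""
    start = time.perf_counter()
    module = importlib.import_module(name)
    return module, time.perf_counter() - start


if __name__ == "__main__":
    name = "colorsys"  # arbitrary choice for the demonstration
    print("already cached:", name in sys.modules)

    first_module, first_time = timed_import(name)
    again_module, again_time = timed_import(name)

    print("first import:  %.6f s" % first_time)
    print("second import: %.6f s" % again_time)
    # The repeated import returns the object stored in sys.modules.
    print("same object:", first_module is again_module)
```

A repeated import is still not free, which is why it shows up when the import statement sits inside a function that is called a million times.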
