Iterating vs List Concatenation
So there are two ways to take a list and add the members of a second list to the first. You can use list concatenation or you can iterate over it. You can:
for obj in list2:
    list1.append(obj)
or you can:
list1 = list1 + list2
or
list1 += list2
My question is: which is faster, and why? I tested this using two extremely large lists (upwards of 10000 objects) and it seemed the iterating method was a lot faster than the list concatenation (as in l1 = l1 + l2). Why is this? Can someone explain?
append adds each item one at a time, which is the cause of its slowness, as well as the repeated function calls to append.
However, in this case the += operator is not syntactic sugar for +. The += operator does not create a new list and then assign it back; it modifies the left-hand operand in place. This is pretty apparent when using timeit to run each statement 10,000 times.
>>> timeit.timeit(stmt="l = l + j", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)
0.5794978141784668
>>> timeit.timeit(stmt="l += j", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)
0.0013298988342285156
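+= is much faster (over 400x in this run). Part of the gap comes from the test itself: l keeps growing across the 10,000 iterations, and l = l + j copies the whole list every time.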
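The in-place behaviour can be checked without any timing at all, by looking at object identity. This is only a minimal sketch; the names a, b, alias and alias_b are made up for illustration.

```python
# Two names bound to the same list object.
a = [1, 2, 3]
alias = a

a += [4, 5]  # in place: mutates the object that `alias` also refers to
iadd_same_object = a is alias      # the name still points at the original object
alias_after_iadd = list(alias)     # the alias sees the new items

b = [1, 2, 3]
alias_b = b

b = b + [4, 5]  # builds a new list and rebinds `b`; `alias_b` is untouched
plus_same_object = b is alias_b    # `b` now refers to a different object
alias_after_plus = list(alias_b)   # the alias still holds the old contents

print(iadd_same_object, alias_after_iadd)
print(plus_same_object, alias_after_plus)
```

So + always pays for a full copy of the left-hand list, while += only pays for the new items (plus an occasional resize).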
You also have the extend method for lists, which can append any iterable (not just another list), with something like l.extend(l2):
>>> timeit.timeit(stmt="l.extend(j)", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)
0.0016009807586669922
>>> timeit.timeit(stmt="for e in j: l.append(e)", setup="l=[1,2,3,4]; j = [5,6,7,8]", number=10000)
0.00805807113647461
extend is logically equivalent to appending each item, but much faster, as you can see (about 5x here).
So to explain this:
Iterating is faster than + in these tests because + has to construct an entire new list on every run.
extend is faster than iteration because it's a built-in list method implemented in C. It is logically equivalent to appending repeatedly, but implemented differently.
+= performs about the same as extend: both modify the list in place, can work out how much larger the list has to be when the right-hand side has a known length, and avoid repeated function calls. The small edge += shows above most likely comes from skipping the method lookup and call. Like extend, += accepts any iterable, not only a list or tuple.
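One practical difference between the operators is what they accept on the right-hand side. A short sketch, with illustrative names (base, plus_error):

```python
base = [1, 2]

# extend and += both accept any iterable.
base.extend((3, 4))            # a tuple
base += (n for n in (5, 6))    # a generator
base += range(7, 9)            # a range object

# + requires a list on both sides and raises TypeError otherwise.
try:
    combined = [1, 2] + (3, 4)
    plus_error = None
except TypeError as exc:
    plus_error = type(exc).__name__

print(base)
print(plus_error)
```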
I ran the following code
import time

l1 = list(range(0, 100000))
l2 = list(range(0, 100000))

def t1():
    # Append every item of l1 to l2, one method call per item.
    starttime = time.monotonic()
    for item in l1:
        l2.append(item)
    print(time.monotonic() - starttime)

t1()

l1 = list(range(0, 100000))
l2 = list(range(0, 100000))

def t2():
    # Extend l1 with l2 in place; `global` is needed because += rebinds l1.
    global l1
    starttime = time.monotonic()
    l1 += l2
    print(time.monotonic() - starttime)

t2()
and got this, which says that adding lists (+=) is faster.
0.016047026962041855
0.0019438499584794044
You're measuring wrong; iterating and calling append multiple times is way slower than doing it in one call, since the overhead of the many function calls (at least in CPython) dwarfs anything that has to do with the actual list operation, as shown here with CPython 2.7.5 on Linux x64:
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'for e in y:x.append(e)'
100 loops, best of 3: 2.56 msec per loop
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'x = x + y'
100 loops, best of 3: 8.98 msec per loop
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'x += y'
10000 loops, best of 3: 105 usec per loop
$ python -m timeit -s 'x = range(10000);y = range(10000)' 'x.extend(y)'
10000 loops, best of 3: 107 usec per loop
Note that x = x + y creates a second copy of the list (at least in CPython). x.extend(y) and its cousin x += y do the same thing as calling append multiple times, just without the overhead of actually calling a Python method.
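If you want to repeat the comparison on Python 3, something like the following should do. It is only a sketch: the helper names (with_append, with_plus, with_iadd, with_extend, bench) and the sizes are my own choices, and the numbers will vary by machine and interpreter version. Unlike the command lines above, it starts from a fresh list on every call, so the list does not keep growing between runs.

```python
import timeit


def with_append(x, y):
    # One Python-level method call per item.
    for e in y:
        x.append(e)
    return x


def with_plus(x, y):
    # Builds a brand-new list containing both inputs.
    x = x + y
    return x


def with_iadd(x, y):
    # Extends x in place.
    x += y
    return x


def with_extend(x, y):
    # Extends x in place via the method call.
    x.extend(y)
    return x


def bench(func, size=10_000, number=50):
    """Return average seconds per call, starting from a fresh list each time.

    Building the fresh list is included in every measurement, so it adds
    the same constant cost to all four approaches.
    """
    y = list(range(size))

    def run():
        func(list(range(size)), y)

    return timeit.timeit(run, number=number) / number


results = {
    f.__name__: bench(f)
    for f in (with_append, with_plus, with_iadd, with_extend)
}
for name, seconds in results.items():
    print(f"{name:12s} {seconds * 1e6:10.1f} usec per call")
```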