PyPy: Severe performance penalty when using None in a list with integers

2018-07-04 16:57:52

Because an algorithm I want to implement uses indices 1..n and because it's very error prone to shift every index by one, I decided to get smart and inserted a dummy element in the beginning of every list, so I can use original formulas from the paper.

For the sake of shortness, consider this toy example:

def calc(N):
    nums=[0]+range(1,N+1)
    return sum(nums[1:]) #skip first element

However, I got worried, that my results are spurious, because I could access the 0-th element by accident somewhere and not be aware of it. So I got even smarter and used None instead of 0 as the first element - every arithmetical operation with it would result in a runtime error:

def calc_safe(N):
    nums=[None]+range(1,N+1) #here we use "None"
    return sum(nums[1:])

Surprisingly, this small change led to a huge performance penalty for pypy (even with the current 5.8-version) - the code became around 10 times slower! Here are the times on my machine:

                    pypy-5.8    cpython
calc(10**8)         0.5 sec     5.5 sec
calc_safe(10**8)    7.5 sec     5.5 sec

As a side node: Cpython does not care, whether None is used or not.

So my question is twofold:

Obviously using None is not a good idea, but why?

Is it possible to get the safety of the None -approach and keep the performance?

Edit: As Armin has explained, not all lists are equal, and we can see, which strategy is used via:

import __pypy__ 
print __pypy__.strategy(nums)

In the first case it is IntegerListStrategy and in the second ObjectListStrategy . The same would happen if we used a large integer value (like 2**100 ) instead of None .

PyPy has got a special case for lists containing only integers---it stores them like an array.array . If there is a None in it, then this optimization no longer works.

This could probably be fixed inside PyPy to allow None as a special case...

链接地址: http://www.djcxy.com/p/96634.html

上一篇: 在什么情况下，PHP 7的自我指向基类？

下一篇: PyPy：在整数列表中使用None时性能会受到严重影响