Numpy array access optimization
So, I have a large NumPy array (over 10 million elements, or even more), and a loop that accesses it. In each iteration, the loop goes through a list of neighbouring indexes, retrieving values from the array. Currently, I take a slice of the large array and then retrieve the values from that slice.
For example, if the loop needs to access [1000, 1000], [1001, 1000], [1002, 999], it will take the slice array[1000:1003, 999:1001] and then access the elements of the slice.
Is this lowering the performance of the loop, or increasing it (in theory)?
From what I remember, slicing copies the sliced part of the list to a new place in memory. I'm not sure, but I'm almost sure that this operation is O(1). But direct accessing, like

container = []
for i in range(a, b):
    container.append(l[i])

is usually faster.
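(For reference: NumPy behaves differently from Python lists here. Slicing a NumPy array is a constant-time operation that returns a view, not a copy. A minimal sketch, with a made-up array shape:

```python
import numpy as np

arr = np.zeros((2000, 2000))

# Slicing a NumPy array returns a view: no data is copied,
# so taking the slice is O(1) regardless of its size.
arr1 = arr[1000:1003, 999:1001]
print(arr1.base is arr)   # True: arr1 shares arr's data buffer

# Writing through the view changes the original array.
arr1[0, 1] = 42.0
print(arr[1000, 1000])    # 42.0
```

Slicing a Python list, by contrast, does copy the selected elements, which is O(k) in the slice length.)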
arr1 = arr[1000:1003, 999:1001] is a view of arr. That means it's a new array object, with its own shape and strides, but it shares the data buffer with arr. (I could get into more details on how it 'shares', but I don't think that matters here.)

arr[1000, 1000], arr[1001, 1000], arr[1002, 999] are individual elements of arr. arr1[0, 1], arr1[1, 1], arr1[2, 0] reference the same elements (if I've done the math right). My educated guess is that access time will be the same.
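The index arithmetic can be double-checked directly (a minimal sketch; the array contents and shape are made up, chosen so every element has a distinct value):

```python
import numpy as np

# Fill the array with distinct values so index mix-ups are visible.
arr = np.arange(2000 * 2000, dtype=np.int64).reshape(2000, 2000)
arr1 = arr[1000:1003, 999:1001]

# Each index into the view is the original index minus the slice start.
print(arr1[0, 1] == arr[1000, 1000])  # True
print(arr1[1, 1] == arr[1001, 1000])  # True
print(arr1[2, 0] == arr[1002, 999])   # True
```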
Those three elements could be fetched at once, with one copy, with

arr2 = arr[[1000, 1001, 1002], [1000, 1000, 999]]
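This is integer-array ('fancy') indexing: the row and column lists are paired element-wise, and the result is a new 1-D array holding copies of exactly those three elements. A short sketch (the array shape and contents are illustrative):

```python
import numpy as np

arr = np.arange(2000 * 2000, dtype=np.int64).reshape(2000, 2000)

# Row indices and column indices are paired element-wise:
# (1000, 1000), (1001, 1000), (1002, 999).
arr2 = arr[[1000, 1001, 1002], [1000, 1000, 999]]
print(arr2.shape)  # (3,)

# Unlike a slice, this is a copy: the three values match, but
# modifying arr2 would not touch arr.
print(arr2[0] == arr[1000, 1000])  # True
```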
I expect that

for x in arr2:
    <do something with x>

will be faster than

for i, j in [(1000, 1000), (1001, 1000), (1002, 999)]:
    x = arr[i, j]
    <do something with x>

But it is likely that the 'do something' time will exceed the indexing time.
But I'd encourage you to set up a test case, and try the alternatives. See for yourself what makes a difference.
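Such a test could look like this (a sketch only: the array shape, index list, and repetition count are made up, and absolute timings will vary by machine):

```python
import timeit
import numpy as np

arr = np.zeros((4000, 4000))
idxs = [(1000, 1000), (1001, 1000), (1002, 999)]

def direct():
    # Index each element of the big array directly.
    return [arr[i, j] for i, j in idxs]

def via_slice():
    # Take a view first, then index the view.
    arr1 = arr[1000:1003, 999:1001]
    return [arr1[0, 1], arr1[1, 1], arr1[2, 0]]

def fancy():
    # Fetch all three elements in one fancy-indexing call.
    return arr[[1000, 1001, 1002], [1000, 1000, 999]]

for f in (direct, via_slice, fancy):
    print(f.__name__, timeit.timeit(f, number=100_000))
```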