Numpy array access optimization
So, I have a large NumPy array (over 10 million elements, or even more), and a loop that accesses it. In each iteration, the loop goes through a list of neighbouring indexes, retrieving values from the array. Currently, I take a slice of the large array and then retrieve the values from that slice.
For example, if the loop needs to access [1000, 1000], [1001, 1000], [1002, 999], it will take the slice array[1000:1003, 999:1001] and then access the elements of the slice.
Is this lowering the performance of the loop, or increasing it (in theory)?
From what I remember, slicing copies the sliced part of the list to a new place in memory. I'm not sure, but I'm almost sure that this operation is O(1). But direct accessing, like

container = []
for i in range(a, b):
    container.append(l[i])

is usually faster.
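(For reference: NumPy behaves differently from Python lists here. Slicing a NumPy array is a constant-time operation that returns a view, not a copy. A minimal sketch, with a made-up array shape:

```python
import numpy as np

arr = np.zeros((2000, 2000))

# Slicing a NumPy array returns a view: no data is copied,
# so taking the slice is O(1) regardless of its size.
arr1 = arr[1000:1003, 999:1001]
print(arr1.base is arr)   # True: arr1 shares arr's data buffer

# Writing through the view changes the original array.
arr1[0, 1] = 42.0
print(arr[1000, 1000])    # 42.0
```

Slicing a Python list, by contrast, does copy the selected elements, which is O(k) in the slice length.)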
arr1 = arr[1000:1003, 999:1001] is a view of arr. That means it's a new array object, with its own shape and strides, but it shares the data buffer with arr. (I could get into more details on how it 'shares', but I don't think that matters here.)

arr[1000, 1000], arr[1001, 1000], arr[1002, 999] are individual elements of arr. arr1[0, 1], arr1[1, 1], arr1[2, 0] reference the same elements (if I've done the math right). My educated guess is that access time will be the same.
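The index arithmetic can be double-checked directly (a minimal sketch; the array contents and shape are made up, chosen so every element has a distinct value):

```python
import numpy as np

# Fill the array with distinct values so index mix-ups are visible.
arr = np.arange(2000 * 2000, dtype=np.int64).reshape(2000, 2000)
arr1 = arr[1000:1003, 999:1001]

# Each index into the view is the original index minus the slice start.
print(arr1[0, 1] == arr[1000, 1000])  # True
print(arr1[1, 1] == arr[1001, 1000])  # True
print(arr1[2, 0] == arr[1002, 999])   # True
```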
Those three elements could be fetched at once, with one copy, with

arr2 = arr[[1000, 1001, 1002], [1000, 1000, 999]]
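This is integer-array ('fancy') indexing: the row and column lists are paired element-wise, and the result is a new 1-D array holding copies of exactly those three elements. A short sketch (the array shape and contents are illustrative):

```python
import numpy as np

arr = np.arange(2000 * 2000, dtype=np.int64).reshape(2000, 2000)

# Row indices and column indices are paired element-wise:
# (1000, 1000), (1001, 1000), (1002, 999).
arr2 = arr[[1000, 1001, 1002], [1000, 1000, 999]]
print(arr2.shape)  # (3,)

# Unlike a slice, this is a copy: the three values match, but
# modifying arr2 would not touch arr.
print(arr2[0] == arr[1000, 1000])  # True
```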
I expect that

for x in arr2:
    <do something with x>

will be faster than

for i, j in [(1000, 1000), (1001, 1000), (1002, 999)]:
    x = arr[i, j]
    <do something with x>

But it is likely that the 'do something' time will exceed the indexing time.
But I'd encourage you to set up a test case, and try the alternatives. See for yourself what makes a difference.
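Such a test could look like this (a sketch only: the array shape, index list, and repetition count are made up, and absolute timings will vary by machine):

```python
import timeit
import numpy as np

arr = np.zeros((4000, 4000))
idxs = [(1000, 1000), (1001, 1000), (1002, 999)]

def direct():
    # Index each element of the big array directly.
    return [arr[i, j] for i, j in idxs]

def via_slice():
    # Take a view first, then index the view.
    arr1 = arr[1000:1003, 999:1001]
    return [arr1[0, 1], arr1[1, 1], arr1[2, 0]]

def fancy():
    # Fetch all three elements in one fancy-indexing call.
    return arr[[1000, 1001, 1002], [1000, 1000, 999]]

for f in (direct, via_slice, fancy):
    print(f.__name__, timeit.timeit(f, number=100_000))
```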