How do I get the indices of the N maximum values in a NumPy array?

NumPy provides a way to get the index of the maximum value of an array via np.argmax.

I would like a similar thing, but returning the indexes of the N maximum values.

For instance, if I have an array [1, 3, 2, 4, 5], then function(array, n=3) would return [4, 3, 1], the indices of the values 5, 4, and 3.

Thanks :)


The simplest I've been able to come up with is:

In [1]: import numpy as np

In [2]: arr = np.array([1, 3, 2, 4, 5])

In [3]: arr.argsort()[-3:][::-1]
Out[3]: array([4, 3, 1])

This involves a complete sort of the array, which costs O(n log n) no matter how few indices you actually need. I wonder if NumPy provides a built-in way to do a partial sort; so far I haven't been able to find one.

If this solution turns out to be too slow (especially for small n), it may be worth looking at coding something up in Cython.
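For reuse, the one-liner above can be wrapped in a small helper. This is only a sketch: the name top_n_argsort is made up for illustration and is not part of NumPy, and it assumes a 1-D input. It reverses the sorted indices first and then slices, which avoids the surprise that arr.argsort()[-0:] returns every index when n is 0.

```python
import numpy as np

def top_n_argsort(arr, n):
    """Return the indices of the n largest values, largest first.

    Hypothetical helper (not a NumPy function); assumes a 1-D input.
    Performs a full O(len(arr) * log(len(arr))) sort.
    """
    arr = np.asarray(arr)
    # argsort is ascending, so reverse it and keep the first n indices.
    return np.argsort(arr)[::-1][:n]

print(top_n_argsort([1, 3, 2, 4, 5], 3))  # [4 3 1]
```

When the array contains equal values, which of the tied indices comes first depends on the sort, so do not rely on a particular order among ties.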


NumPy 1.8 and later have a function called argpartition for this. To get the indices of the four largest elements, do:

>>> import numpy as np
>>> a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
>>> ind = np.argpartition(a, -4)[-4:]
>>> ind
array([1, 5, 8, 0])
>>> a[ind]
array([4, 9, 6, 9])

Unlike argsort, this function runs in linear time in the worst case, but the returned indices are not sorted, as can be seen from the result of evaluating a[ind]. If you need them sorted as well, sort them afterwards (this gives ascending order of value):

>>> ind[np.argsort(a[ind])]
array([1, 8, 5, 0])

Getting the top-k elements in sorted order this way takes O(n + k log k) time.
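The partition-then-sort steps above can be combined into one function that returns the indices largest-first, matching the output format asked for in the question. This is a minimal sketch assuming a 1-D array; top_n_argpartition is a hypothetical name, and the clamping of n to the array size is an added convenience, not something argpartition does for you.

```python
import numpy as np

def top_n_argpartition(arr, n):
    """Return the indices of the n largest values, largest first.

    Hypothetical helper (not a NumPy function); assumes a 1-D input.
    """
    arr = np.asarray(arr)
    if n <= 0:
        return np.array([], dtype=np.intp)
    n = min(n, arr.size)  # avoid an out-of-bounds kth in argpartition
    # After partitioning, the last n positions hold the indices of the
    # n largest values, in no particular order.
    ind = np.argpartition(arr, -n)[-n:]
    # Sort only those n candidates by value, then reverse for descending order.
    return ind[np.argsort(arr[ind])[::-1]]

a = np.array([9, 4, 4, 3, 3, 9, 0, 4, 6, 0])
print(a[top_n_argpartition(a, 4)])  # [9 9 6 4]
```

The example prints the selected values rather than the indices because a contains ties (three 4s, two 9s), and which of the tied indices is returned is not guaranteed.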


EDIT: Modified to include Ashwini Chaudhary's improvement.

>>> import heapq
>>> import numpy
>>> a = numpy.array([1, 3, 2, 4, 5])
>>> heapq.nlargest(3, range(len(a)), a.take)
[4, 3, 1]

For regular Python lists:

>>> a = [1, 3, 2, 4, 5]
>>> heapq.nlargest(3, range(len(a)), a.__getitem__)
[4, 3, 1]

If you use Python 2, use xrange instead of range.

Source: http://docs.python.org/3/library/heapq.html
