Averaging over subsets of an array in NumPy
I have a numpy array of shape (10, 10, 10, 60). The dimensions could be arbitrary; this is just an example.
I want to reduce this to an array of shape (10, 10, 10, 20) by taking the mean over some subsets. I have two scenarios:
1: Take the mean of every (10, 10, 10, 20) block, i.e. split the array into three (10, 10, 10, 20) blocks and take the mean across the three. This can be done with: m = np.mean((x[..., :20], x[..., 20:40], x[..., 40:60]), axis=0). Note that axis=0 is needed here, because the tuple of three slices is stacked along a new leading axis. My question is how I can generate this when the last dimension is arbitrary, without writing an explicit loop. I can do something like:
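import numpy as np

x = np.random.rand(10, 10, 10, 60)
result = np.zeros((10, 10, 10, 20))
offset = 20  # width of each block along the last axis
loops = x.shape[3] // offset  # number of blocks
for i in range(loops):
    index = i * offset
    result += x[..., index:index + offset]  # accumulate each block
result = result / loops  # divide the sum by the number of blocks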
However, this does not seem very pythonic, and I was wondering if there is a more elegant way to do this.
2: Another scenario is that I want to break it down into 10 arrays of shape (10, 10, 10, 2, 3), take the mean along the 5th dimension for these ten arrays, and then reshape the result to a (10, 10, 10, 20) array as originally planned. I can reshape the array, take the average as done previously, and reshape again, but that second part seems quite inelegant.
You could reshape, splitting the last axis into two so that the first of the two new axes has a length equal to the number of blocks, and then take the mean along the second-to-last axis -
m,n,r = x.shape[:3]
out = x.reshape(m,n,r,3,-1).mean(axis=-2) # 3 is no. of blocks
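The reshape above can be wrapped so that it works for any number of leading dimensions and any block count. The following is only a minimal sketch: the helper name block_mean and its n_blocks parameter are made up for illustration, and it assumes the last axis divides evenly into the requested number of blocks.

```python
import numpy as np

def block_mean(x, n_blocks):
    """Mean over n_blocks equal, contiguous blocks of the last axis (hypothetical helper)."""
    last = x.shape[-1]
    if last % n_blocks != 0:
        raise ValueError("last axis length must be divisible by n_blocks")
    # (..., last) -> (..., n_blocks, last // n_blocks), then average over the block axis
    return x.reshape(x.shape[:-1] + (n_blocks, -1)).mean(axis=-2)

x = np.random.rand(10, 10, 10, 60)
out = block_mean(x, 3)

# Reference: the three explicit slices, stacked on a new leading axis and averaged
ref = np.mean((x[..., :20], x[..., 20:40], x[..., 40:60]), axis=0)
print(out.shape, np.allclose(out, ref))
```

Building the target shape from x.shape[:-1] avoids unpacking m, n, r by hand, so the same call should also work for arrays with fewer or more leading axes.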
Alternatively, we could use np.einsum for a noticeable performance boost -
In [200]: x = np.random.rand(10, 10, 10, 60)
In [201]: %timeit x.reshape(m,n,r,3,-1).mean(axis=-2)
1000 loops, best of 3: 430 µs per loop
In [202]: %timeit np.einsum('ijklm->ijkm',x.reshape(m,n,r,3,-1))/3.0
1000 loops, best of 3: 214 µs per loop
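The second scenario from the question can probably be handled with the same idea, by putting the group axis last instead of second to last. The sketch below assumes that scenario 2 means averaging every 3 consecutive entries of the last axis, which is one reading of the question rather than something it states outright; the variable names group and grouped are illustrative.

```python
import numpy as np

x = np.random.rand(10, 10, 10, 60)
group = 3  # assumed: number of consecutive entries averaged together

# (..., 60) -> (..., 20, 3): the new last axis holds one group of 3 neighbours
grouped = x.reshape(x.shape[:-1] + (-1, group))

out_mean = grouped.mean(axis=-1)                          # plain mean over each group
out_einsum = np.einsum('ijklm->ijkl', grouped) / group    # same reduction via einsum

print(out_mean.shape, np.allclose(out_mean, out_einsum))
```

No final reshape is needed under this reading, since averaging over the trailing axis already leaves a (10, 10, 10, 20) array.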