Accessing chunks at once in a NumPy array
Given a NumPy array:
import numpy as np
arr = np.array([0,1,2,3,4,5,6,7,8,9,10,11,12])
I would like to know how to access chunks of a chosen size with a chosen separation between them, both concatenated and as slices:
E.g., obtain chunks of size 3 separated by two values:
arr_chunk_3_sep_2 = np.array([0,1,2,5,6,7,10,11,12])
arr_chunk_3_sep_2_in_slices = np.array([[0,1,2],[5,6,7],[10,11,12]])
What is the most efficient way to do this? If possible, I would like to avoid copying or creating new objects. Maybe memoryviews could help here?
Approach 1
Here's one with masking -
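def slice_grps(a, chunk, sep):
    # Each period holds `chunk` kept values followed by `sep` skipped values
    N = chunk + sep
    # Keep the positions whose offset within the period is below `chunk`
    return a[np.arange(len(a)) % N < chunk]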
Sample run -
In [223]: arr
Out[223]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
In [224]: slice_grps(arr, chunk=3, sep=2)
Out[224]: array([ 0, 1, 2, 5, 6, 7, 10, 11, 12])
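Since the question asks about avoiding copies, it may help to check what the masking approach returns. The sketch below repeats `slice_grps` so it runs on its own and assumes the same example array. Boolean indexing in NumPy produces a new array rather than a view, and the concatenated result can be reshaped into slices whenever its length is a multiple of `chunk`. The names `flat`, `shares` and `rows` are only illustrative.

```python
import numpy as np

def slice_grps(a, chunk, sep):
    N = chunk + sep
    return a[np.arange(len(a)) % N < chunk]

arr = np.arange(13)

# Concatenated chunks via the boolean mask
flat = slice_grps(arr, chunk=3, sep=2)

# Boolean (advanced) indexing returns a copy, so no memory is shared
shares = np.shares_memory(arr, flat)
print(shares)

# Reshape into one row per chunk; this only works when
# flat.size is a multiple of the chunk size
rows = flat.reshape(-1, 3)
print(rows)
```

So this approach is simple and handles a short trailing chunk gracefully, but it does allocate a new array.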
Approach 2
If the input array is such that the last chunk has enough runway (that is, the final block of n elements holds at least m of them), we can leverage np.lib.stride_tricks.as_strided, inspired by this post, to select m elements off each block of n elements -
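# https://stackoverflow.com/a/51640641/ @Divakar
def skipped_view(a, m, n):
    s = a.strides[0]
    strided = np.lib.stride_tricks.as_strided
    # One row per block of n elements (rounded up), each row n elements wide
    shp = ((a.size + n - 1) // n, n)
    # Read-only strided view; keep the first m columns of each row
    return strided(a, shape=shp, strides=(n * s, s), writeable=False)[:, :m]

out = skipped_view(arr, chunk, chunk + sep)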
Note that the output is a view into the input array, so it incurs no extra memory overhead and is virtually free!
Sample run to make things clear -
In [255]: arr
Out[255]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
In [256]: chunk = 3
In [257]: sep = 2
In [258]: skipped_view(arr,chunk,chunk+sep)
Out[258]:
array([[ 0, 1, 2],
[ 5, 6, 7],
[10, 11, 12]])
# Let's prove that the output is a view indeed
In [259]: np.shares_memory(arr, skipped_view(arr,chunk,chunk+sep))
Out[259]: True
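When the last chunk does not have enough runway, one possible variation is to build the view only over complete chunks, so the strides never reach past the end of the buffer. The helper below, `safe_skipped_view`, is a hypothetical name and not part of the original answer; it assumes a 1-D input and that silently dropping a trailing partial chunk is acceptable. The sketch also shows that flattening the 2-D view into the concatenated form requires a copy, because the selected elements are not contiguous in memory.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def safe_skipped_view(a, m, n):
    """Hypothetical helper: read-only view of the first m of every n
    elements, restricted to chunks that fit entirely inside `a`."""
    if a.ndim != 1 or not (0 < m <= n):
        raise ValueError("expected a 1-D array and 0 < m <= n")
    s = a.strides[0]
    # Number of chunks of length m, starting every n elements,
    # that lie fully within the array
    rows = (a.size - m) // n + 1 if a.size >= m else 0
    return as_strided(a, shape=(rows, m), strides=(n * s, s), writeable=False)

arr = np.arange(13)

# Complete case: same result as the sample run above
full = safe_skipped_view(arr, 3, 5)

# One element short: the partial chunk [10, 11] is dropped
short = safe_skipped_view(arr[:12], 3, 5)

# The concatenated form cannot be a view, so ravel() copies
flat = full.ravel()

print(full)
print(short)
print(np.shares_memory(arr, full), np.shares_memory(arr, flat))
```

In short, the sliced (2-D) result can be obtained without copying, while the concatenated (1-D) result needs one allocation whichever approach produces it.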