Assign multiple values to multiple slices of a numpy array at once
I have a numpy array, a list of start/end indexes that define ranges within the array, and a list of values, where the number of values is the same as the number of ranges. Doing this assignment in a loop is currently very slow, so I'd like to assign the values to the corresponding ranges in the array in a vectorized way. Is this possible to do?
Here's a concrete, simplified example:
a = np.zeros([10])
Here's a list of start indexes and a list of end indexes that define ranges within a, like this:
starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]
And here's a list of values I'd like to assign to each range:
values = [1, 2, 3, 4]
I have two problems. The first is that I can't figure out how to index into the array using multiple slices at the same time, since the list of ranges is constructed dynamically in the actual code. Once I'm able to extract the ranges, I'm not sure how to assign multiple values at once - one value per range.
Here's how I've tried creating a list of slices and the problems I've run into when using that list to index into the array:
slices = [slice(start, end) for start, end in zip(starts, ends)]
In [97]: a[slices]
...
IndexError: too many indices for array
In [98]: a[np.r_[slices]]
...
IndexError: arrays used as indices must be of integer (or boolean) type
If I use a static list, I can extract multiple slices at once, but then assignment doesn't work the way I want:
In [106]: a[np.r_[0:2, 2:4, 4:6, 6:8]] = [1, 2, 3]
/usr/local/bin/ipython:1: DeprecationWarning: assignment will raise an error in the future, most likely because your index result shape does not match the value array shape. You can use `arr.flat[index] = values` to keep the old behaviour.
#!/usr/local/opt/python/bin/python2.7
In [107]: a
Out[107]: array([ 1., 2., 3., 1., 2., 3., 1., 2., 0., 0.])
What I really want is this:
np.array([1., 1., 2., 2., 3., 3., 4., 4., 0., 0.])
This will do the trick in a fully vectorized manner:
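starts, ends = np.asarray(starts), np.asarray(ends)  # plain lists do not support element-wise subtraction
counts = ends - starts  # length of each range
idx = np.ones(counts.sum(), dtype=int)  # np.int was removed in NumPy 1.24; the built-in int works
idx[np.cumsum(counts)[:-1]] -= counts[:-1]  # reset the running offset where each new range begins
idx = np.cumsum(idx) - 1 + np.repeat(starts, counts)  # offset within the range plus the range start
a[idx] = np.repeat(values, counts)  # one value per index, repeated to match each range's length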
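For anyone who wants to run this end to end, here is a minimal self-contained sketch using the example data from the question. The helper name `assign_ranges` is hypothetical (it is not a NumPy function), and the sketch assumes every range is non-empty and that `starts` and `ends` convert cleanly to integer arrays. It also cross-checks the result against the plain loop, and shows that `np.r_` accepts a tuple of slices, which may be a simpler way to build the index when performance is not critical.

```python
import numpy as np


def assign_ranges(a, starts, ends, values):
    """Assign values[i] to a[starts[i]:ends[i]] without a Python loop.

    Hypothetical helper for illustration. Assumes every range is
    non-empty (ends[i] > starts[i]). Modifies `a` in place and returns it.
    """
    starts = np.asarray(starts)
    ends = np.asarray(ends)
    counts = ends - starts  # length of each range

    # Start with all ones, then subtract the previous range's length at the
    # position where each new range begins. After the cumulative sum and the
    # "- 1", this gives 0, 1, ..., counts[i] - 1 for every range.
    idx = np.ones(counts.sum(), dtype=int)
    idx[np.cumsum(counts)[:-1]] -= counts[:-1]
    idx = np.cumsum(idx) - 1 + np.repeat(starts, counts)

    a[idx] = np.repeat(values, counts)
    return a


starts = [0, 2, 4, 6]
ends = [2, 4, 6, 8]
values = [1, 2, 3, 4]

result = assign_ranges(np.zeros(10), starts, ends, values)
print(result)  # [1. 1. 2. 2. 3. 3. 4. 4. 0. 0.]

# Reference implementation: the straightforward loop from the question.
expected = np.zeros(10)
for s, e, v in zip(starts, ends, values):
    expected[s:e] = v
assert np.array_equal(result, expected)

# Alternative index construction: np.r_ takes a tuple of slices, so a
# dynamically built list only needs to be converted with tuple().
slices = [slice(s, e) for s, e in zip(starts, ends)]
idx_alt = np.r_[tuple(slices)]
b = np.zeros(10)
b[idx_alt] = np.repeat(values, np.asarray(ends) - np.asarray(starts))
assert np.array_equal(b, expected)
```

The `np.r_[tuple(slices)]` variant still loops over the slices internally, so it is likely slower than the cumulative-sum construction when there are many ranges, but it is easier to read.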