Fast way to fill a matrix from np.array of row indices, column indices, and max(values)

Problem description:

I have quite large arrays (about 5e6 elements) that I use to fill a matrix. I know the fast way to fill it is something like

(simplified example)

import numpy as np

bbb = np.array([1, 2, 3, 4, 1])  # row
ccc = np.array([0, 1, 2, 1, 0])  # column
ddd = np.array([55.5, 22.2, 33.3, 44.4, 11.1])  # values

experiment = np.zeros(shape=(5, 3))
experiment[bbb, ccc] = ddd  # filling
>[[  0.    0.    0. ]
 [ 11.1   0.    0. ]
 [  0.   22.2   0. ]
 [  0.    0.   33.3]
 [  0.   44.4   0. ]]

But what if I want the maximum of ddd for each position instead? Something like this at the # filling step:

#pseudocode
experiment[bbb, ccc] = [ddd if ddd > experiment[bbb, ccc]]

The matrix should then be

>[[  0.    0.    0. ]
 [ 55.5   0.    0. ]
 [  0.   22.2   0. ]
 [  0.    0.   33.3]
 [  0.   44.4   0. ]]
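
Spelled out as an explicit loop, the behaviour I am after would look roughly like the sketch below. The function name fill_max_loop is made up for illustration, and a Python loop like this is far too slow for 5e6 elements; it only pins down the intended result.

```python
import numpy as np

def fill_max_loop(rows, cols, values, shape):
    """Reference sketch: keep the largest value seen at each (row, col)."""
    out = np.zeros(shape)
    for r, c, v in zip(rows, cols, values):
        # Overwrite only when the new value beats what is already stored.
        if v > out[r, c]:
            out[r, c] = v
    return out

bbb = np.array([1, 2, 3, 4, 1])  # row
ccc = np.array([0, 1, 2, 1, 0])  # column
ddd = np.array([55.5, 22.2, 33.3, 44.4, 11.1])  # values

result = fill_max_loop(bbb, ccc, ddd, shape=(5, 3))
```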

What is a good, fast way to fill the matrix with the maximum values from the np.array here?

You can use np.ufunc.at on np.maximum, that is, np.maximum.at.

np.ufunc.at performs the ufunc it is called on "unbuffered and in-place". This means every (row, column) pair given by bbb and ccc will be processed by np.maximum, no matter how often that pair appears.

In your case the pair (1, 0) appears twice, so it will be processed twice, each time picking the maximum of the value currently stored in experiment at that position and the corresponding entry of ddd.
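
To make the "unbuffered" part concrete, here is a small sketch that contrasts it with ordinary fancy-index assignment on the arrays from the question. The variable names buffered and unbuffered are only for illustration. Note that NumPy does not formally guarantee which write wins when an index is repeated in a plain assignment; in the question's output it was the last one.

```python
import numpy as np

bbb = np.array([1, 2, 3, 4, 1])  # row
ccc = np.array([0, 1, 2, 1, 0])  # column
ddd = np.array([55.5, 22.2, 33.3, 44.4, 11.1])  # values

# Buffered: the right-hand side is computed once from the old contents
# (all zeros) and then written back, so the repeated pair (1, 0) is simply
# overwritten by whichever of its two values is written last.
buffered = np.zeros((5, 3))
buffered[bbb, ccc] = np.maximum(buffered[bbb, ccc], ddd)

# Unbuffered: each (row, column, value) triple is applied in turn to the
# current contents, so the repeated pair (1, 0) keeps its larger value.
unbuffered = np.zeros((5, 3))
np.maximum.at(unbuffered, (bbb, ccc), ddd)

print(buffered[1, 0])    # one of the two candidates, not necessarily the max
print(unbuffered[1, 0])  # 55.5
```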

experiment = np.zeros(shape=(5, 3))
np.maximum.at(experiment, (bbb, ccc), ddd)  # works in place; index arrays passed as a tuple
experiment
# array([[  0. ,   0. ,   0. ],
#        [ 55.5,   0. ,   0. ],
#        [  0. ,  22.2,   0. ],
#        [  0. ,   0. ,  33.3],
#        [  0. ,  44.4,   0. ]])
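
If np.maximum.at turns out to be a bottleneck for arrays of around 5e6 elements (ufunc.at has been comparatively slow in some NumPy versions), a different technique may be worth timing: sort the entries by target position and take one maximum per group with np.maximum.reduceat. The sketch below is only an illustration under stated assumptions. The function name fill_max_sorted is hypothetical, the target is assumed to be a 2-D matrix that starts out as zeros (as in the question), and no speed-up is claimed without measuring on your own data.

```python
import numpy as np

def fill_max_sorted(rows, cols, values, shape):
    """Sketch: group duplicate positions by sorting, then reduce each group."""
    out = np.zeros(int(np.prod(shape)))
    if len(values) == 0:
        return out.reshape(shape)
    # Map each (row, col) pair to a position in the flattened matrix.
    flat = np.ravel_multi_index((rows, cols), shape)
    order = np.argsort(flat, kind="stable")
    flat_sorted = flat[order]
    vals_sorted = values[order]
    # Index where each run of equal flat positions begins.
    starts = np.flatnonzero(np.r_[True, flat_sorted[1:] != flat_sorted[:-1]])
    group_max = np.maximum.reduceat(vals_sorted, starts)
    targets = flat_sorted[starts]  # unique positions, one per group
    # Keep the zero floor so the result matches np.maximum.at on a zero matrix.
    out[targets] = np.maximum(out[targets], group_max)
    return out.reshape(shape)

bbb = np.array([1, 2, 3, 4, 1])  # row
ccc = np.array([0, 1, 2, 1, 0])  # column
ddd = np.array([55.5, 22.2, 33.3, 44.4, 11.1])  # values

experiment_sorted = fill_max_sorted(bbb, ccc, ddd, shape=(5, 3))
```

Because the positions written in the last step are unique, ordinary (buffered) assignment is safe there.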