Sort a numpy array based on data from another array
I have two arrays, data and result. result contains the same rows as data, but with an extra column and in a different order. I want to rearrange the rows of result so that they are in the same order as the rows of data, with each row's extra last-column value moving along with the rest of its row.
import numpy as np

data = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]])
result = np.array([[0, 1, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0]])

# this is what the final sorted array should look like:
'''
array([[0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0]])
'''
I've tried using argsort on data to get a sort order and then applying that to result, but argsort sorts the elements within each row independently, whereas I want each row of data (all four columns) to be treated as a single unit.
ind = np.argsort(data)
indind = np.argsort(ind)
ind
array([[0, 2, 3, 1],
       [1, 2, 3, 0],
       [0, 3, 1, 2],
       [0, 2, 1, 3]])
What is a good way to do this kind of row-wise sorting?
The numpy_indexed package (disclaimer: I am its author) can be used to solve this kind of problem efficiently and elegantly:
import numpy_indexed as npi
result[npi.indices(result[:, :-1], data)]
npi.indices is essentially a vectorized equivalent of list.index: for each row of data, it returns the position of that same row in result with the last column stripped off (result[:, :-1]). Indexing result with those positions then puts its rows in the order of data.
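To make the lookup concrete, here is a minimal plain-NumPy sketch of the same idea using a broadcast comparison. It is only an illustration of what "find each row of data in result" means, not how numpy_indexed does it internally. The names keys, matches, idx and reordered are mine, and it assumes every row of data occurs in result.

```python
import numpy as np

data = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]])
result = np.array([[0, 1, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0]])

# The rows to look up in: result without its extra last column.
keys = result[:, :-1]

# matches[i, j] is True when row i of data equals row j of keys.
matches = (data[:, None, :] == keys[None, :, :]).all(axis=-1)

# For each row of data, the index of the first matching row in result.
# Assumption: every row of data has a match; argmax would silently
# return 0 for a row with no match at all.
idx = matches.argmax(axis=1)

reordered = result[idx]
print(reordered)
```

This builds a len(data) x len(result) comparison array, so it is fine for small inputs like the one in the question but will get expensive in memory for large ones.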
Note that this solution works for any number of columns and is fully vectorized (i.e., there are no Python loops anywhere).
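If you would rather not add a dependency, a sort-based approach in plain NumPy is also possible. The following is a sketch of a different technique (np.lexsort on whole rows), not the numpy_indexed implementation. It assumes that the rows of data are unique and that each one appears exactly once in result; the variable names are only for illustration.

```python
import numpy as np

data = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]])
result = np.array([[0, 1, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0]])

# np.lexsort treats its LAST key as the primary one, so the columns are
# passed in reverse order to sort rows lexicographically (column 0 first).
order_data = np.lexsort(data.T[::-1])
order_result = np.lexsort(result[:, :-1].T[::-1])

# After sorting, the k-th smallest row of data and the k-th smallest row of
# result[:, :-1] are the same row (assumption: unique rows, same set in both).
# Scatter the positions back so that idx[i] locates data[i] inside result.
idx = np.empty(len(data), dtype=int)
idx[order_data] = order_result

reordered = result[idx]
print(reordered)
```

Unlike argsort on the 2-D array, which sorts within each row, lexsort here returns one permutation of whole rows, which is what the question was after.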