Sorting a NumPy array based on data from another array

Question:

I have two arrays, data and result. result contains the same rows as data, but with an extra column and in a different order. I want to rearrange result so that its rows are in the same order as the rows of data, with the value in the last column moving along with the rest of its row.

import numpy as np

data = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]])
result = np.array([[0, 1, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0]])

# This is what the final sorted array should look like:
'''
array([[0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0]])
'''

I've tried using argsort to recover the ordering of data and then applying it to result, but argsort sorts the elements within each row, whereas I want the sort to treat each row of data (all four columns) as a whole.

ind = np.argsort(data)
indind = np.argsort(ind)
ind
array([[0, 2, 3, 1],
       [1, 2, 3, 0],
       [0, 3, 1, 2],
       [0, 2, 1, 3]])

What is a good way to do this kind of sorting by rows?

Answer:

The numpy_indexed package (disclaimer: I am its author) can be used to solve this kind of problem efficiently and elegantly:

import numpy_indexed as npi
result[npi.indices(result[:, :-1], data)]

npi.indices is essentially a vectorized equivalent of list.index: for each row of data, it returns the position of that same row in result with the last column dropped. Indexing result with those positions then puts its rows in the order of data.
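To make the list.index analogy concrete, here is a minimal sketch of the same lookup written with a plain Python list. It uses the sample arrays from the question; the names keys, idx and reordered are illustrative only, and this is not how numpy_indexed is implemented internally.

```python
import numpy as np

data = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]])
result = np.array([[0, 1, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0]])

# Rows of result without the extra last column, converted to tuples
# so that list.index can compare whole rows at once.
keys = [tuple(row) for row in result[:, :-1].tolist()]

# For each row of data, find the position of the same row in result.
# list.index returns the first match and raises ValueError if a row is missing.
idx = [keys.index(tuple(row)) for row in data.tolist()]

# Reorder result so that its rows follow the order of data.
reordered = result[idx]
print(idx)
print(reordered)
```

This loop is only meant to show what is being computed; it scans the list once per row, so it is much slower than a vectorized lookup on large arrays.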

Note that this solution works for any number of columns and is fully vectorized (i.e., there are no Python loops anywhere).
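If installing an extra package is not an option, a similar loop-free lookup can be sketched with plain NumPy broadcasting. This is an alternative technique, not what numpy_indexed does internally. It assumes every row of data occurs in result, and it builds an intermediate boolean array of shape (len(data), len(result), n_columns), so it is only practical for moderately sized inputs. The names matches, idx and reordered are illustrative.

```python
import numpy as np

data = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 1]])
result = np.array([[0, 1, 1, 0, 1], [1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 1, 0, 1, 0]])

# Compare every row of data with every row of result (last column dropped).
# matches[i, j] is True when data[i] equals result[j, :-1] in all columns.
matches = (data[:, None, :] == result[None, :, :-1]).all(axis=2)

# Check the assumption that each row of data is present in result.
if not matches.any(axis=1).all():
    raise ValueError("some row of data is missing from result")

# argmax returns the index of the first True in each row of matches.
idx = matches.argmax(axis=1)

# Reorder result so that its rows follow the order of data.
reordered = result[idx]
print(reordered)
```

The comparison is independent of the number of columns, since the reduction over axis=2 collapses however many columns the rows have.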
