Searching a NumPy array for multiple values
Question:
I have a 2D NumPy array with duplicate values.
I am searching the array like this:
In [104]: import numpy as np
In [105]: array = np.array
In [106]: a = array([[1, 2, 3],
...: [1, 2, 3],
...: [2, 5, 6],
...: [3, 8, 9],
...: [4, 8, 9],
...: [4, 2, 3],
...: [5, 2, 3]])
In [107]: num_list = [1, 4, 5]
In [108]: for i in num_list :
...: print(a[np.where(a[:,0] == i)])
...:
[[1 2 3]
[1 2 3]]
[[4 8 9]
[4 2 3]]
[[5 2 3]]
The input is a list of numbers to match against the values in column 0. The end result I want is the matching rows in any format (array, list or tuple), for example:
array([[1, 2, 3],
[1, 2, 3],
[4, 8, 9],
[4, 2, 3],
[5, 2, 3]])
My code works fine but doesn't seem Pythonic. Is there a better strategy for searching with multiple values? Something like a[np.where(a[:,0] == l)] (with l being the list of values), where a single lookup retrieves all the values.
My real array is large.
Answer:
Approach 1: using
Approach 2: using np.searchsorted
num_arr = np.sort(num_list) # Sort num_list and get as array
# Get indices of occurrences of first column in num_list
idx = np.searchsorted(num_arr, a[:,0])
# Take care of out of bounds cases
idx[idx==len(num_arr)] = 0
out = a[a[:,0] == num_arr[idx]]
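For anyone who wants to run this end to end, here is a minimal self-contained sketch that wraps the np.searchsorted steps above in a function and compares the result with a boolean mask built by np.isin. The function names rows_by_searchsorted and rows_by_isin are hypothetical, introduced only for this illustration, and np.isin is offered as one possible alternative rather than as the method the answer originally listed first.

```python
import numpy as np

# Sample data taken from the question.
a = np.array([[1, 2, 3],
              [1, 2, 3],
              [2, 5, 6],
              [3, 8, 9],
              [4, 8, 9],
              [4, 2, 3],
              [5, 2, 3]])
num_list = [1, 4, 5]


def rows_by_searchsorted(arr, values):
    """Hypothetical helper: select rows whose first column is in `values`."""
    num_arr = np.sort(values)                   # sorted lookup values as an array
    idx = np.searchsorted(num_arr, arr[:, 0])   # insertion position of each key
    idx[idx == len(num_arr)] = 0                # clamp out-of-bounds positions
    return arr[arr[:, 0] == num_arr[idx]]       # keep rows that match exactly


def rows_by_isin(arr, values):
    """Hypothetical helper: the same selection using a membership mask."""
    return arr[np.isin(arr[:, 0], values)]


out_sorted = rows_by_searchsorted(a, num_list)
out_isin = rows_by_isin(a, num_list)
print(out_isin)
```

Both helpers do the lookup in a single vectorized pass instead of looping over num_list in Python, which should matter for a large array. Rows come back in their original order.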