Comparing a value with neighboring elements in numpy
Let's say I have a numpy array
    a b c
A = i j k
    u v w
I want to compare the value of the central element with some of its eight neighbors (along an axis or along a diagonal). Is there a faster way than nested for loops, which are too slow for a big matrix?
To be more specific, I want to compare the value of each element with its neighbors and assign a new value based on the outcome.
For example:
if (j == 1):
    if (j > i) & (j > k):
        j = 999
    else:
        j = 0
if (j == 2):
    if (j > c) & (j > u):
        j = 999
    else:
        j = 0
...
Something like this.
Your operation involves many conditionals, so in the general case (any kind of conditional, any kind of operation) the most efficient approach is to use loops. This can be done efficiently with numba or cython. In special cases, you can implement it with higher-level functions in numpy/scipy. I'll show a solution for the specific example you gave; hopefully you can generalize from there.
Start with some fake data:
import numpy as np
import scipy.ndimage

A = np.asarray([
    [1, 1, 1, 2, 0],
    [1, 0, 2, 2, 2],
    [0, 2, 0, 1, 0],
    [1, 2, 2, 1, 0],
    [2, 1, 1, 1, 2]
])
We'll find the locations in A where various conditions apply:
- 1a) The value is 1
- 1b) The value is greater than its horizontal neighbors
- 2a) The value is 2
- 2b) The value is greater than its diagonal neighbors
Find locations in A where the specified values occur:
cond1a = A == 1
cond2a = A == 2
This gives matrices of boolean values, the same size as A. The value is True where the condition holds, otherwise False.
Find locations in A where each element has the specified relationship to its neighbors:
# condition 1b: value greater than horizontal neighbors
f1 = np.asarray([[1, 0, 1]])
cond1b = A > scipy.ndimage.maximum_filter(
    A, footprint=f1, mode='constant', cval=-np.inf)
# condition 2b: value greater than diagonal neighbors
f2 = np.asarray([
    [0, 0, 1],
    [0, 0, 0],
    [1, 0, 0]
])
cond2b = A > scipy.ndimage.maximum_filter(
    A, footprint=f2, mode='constant', cval=-np.inf)
As before, this gives matrices of boolean values indicating where the conditions are true. This code uses scipy.ndimage.maximum_filter(). This function slides a 'footprint' so that it is centered over each element of A in turn. The returned value for that position is the maximum of all elements where the footprint is 1. The mode argument specifies how to treat implicit values outside the boundaries of the matrix, where the footprint falls off the edge. Here, mode='constant' with cval=-np.inf treats them as negative infinity, which is the same as ignoring them (since we're taking the max).
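To make the footprint idea concrete, here is a minimal numpy-only sketch of what the two filters compute for this example. It is not how scipy implements the filter; it assumes a float copy of A (so that -inf can be stored), and the names P, left, right, upper_right and lower_left are made up for illustration.

```python
import numpy as np

A = np.array([
    [1, 1, 1, 2, 0],
    [1, 0, 2, 2, 2],
    [0, 2, 0, 1, 0],
    [1, 2, 2, 1, 0],
    [2, 1, 1, 1, 2],
], dtype=float)

# Pad every side with -inf so out-of-bounds neighbors never win the max.
P = np.pad(A, 1, mode='constant', constant_values=-np.inf)

# Shifted views of the padded array, each with the same shape as A.
left = P[1:-1, :-2]         # neighbor at (r, c - 1)
right = P[1:-1, 2:]         # neighbor at (r, c + 1)
upper_right = P[:-2, 2:]    # neighbor at (r - 1, c + 1)
lower_left = P[2:, :-2]     # neighbor at (r + 1, c - 1)

# Max over the neighbors selected by each footprint.
horizontal_max = np.maximum(left, right)
diagonal_max = np.maximum(upper_right, lower_left)

cond1b_manual = A > horizontal_max
cond2b_manual = A > diagonal_max
```

Each shifted view plays the role of one nonzero entry in the footprint, so adding or removing views corresponds to changing the footprint.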
Set the values of the result according to the conditions. The value is 999 if conditions 1a and 1b are both true, or if conditions 2a and 2b are both true. Otherwise, the value is 0.
result = np.zeros(A.shape)
result[(cond1a & cond1b) | (cond2a & cond2b)] = 999
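Note that np.zeros returns a float array by default, so the result above holds floats. If an integer result is preferred, np.where is one possible alternative. The sketch below uses a small made-up mask as a stand-in for the combined condition:

```python
import numpy as np

# Hypothetical stand-in for (cond1a & cond1b) | (cond2a & cond2b).
mask = np.array([
    [False, True],
    [True, False],
])

# Pick 999 where the mask is True and 0 elsewhere; the result has an integer dtype.
result = np.where(mask, 999, 0)
```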
The result is:
[
    [  0,   0,   0,   0,   0],
    [999,   0,   0, 999, 999],
    [  0,   0,   0, 999,   0],
    [  0,   0, 999,   0,   0],
    [  0,   0,   0,   0, 999]
]
You can generalize this approach to other patterns of neighbors by changing the filter footprint. You can generalize to other operations (minimum, median, percentiles, etc.) by using other kinds of filters (see scipy.ndimage). For operations that can be expressed as weighted sums, use 2d cross correlation.
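As a rough illustration of both generalizations, the sketch below uses scipy.ndimage.minimum_filter with a vertical footprint and scipy.ndimage.correlate with a cross-shaped kernel. The two conditions (strictly smaller than the vertical neighbors; number of axis neighbors equal to 2) are made-up examples, not part of the original question, and A is converted to float so that cval=np.inf can be represented.

```python
import numpy as np
import scipy.ndimage

A = np.array([
    [1, 1, 1, 2, 0],
    [1, 0, 2, 2, 2],
    [0, 2, 0, 1, 0],
    [1, 2, 2, 1, 0],
    [2, 1, 1, 1, 2],
], dtype=float)

# Different operation: strictly smaller than both vertical neighbors.
# Out-of-bounds neighbors are treated as +inf, so they are effectively ignored.
vertical = np.array([[1], [0], [1]])
is_vertical_min = A < scipy.ndimage.minimum_filter(
    A, footprint=vertical, mode='constant', cval=np.inf)

# Weighted sum: count how many of the four axis neighbors equal 2.
kernel = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])
twos = (A == 2).astype(int)
n_twos = scipy.ndimage.correlate(twos, kernel, mode='constant', cval=0)
```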
This approach should be much faster than looping in pure Python. However, it performs unnecessary computations: for example, the max is only needed where the value is 1 or 2, but we compute it for every element. Looping manually would let you skip those computations, and implementing the loop in numba or cython would probably be faster still, because these tools generate compiled code.
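For reference, a manual loop for this specific example might look like the sketch below. The function name neighbor_rule is made up, and the code is plain Python, so it will be slow as written. It is kept to simple loops and array indexing on the assumption that it could then be compiled, for example by decorating it with numba's njit, but that is not verified here.

```python
import numpy as np

def neighbor_rule(A):
    """Loop version of the example rule; out-of-bounds neighbors are ignored."""
    rows, cols = A.shape
    result = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            v = A[r, c]
            if v == 1:
                # Horizontal neighbors; a missing neighbor counts as smaller.
                left_ok = c == 0 or v > A[r, c - 1]
                right_ok = c == cols - 1 or v > A[r, c + 1]
                if left_ok and right_ok:
                    result[r, c] = 999
            elif v == 2:
                # Upper-right and lower-left diagonal neighbors.
                ur_ok = r == 0 or c == cols - 1 or v > A[r - 1, c + 1]
                ll_ok = r == rows - 1 or c == 0 or v > A[r + 1, c - 1]
                if ur_ok and ll_ok:
                    result[r, c] = 999
            # Any other value keeps the default of 0; no neighbor lookups needed.
    return result

A = np.array([
    [1, 1, 1, 2, 0],
    [1, 0, 2, 2, 2],
    [0, 2, 0, 1, 0],
    [1, 2, 2, 1, 0],
    [2, 1, 1, 1, 2],
])
result = neighbor_rule(A)
```

Unlike the filter version, neighbors are only inspected where the value is 1 or 2.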