Python lists/arrays: disable negative indexing wrap-around in slices

Problem description:

While I find the negative number wraparound (i.e. A[-2] indexing the second-to-last element) extremely useful in many cases, when it happens inside a slice it is usually more of an annoyance than a helpful feature, and I often wish for a way to disable that particular behaviour.

Below is a canned 2D example, but I have had the same peeve a few times with other data structures and in other numbers of dimensions.

import numpy as np
A = np.random.randint(0, 2, (5, 10))

def foo(i, j, r=2):
  '''sum of neighbours within r steps of A[i,j]'''
  return A[i-r:i+r+1, j-r:j+r+1].sum()

In the slice above I would rather that any negative number to the slice would be treated the same as None is, rather than wrapping to the other end of the array.
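
To make the failure concrete, here is a small sketch of the same function on an array of ones (an assumption, chosen instead of random data so the sums are predictable). At the corner, both slice starts are negative and wrap past their stops, so the window comes back empty rather than truncated.

```python
import numpy as np

# Array of ones (assumed here so the sums are easy to check by hand).
A = np.ones((5, 10), dtype=int)

def foo(i, j, r=2):
    '''sum of neighbours within r steps of A[i,j]'''
    return A[i-r:i+r+1, j-r:j+r+1].sum()

interior = foo(2, 5)               # full 5x5 window
corner = foo(0, 0)                 # evaluates A[-2:3, -2:3]
corner_shape = A[-2:3, -2:3].shape

print(interior)       # 25
print(corner)         # 0: row start -2 wraps to 3, so rows 3:3 is empty
print(corner_shape)   # (0, 0)
```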

Because of the wrapping, the otherwise nice implementation above gives incorrect results at boundary conditions and requires some sort of patch like:

def ugly_foo(i, j, r=2):
  def thing(n):
    return None if n < 0 else n
  return A[thing(i-r):i+r+1, thing(j-r):j+r+1].sum()

I have also tried zero-padding the array or list, but it is still inelegant (the lookup indices have to be shifted accordingly) and inefficient (the array has to be copied).
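
For reference, a minimal sketch of the zero-padding approach using np.pad. The array of ones and the name padded_foo are assumptions for illustration. It shows both drawbacks: the padded array is a separate copy, and every lookup has to be shifted by the pad width.

```python
import numpy as np

A = np.ones((5, 10), dtype=int)   # assumed test data
R = 2                             # pad width; must match the window radius

# np.pad builds a new, larger array, so A is copied.
P = np.pad(A, R, mode='constant', constant_values=0)

def padded_foo(i, j, r=R):
    # A[i, j] sits at P[i + r, j + r], so the window A[i-r:i+r+1]
    # becomes P[i:i+2r+1] in padded coordinates.
    return P[i:i + 2*r + 1, j:j + 2*r + 1].sum()

corner = padded_foo(0, 0)
interior = padded_foo(2, 5)
shares = np.shares_memory(A, P)

print(P.shape)    # (9, 14)
print(corner)     # 9: only the 3x3 in-bounds part contributes
print(interior)   # 25
print(shares)     # False: P is a copy
```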

Am I missing some standard trick or elegant solution for slicing like this? I noticed that Python and NumPy already handle the case where you specify too large a number nicely: if the stop index of a slice is greater than the size of the array, it behaves the same as if it were None.
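
A short sketch of that asymmetry, plus one common workaround that relies on it: since an oversized stop is already clipped, only the start needs clamping with max(). The name clamped_foo and the array of ones are assumptions for illustration, not a built-in feature.

```python
import numpy as np

A = np.ones((5, 10), dtype=int)   # assumed test data

# An oversized stop is clipped to the end, for lists and arrays alike.
list_tail = [1, 2, 3][1:100]
array_rows = A[3:100].shape[0]

def clamped_foo(i, j, r=2):
    # Only the start can go negative for a valid (i, j); clamp it to 0
    # and let the slice machinery clip the oversized stop.
    return A[max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1].sum()

top_left = clamped_foo(0, 0)
bottom_right = clamped_foo(4, 9)

print(list_tail)      # [2, 3]
print(array_rows)     # 2
print(top_left)       # 9
print(bottom_right)   # 9
```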

My guess is that you would have to create your own subclass wrapper around the desired objects and re-implement __getitem__() to convert negative keys to None, and then call the superclass __getitem__().

Note, what I am suggesting is to subclass existing custom classes, but NOT builtins like list or dict. This is simply to make a utility around another class, not to confuse the normal expected operations of a list type. It would be something you would want to use within a certain context for a period of time until your operations are complete. It is best to avoid making a globally different change that will confuse users of your code.

Data model

object.__getitem__(self, key)
Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.
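
As a rough illustration of that passage, the hypothetical class below (StrictSeq is a made-up name) shows that __getitem__() receives the key exactly as written, including negative slice bounds, and that it is free to reject negative integers instead of wrapping them.

```python
class StrictSeq:
    """Hypothetical sequence that refuses negative integer indexes."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # For demonstration only: report the raw slice fields we received.
            return ('slice', key.start, key.stop, key.step)
        if not isinstance(key, int):
            raise TypeError('indexes must be integers or slices')
        if key < 0 or key >= len(self._items):
            # No special interpretation of negative values here.
            raise IndexError('index out of range')
        return self._items[key]

s = StrictSeq('abcde')
raw = s[-2:3]    # the slice arrives uninterpreted
item = s[1]

print(raw)       # ('slice', -2, 3, None)
print(item)      # b
```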

You could even create a wrapper that simply takes an instance as an arg, and just defers all __getitem__() calls to that private member, while converting the key, for cases where you can't or don't want to subclass a type, and instead just want a utility wrapper for any sequence object.

Quick example of the latter suggestion:

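class NoWrap(object):
    '''Wrap any sequence; negative integer keys return a default value.'''

    def __init__(self, obj, default=None):
        self._obj = obj
        self._default = default

    def __getitem__(self, key):
        # Negative integers never wrap around; they yield the default.
        if isinstance(key, int) and key < 0:
            return self._default

        # Everything else (including slices) is passed through unchanged.
        return self._obj[key]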

In [12]: x = range(-10, 10)
In [13]: x_wrapped = NoWrap(x)
In [14]: print(x_wrapped[5])
-5
In [15]: print(x_wrapped[-1])
None
In [16]: x_wrapped = NoWrap(x, 'FOO')
In [17]: print(x_wrapped[-1])
FOO
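
The example above only intercepts integer keys, so slices still wrap. Below is a rough sketch of how the same wrapper idea could be extended to slices and to tuples of slices (as NumPy uses for multi-dimensional indexing). The names NoWrapSlice and _clamp are hypothetical. One assumption differs from "treat negatives as None": negative bounds are clamped to 0, so a window that lies entirely before the start comes back empty instead of spanning the whole axis. Only slices with a positive (or default) step are adjusted; integer keys are left alone.

```python
import numpy as np

def _clamp(key):
    """Clamp negative slice bounds to 0; leave every other key untouched."""
    if isinstance(key, slice) and (key.step is None or key.step > 0):
        start = key.start if key.start is None else max(key.start, 0)
        stop = key.stop if key.stop is None else max(key.stop, 0)
        return slice(start, stop, key.step)
    return key

class NoWrapSlice:
    """Hypothetical wrapper: slices on the wrapped object never wrap around."""

    def __init__(self, obj):
        self._obj = obj

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # Multi-dimensional key, e.g. A[a:b, c:d]
            key = tuple(_clamp(k) for k in key)
        else:
            key = _clamp(key)
        return self._obj[key]

A = np.ones((5, 10), dtype=int)   # assumed test data
W = NoWrapSlice(A)

corner = W[-2:3, -2:3]            # behaves like A[0:3, 0:3]
as_list = NoWrapSlice([0, 1, 2, 3, 4])[-2:2]

print(corner.shape)   # (3, 3)
print(corner.sum())   # 9
print(as_list)        # [0, 1]
```

With this, the original foo() could index through the wrapper instead of A directly, which keeps the change local to the code that wants it.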