Some confusion about how numpy arrays are stored in Python

Question:

I am confused by some of the behaviour I see when playing with numpy arrays in Python.

Question 1

I execute the following statements in the Python interpreter:

>>> import numpy as np
>>> L = [1000,2000,3000]
>>> A = np.array(L)
>>> B = A

Then I check the following:

>>> A is B
True
>>> id(A) == id(B)
True
>>> id(A[0]) == id(B[0])
True

That's fine. But then something strange happens:

>>> A[0] is B[0]
False

But how can A[0] and B[0] be different things? They have the same id! For a list in Python, we have

>>> LL = [1000,2000,3000]
>>> SS = LL
>>> LL[0] is SS[0]
True

Is the way a numpy array is stored totally different from a list? We also have

>>> A[0] = 1001
>>> B[0]
1001

So it seems that A[0] and B[0] are the identical object.

Question 2

I make a copy of A:

>>> C = A[:]
>>> C is A
False
>>> C[0] is A[0]
False

That is fine. A and C seem to be independent of each other. But

>>> A[0] = 1002
>>> C[0]
1002

So it seems that A and C are not independent? I am totally confused.

Answer

You are asking two completely independent questions, so here are two answers.

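The data of a Numpy array is stored internally as a contiguous C array. Each entry in the array is just a number. Python objects, on the other hand, need some housekeeping data, such as the reference count and a pointer to the type object. You can't simply have a raw pointer to a number in memory. For this reason, Numpy "boxes" a number in a Python object whenever you access an individual element. This happens every time you access an element, so even A[0] and A[0] are different objects: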

>>> A[0] is A[0]
False

This is at the heart of why Numpy can store arrays in a more memory-efficient way: it does not store a full Python object for each entry, and only creates these objects on the fly when needed. It is optimised for vectorised operations on the array, not for individual element access.
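The boxing described above can be observed directly. The following is a minimal sketch; the names first and second are only illustrative. It also suggests why id(A[0]) == id(B[0]) printed True in the question: in CPython, id() is the object's memory address, each temporary scalar is freed as soon as id() returns, and the next temporary may be allocated at the same address. Keeping both scalars alive at the same time removes that coincidence.

```python
import numpy as np

A = np.array([1000, 2000, 3000])

# Each element access builds a fresh numpy scalar object around the raw number.
first = A[0]
second = A[0]

print(isinstance(first, np.integer))  # True: a numpy integer scalar, not a plain int
print(first is second)                # False: two distinct wrapper objects
print(id(first) == id(second))        # False: both are alive, so their ids must differ
print(first == second)                # True: they wrap the same value
```

Comparing ids of temporaries that are discarded immediately says nothing about identity, because objects with non-overlapping lifetimes are allowed to share an id.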

When you execute C = A[:], you are creating a new view of the same data. You are not making a copy. You will then have two different wrapper objects, pointed to by A and C respectively, but they are backed by the same buffer. The base attribute of an array refers to the array object it was originally created from:

>>> A.base is None
True
>>> C.base is A
True

New views of the same data are particularly useful when combined with indexing, since you can get views that include only some slice of the original array but are backed by the same memory.
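For instance, a slice of a larger array can be modified in place without copying anything. This is a sketch with made-up values; the name middle is just for illustration:

```python
import numpy as np

A = np.arange(10)        # [0, 1, 2, ..., 9]
middle = A[3:6]          # view of elements 3, 4 and 5 only

middle *= 10             # in-place update through the view
print(middle.tolist())   # [30, 40, 50]
print(A.tolist())        # [0, 1, 2, 30, 40, 50, 6, 7, 8, 9]
print(middle.base is A)  # True: the slice is backed by A's memory
```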

To actually make a copy of an array, use the copy() method.
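A short sketch of the difference, assuming the same example array as in the question (the name D is illustrative):

```python
import numpy as np

A = np.array([1000, 2000, 3000])
D = A.copy()                    # allocates a new buffer and copies the data

D[0] = 1                        # only the copy is modified
print(A.tolist())               # [1000, 2000, 3000]
print(D.tolist())               # [1, 2000, 3000]
print(D.base is None)           # True: D owns its own data
print(np.shares_memory(A, D))   # False: the buffers are independent
```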

As a more general remark, you should not read too much into object identity in Python. In general, if the expression x is y evaluates to True, you know that they really are the same object. However, if it evaluates to False, they can still be two different proxies to the same underlying data.
