Some confusion about how numpy arrays are stored in Python
I am confused by some behaviour I see when playing with the numpy array data type in Python.
Question 1
I execute the following statements in the Python interpreter:
>>> import numpy as np
>>> L = [1000,2000,3000]
>>> A = np.array(L)
>>> B = A
Then I check the following:
>>> A is B
True
>>> id(A) == id(B)
True
>>> id(A[0]) == id(B[0])
True
That's fine. But then something strange happens.
>>> A[0] is B[0]
False
But how can A[0] and B[0] be different things? They have the same id! For a list in Python, we have
>>> LL = [1000,2000,3000]
>>> SS = LL
>>> LL[0] is SS[0]
True
Is a numpy array stored in a totally different way from a list? We also have
>>> A[0] = 1001
>>> B[0]
1001
So it seems that A[0] and B[0] are the identical object.
Question 2
I make a copy of A.
>>> C = A[:]
>>> C is A
False
>>> C[0] is A[0]
False
That is fine. A and C seem to be independent of each other. But
>>> A[0] = 1002
>>> C[0]
1002
So it seems that A and C are not independent? I am totally confused.
You are asking two completely independent questions, so here are two answers.
-
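The data of a Numpy array is stored internally as a contiguous C array. Each entry in the array is just a number. Python objects, on the other hand, need some housekeeping data, e.g. the reference count and a pointer to the type object. You cannot simply have a raw pointer to a number in memory. For this reason, Numpy "boxes" a number in a Python object whenever you access an individual element. This happens every time you access an element, so even A[0] and A[0] are different objects: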
>>> A[0] is A[0]
False
This is at the heart of why Numpy can store arrays in a more memory-efficient way: it does not store a full Python object for each entry, and only creates these objects on the fly when needed. It is optimised for vectorised operations on the array, not for individual element access.
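The boxing described above can be observed directly. The following is a minimal sketch, not part of the original question; the names x and y are introduced here purely for illustration. Keeping both boxed objects alive at the same time shows that they really are distinct, which suggests that the equal ids seen in the question came from comparing short-lived temporaries.

```python
import numpy as np

A = np.array([1000, 2000, 3000])

# Each element access boxes the raw number in a fresh numpy scalar object.
x = A[0]
y = A[0]

print(isinstance(x, np.generic))  # True: a numpy scalar, not a plain Python int
print(x == y)                     # True: same value
print(x is y)                     # False: two separate wrapper objects
print(id(x) == id(y))             # False: both are alive, so their ids must differ
```

In an expression like id(A[0]) == id(B[0]), the first boxed object can be freed before the second one is created, so the same memory address may be reused and the ids can coincide.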
When you execute C = A[:] you are creating a new view on the same data. You are not making a copy. You then have two different wrapper objects, referred to by A and C respectively, but they are backed by the same buffer. The base attribute of an array refers to the array object it was originally created from:
>>> A.base is None
True
>>> C.base is A
True
New views on the same data are particularly useful when combined with indexing, since you can get views that only include some slice of the original array, but are backed by the same memory.
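As a small illustration of a partial view, the sketch below slices off the tail of the array and writes through it. The name V is made up for this example, and np.shares_memory is used only as a convenient check.

```python
import numpy as np

A = np.array([1000, 2000, 3000])
V = A[1:]                      # basic slicing returns a view, not a copy

print(V.base is A)             # True: V was created from A
print(np.shares_memory(A, V))  # True: both use the same buffer

V[0] = 2001                    # write through the view
print(A)                       # [1000 2001 3000]
```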
To actually make a copy of an array, use the copy() method.
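For comparison with the slice in the question, here is a short sketch of what copy() gives you. The name D is chosen here only for illustration.

```python
import numpy as np

A = np.array([1000, 2000, 3000])
D = A.copy()                   # allocates a new buffer and copies the data

print(D.base is None)          # True: D owns its own data
print(np.shares_memory(A, D))  # False: separate buffers

A[0] = 1002                    # modifying A no longer affects D
print(D[0])                    # 1000
```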
As a more general remark, you should not read too much into object identity in Python. In general, if x is y is true, you know that they are really the same object. However, if it is false, they can still be two different proxies to the same underlying data.
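To make the "different proxies" point concrete, here is a hedged sketch with two views of the same array. V1 and V2 are illustrative names, and np.shares_memory is just one possible way to ask whether two arrays overlap in memory.

```python
import numpy as np

A = np.array([1000, 2000, 3000])
V1 = A[:]
V2 = A[:]

print(V1 is V2)                  # False: two distinct wrapper objects
print(np.shares_memory(V1, V2))  # True: both are backed by A's buffer
print(np.array_equal(V1, V2))    # True: same contents
```

So `is` answers whether two names refer to the same wrapper object, not whether they refer to the same data.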