Question
I have been using numpy for quite a while, but I stumbled upon one thing that I didn't fully understand:
import numpy as np

a = np.ones(20)
b = np.zeros(10)
print(id(a)==id(b)) # prints False
print(id(a), id(b)) # prints (4591424976, 4590843504)
print(id(a[0])==id(b[0])) # prints True
print(id(a[0]), id(b[0])) # prints (4588947064, 4588947064)
print(id(a[0])) # 4588947184
print(id(b[0])) # 4588947280
Can someone please explain the behavior observed in the last four print statements? Also, I was aware that id gives you the unique id of an object actually allocated in memory, but every time I run the last two print statements, I get different id values. Is this the expected behavior?
Answer 1:
The short answer is that you should forget about relying on id to try and gain deep insight into the workings of Python. Its output is affected by CPython implementation details, peephole optimizations and memory reuse. More often than not, id is a red herring. This is especially true with numpy.
In your specific case only a and b exist as Python objects. When you take an element, a[0], you instantiate a new Python object, a scalar of type numpy.float64 (or maybe numpy.float32 depending on your system). These are new Python objects and are thus given a new id, unless the interpreter realizes that you're trying to use this object twice. This is probably what's happening in your middle example, although I do find it surprising that two numpy.float64 objects with different values are given the same id. But the weird magic goes away if you assign a[0] and b[0] to proper names first, so this is probably due to some optimization. It could also happen that memory addresses get reused by the interpreter, giving you ids that have appeared before.
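A minimal sketch of the "assign to proper names first" point, using the arrays from the question. The names x and y are illustrative. Python only guarantees that id is unique among objects that are alive at the same time, so once both scalars are kept alive they cannot share an id.

```python
import numpy as np

a = np.ones(20)
b = np.zeros(10)

# Bind the freshly created scalars to names so that both stay alive.
x = a[0]
y = b[0]

# Two objects with overlapping lifetimes always have different ids.
ids_differ = id(x) != id(y)
is_same_object = x is y

print(type(x).__name__, ids_differ, is_same_object)  # float64 True False
```

Without the names, each a[0] is a temporary that may be freed as soon as id returns, which is why the unnamed variant can show matching values.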
Just to see how pointless id is with numpy, even trivial views are new Python objects with new ids, even though for all intents and purposes they are as good as the original:
>>> arr = np.arange(3)
>>> id(arr)
140649669302992
>>> id(arr[...])
140649669667056
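To make "as good as the original" concrete, here is a small sketch, assuming arr owns its own data as an array from np.arange does. The variable names are illustrative. The view is a distinct Python object, yet it records arr as its base and writes through it are visible in arr.

```python
import numpy as np

arr = np.arange(3)
view = arr[...]  # trivial view: a new Python object over the same buffer

is_same_object = view is arr        # compares the wrapper objects
has_arr_as_base = view.base is arr  # the view remembers which array owns the data

view[0] = 99                        # write through the view
seen_in_original = int(arr[0])      # the change shows up in arr

print(is_same_object, has_arr_as_base, seen_in_original)  # False True 99
```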
And here's an example of id reuse in an interactive shell:
>>> id(np.arange(3))
140649669027120
>>> id(np.arange(3))
140649669028480
>>> id(np.arange(3))
140649669026480
Surely there's no such thing as int interning for numpy arrays, so the above is only due to the interpreter reusing ids. The fact that id returns a memory address is again just a CPython implementation detail. Forget about id.
The only things you might want to use with numpy are numpy.may_share_memory and numpy.shares_memory.
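A short sketch of how these two functions behave. The arrays and variable names are made up for illustration. By default numpy.may_share_memory only compares memory bounds, so it can report True for arrays that do not actually overlap, while numpy.shares_memory performs an exact check.

```python
import numpy as np

arr = np.arange(10)
first_half = arr[:5]
second_half = arr[5:]
independent = arr.copy()

view_overlaps = np.shares_memory(arr, first_half)             # a view overlaps its parent
halves_overlap = np.shares_memory(first_half, second_half)    # disjoint slices do not
copy_overlaps = np.shares_memory(arr, independent)            # a copy has its own buffer

# Interleaved slices: their memory bounds overlap, but no element is shared.
evens, odds = arr[::2], arr[1::2]
bounds_check = np.may_share_memory(evens, odds)
exact_check = np.shares_memory(evens, odds)

print(view_overlaps, halves_overlap, copy_overlaps)  # True False False
print(bounds_check, exact_check)                     # True False
```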
Answer 2:
It's important to note that everything in Python is an object, even numbers and classes. You have taken two numpy array objects, each of which contains the same values, i.e. 0. When you write:
print('id of 0 =',id(0))
a = 0
print('id of a =',id(a))
b = a
print('id of b =',id(b))
c = 0.0
print('id of c =',id(c))
The output you get will be something like this (the exact values will differ in your case):
id of 0 = 140472391630016
id of a = 140472391630016
id of b = 140472391630016
id of c = 140472372786520
Hence, the integer 0 has a unique id. The id of the integer 0 remains constant during its lifetime. The same is true for the float 0.0 and other objects.
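As a hedged illustration of identity versus equality with plain Python numbers: rebinding a name never creates a new object, while an int and a float that compare equal are still distinct objects. That every literal 0 maps to one shared object is a CPython caching detail for small integers rather than a language guarantee, so the sketch below avoids relying on it.

```python
a = 0
b = a     # b is just another name for the object bound to a
c = 0.0   # a float, so necessarily a different object from the int

same_object_ab = b is a   # identity: same object
equal_values_ac = c == a  # equality: 0.0 == 0
same_object_ac = c is a   # identity: different objects

print(same_object_ab, equal_values_ac, same_object_ac)  # True True False
```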
So in your case, the id of the a[0] or b[0] object will remain the same for as long as that object is alive, because both contain 0 as the object value.
Each time you print a[0] or b[0] on a separate line, you get a different object identity, because evaluating it on a different line creates an object with a different lifetime.
You can try:
print(id(a)==id(b))
print(id(a),id(b))
print(id(a[0])==id(b[0]))
print(id(a[0]),id(b[0]))
The output will be:
False
2566443478752 2566448028528
True
2566447961120 2566447961120
Note that the second line returns two different object identities, because a and b are two different numpy array objects.
Source: https://stackoverflow.com/questions/54624824/comparing-object-ids-of-two-numpy-arrays