Python 'is' operator behaviour with string [duplicate]

Submitted by 别等时光非礼了梦想 on 2019-11-26 18:24:34

Question


This question already has an answer here:

  • About the changing id of an immutable string 5 answers

I am unable to understand the following behaviour. I am creating two strings and comparing them with the is operator. In the first case it behaves differently from what I expect; in the second case it works as expected. Why does is return False when the string contains a comma, a space or other such characters, but True when it contains none of them?

Python 3.6.5 (default, Mar 30 2018, 06:41:53) 
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 'string'
>>> b = a
>>> b is a
True
>>> b = 'string'
>>> b is a
True
>>> a = '1,2,3,4'
>>> b = a
>>> b is a
True
>>> b = '1,2,3,4'
>>> b is a
False

Is there any reliable information on why Python treats these strings differently? I understand that initially a and b refer to the same object. Then b gets a new object, yet b is a still says True. The behaviour is a little confusing.

When I do it with 'string', it produces the same result. What goes wrong when I use '1,2,3,4'? Both are strings. What is the difference between case 1 and case 2, i.e. why does the is operator produce different results for different string contents?


Answer 1:


One important thing about this behavior is that Python caches some strings, mostly short ones (usually fewer than 20 characters, but not every combination of characters), so that they can be accessed quickly. One important reason is that strings are used widely in Python's source code, and caching certain kinds of strings is an internal optimization. Dictionaries are among the most commonly used data structures in Python's source code: they hold variables, attributes and namespaces in general, among other purposes, and they all use strings as the object names. In other words, accessing an object attribute or a global variable typically triggers a dictionary lookup internally (local variables inside a function are an exception in CPython, which keeps them in indexed slots).
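To make the point about namespaces concrete, here is a minimal sketch showing that attribute names are ordinary string keys in a dictionary. The Point class is purely illustrative and not part of the original answer.

```python
class Point:
    """A tiny class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Instance attributes live in a dictionary keyed by the attribute name.
instance_namespace = vars(p)            # the same object as p.__dict__
attribute_names = list(instance_namespace)

# Looking up p.x ends up reading the value stored under the string key "x".
value_via_dict = instance_namespace["x"]

# The class body is a namespace too, and its keys are also strings.
class_has_init = "__init__" in vars(Point)

print(attribute_names)   # ['x', 'y']
print(value_via_dict)    # 1
print(class_has_init)    # True
```

Because these keys are compared on every lookup, it pays for the interpreter to make identifier-like strings cheap to compare.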

Now, the reason you got such bizarre behavior is that Python (the CPython implementation) treats strings differently when it comes to interning. In CPython's source code there is an intern_string_constants function that decides which string constants qualify for interning; you can check it for more details. Or read this comprehensive article: http://guilload.com/python-string-interning/.

It's also noteworthy that Python has an intern() function in the sys module that you can use to intern strings manually.
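As a rough sketch of the kind of check involved: in the CPython versions I have looked at, intern_string_constants interns a string constant only when every character is an ASCII letter, digit or underscore. The helper below, looks_like_identifier, and the NAME_CHARS set are hypothetical names written for illustration. They approximate that rule in pure Python and are not CPython's actual code, and the rule itself may differ between versions.

```python
import string

# Assumption: the characters the interning check accepts are ASCII letters,
# digits and the underscore.
NAME_CHARS = frozenset(string.ascii_letters + string.digits + "_")


def looks_like_identifier(text):
    """Hypothetical helper: True if every character is a 'name character'."""
    return all(ch in NAME_CHARS for ch in text)


for literal in ("string", "1,2,3,4", "a,,", "hello_world"):
    print(repr(literal), looks_like_identifier(literal))

# 'string' True
# '1,2,3,4' False
# 'a,,' False
# 'hello_world' True
```

Under this assumption, 'string' qualifies for automatic interning while '1,2,3,4' does not, which matches the session in the question. Single-character strings such as ',' appear to be cached by a separate mechanism, so this check alone does not describe them.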

In [52]: b = sys.intern('a,,')

In [53]: c = sys.intern('a,,')

In [54]: b is c
Out[54]: True

You can use this function either when you want to speed up dictionary lookups or when you need to use a particular string object frequently in your code.
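A small sketch of the same idea with strings built at runtime, which are not interned automatically. The variable names are illustrative only.

```python
import sys

# Build two equal strings at runtime so they are not shared compile-time constants.
parts = ["a", ",", ","]
first = "".join(parts)
second = "".join(parts)

print(first == second)   # True: the contents are equal
print(first is second)   # typically False in CPython: two separate objects

# sys.intern returns the canonical copy for a given string value.
interned_first = sys.intern(first)
interned_second = sys.intern(second)

print(interned_first is interned_second)   # True: both refer to the interned copy
```

Note that sys.intern returns the interned object; it is the return value, not the original argument, that should be kept and reused.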

Another point that you should not confuse with string interning is that when you do b = a you're creating two references to the same object, so it is obvious that those names have the same id.

Regarding punctuation, it seems that a string containing punctuation gets interned only if it is a single character; if its length is more than one, it won't get cached. As mentioned in the comments, one reason might be that keywords and dictionary keys are less likely to contain punctuation.
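For instance, plain assignment never creates a new string, whatever its contents, as this short sketch shows:

```python
a = "1,2,3,4"
b = a                      # b is bound to the same object; nothing is copied

same_object = b is a
same_id = id(a) == id(b)

print(same_object, same_id)   # True True
```

This holds for any object, so it says nothing about whether the string was interned.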

In [28]: a = ','

In [29]: ',' is a
Out[29]: True

In [30]: a = 'abc,'

In [31]: 'abc,' is a
Out[31]: False

In [34]: a = ',,'

In [35]: ',,' is a
Out[35]: False

# Or

In [36]: a = '^'

In [37]: '^' is a
Out[37]: True

In [38]: a = '^%'

In [39]: '^%' is a
Out[39]: False

Still, these are just speculations that you cannot rely on in your code.
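Since identity of equal strings is an implementation detail, a safer habit is to compare string contents with ==. A minimal sketch, where same_text is a hypothetical helper name:

```python
def same_text(left, right):
    """Hypothetical helper: compare two strings by value, not by identity."""
    return left == right


a = "1,2,3,4"
b = ",".join(["1", "2", "3", "4"])   # built at runtime, equal in value

print(same_text(a, b))   # True, regardless of whether a and b are the same object
```

Keep is for identity checks such as comparing against None.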



Source: https://stackoverflow.com/questions/50037548/python-is-operator-behaviour-with-string
