Python 'is' operator behaviour with strings

混江龙づ霸主 submitted on 2019-11-27 14:49:37

One important thing about this behavior is that Python caches some strings, mostly short ones (usually fewer than 20 characters, though not every combination of characters), so that they can be accessed quickly. One important reason is that strings are used widely in Python's source code, and caching certain kinds of strings is an internal optimization. Dictionaries are among the most commonly used data structures in Python's source code: they hold variables, attributes, and namespaces in general, among other purposes, and they all use strings as the object names. In other words, every time you access an object attribute or a global variable, there is typically a dictionary lookup happening internally.
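To make the point about namespaces concrete, here is a minimal sketch showing that attribute names are ordinary strings used as dictionary keys. The Point class and its attributes are made up for illustration only.

```python
class Point:
    """Hypothetical class, used only to inspect its namespaces."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# Instance attributes are stored in a dict whose keys are the attribute names.
print(vars(p))                    # {'x': 1, 'y': 2}

# For a plain instance attribute, p.x finds its value under the key 'x'.
print(p.x == vars(p)['x'])        # True
print(getattr(p, 'y'))            # 2

# Methods live in the class namespace, again keyed by name.
print('__init__' in vars(Point))  # True
```

Because lookups like these happen constantly, it makes sense for the interpreter to keep a single shared copy of name-like strings.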

Now, the reason you got such bizarre behavior is that Python (the CPython implementation) treats strings differently when it comes to interning. CPython's source code has an intern_string_constants function that decides which string constants get interned; you can check it for more details. Or see this comprehensive article: http://guilload.com/python-string-interning/.

It's also noteworthy that Python has an intern() function in the sys module that you can use to intern strings manually.

In [51]: import sys

In [52]: b = sys.intern('a,,')

In [53]: c = sys.intern('a,,')

In [54]: b is c
Out[54]: True

You can use this function either when you want to speed up dictionary lookups or when you need to use a particular string object frequently in your code.
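As a rough sketch of that use case, the snippet below interns tokens that are created at runtime, so that equal tokens share one object before they are used as dictionary keys. The intern_tokens helper and the sample input are hypothetical, not part of any library.

```python
import sys


def intern_tokens(line):
    """Split a line and intern each token (hypothetical helper)."""
    return [sys.intern(tok) for tok in line.split()]


first = intern_tokens("status ok status ok")
second = intern_tokens("status fail")

# After interning, equal tokens are the very same object.
print(first[0] is first[2])    # True
print(first[0] is second[0])   # True

# The interned tokens work as ordinary dictionary keys.
counts = {}
for tok in first + second:
    counts[tok] = counts.get(tok, 0) + 1
print(counts)                  # {'status': 3, 'ok': 2, 'fail': 1}
```

Whether this gives a measurable speed or memory gain depends on the workload, so it is worth profiling before adopting it.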

Another point that you should not confuse with string interning is that when you do a = b you're creating two references to the same object, so it is obvious that both names have the same id.
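The difference between sharing a reference and merely having equal values can be sketched as follows. The strings are built at runtime with join() so that the compiler cannot merge them as constants; the variable names are arbitrary.

```python
a = ''.join(['abc', ','])   # built at runtime
b = a                       # assignment: a second name for the same object
c = ''.join(['abc', ','])   # equal value, but built separately

# Two names bound to one object always share an id.
print(b is a)               # True
print(id(b) == id(a))       # True

# Equality compares contents, which is what you normally want for strings.
print(c == a)               # True

# Identity of separately built strings is an implementation detail;
# on CPython this is normally False, but do not rely on it either way.
print(c is a)
```

For comparing string values, == is the operator to rely on; is only says whether two names point to the same object.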

Regarding punctuation, it seems that a string consisting of a single punctuation character gets interned, but a string longer than one character that contains punctuation won't get cached. As mentioned in the comments, one reason might be that keywords and dictionary keys are less likely to contain punctuation.

In [28]: a = ','

In [29]: ',' is a
Out[29]: True

In [30]: a = 'abc,'

In [31]: 'abc,' is a
Out[31]: False

In [34]: a = ',,'

In [35]: ',,' is a
Out[35]: False

# Or

In [36]: a = '^'

In [37]: '^' is a
Out[37]: True

In [38]: a = '^%'

In [39]: '^%' is a
Out[39]: False

But these are still just speculations that you cannot rely on in your code.
