Why should we NOT use sys.setdefaultencoding("utf-8") in a py script?

孤街浪徒 2020-11-22 03:11

I have seen a few Python scripts that use this at the top of the script. In what cases should one use it?

import sys
reload(sys)
sys.setdefaultencoding("utf-8")


        
4 answers
  •  迷失自我
    2020-11-22 03:50

    • The first danger lies in reload(sys).

      When you reload a module, you actually get two copies of the module in your runtime. The old module is a Python object like everything else, and it stays alive as long as there are references to it. So half of the objects will be pointing to the old module and half to the new one. When you then change something, some random object may not see the change, and you will never see that coming:

      (This is IPython shell)
      
      In [1]: import sys
      
      In [2]: sys.stdout
      Out[2]: 
      
      In [3]: reload(sys)
      
      
      In [4]: sys.stdout
      Out[4]: <open file '<stdout>', mode 'w' at 0x00000000022E20C0>
      
      In [11]: import IPython.terminal
      
      In [14]: IPython.terminal.interactiveshell.sys.stdout
      Out[14]: 
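
The IPython session above is specific to Python 2 and IPython. As a smaller, hedged sketch of the same stale-reference effect, here is a version in Python 3 syntax, where `reload` lives in `importlib`. The module name `stale_demo` and its `Config` class are made up for the illustration. The point is only that objects created before the reload stay alive and no longer match what the module exposes afterwards.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway module on disk; "stale_demo" is a made-up name.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "stale_demo.py"), "w") as source:
    source.write("class Config(object):\n    pass\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()
stale_demo = importlib.import_module("stale_demo")

# Some other part of the program keeps references to the original objects.
old_class = stale_demo.Config
instance = old_class()

# Reloading re-executes the module body, which builds a brand-new class.
importlib.reload(stale_demo)
new_class = stale_demo.Config

print(old_class is new_class)           # False: two distinct class objects
print(isinstance(instance, new_class))  # False: the instance still uses the old class
print(isinstance(instance, old_class))  # True
```

Anything that grabbed `Config` (or an instance of it) before the reload keeps working with the old object, while code that looks the name up afterwards gets the new one.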
      
    • Now, sys.setdefaultencoding() proper

      All it affects is the implicit str<->unicode conversion. Now, UTF-8 is the sanest encoding on the planet (backward-compatible with ASCII and all), and the conversion now "just works", so what could possibly go wrong?

      Well, anything. And that is the danger.

      • There may be some code that relies on a UnicodeError being thrown for non-ASCII input, or that does the transcoding with an error handler, and which now produces an unexpected result. And since all code is tested with the default setting, you're strictly in "unsupported" territory here, and no one gives you guarantees about how their code will behave.
      • The transcoding may produce unexpected or unusable results if not everything on the system uses UTF-8, because Python 2 actually has multiple independent "default string encodings". (Remember, a program must work for the customer, on the customer's equipment.)
        • Again, the worst thing is that you will never know about it, because the conversion is implicit -- you don't really know when and where it happens. (Python Zen, koan 2 ahoy!) You will never know why (and if) your code works on one system and breaks on another. (Or better yet, works in an IDE and breaks in the console.)
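
To make the points above more concrete, here is a rough sketch in Python 3, which has no implicit str<->unicode coercion at all. It therefore models the Python 2 coercion as an explicit `decode` with a chosen default encoding. The helpers `implicit_to_unicode` and `accepts_only_ascii` are hypothetical stand-ins, not code from any real library, and cp1252 is just an assumed example of a non-UTF-8 console encoding.

```python
import locale
import sys


def implicit_to_unicode(data, default_encoding):
    """Hypothetical stand-in for Python 2's hidden str -> unicode coercion."""
    return data.decode(default_encoding)


def accepts_only_ascii(data, default_encoding):
    """Hypothetical library code that relies on UnicodeError to reject input."""
    try:
        implicit_to_unicode(data, default_encoding)
    except UnicodeError:
        return False
    return True


utf8_bytes = "\u00e9".encode("utf-8")  # e-acute as two UTF-8 bytes

# With the stock default ("ascii"), the non-ASCII input is rejected,
# which is the behaviour the library author tested against.
rejected_with_ascii = not accepts_only_ascii(utf8_bytes, "ascii")

# With the default switched to "utf-8", the same input slips through silently.
accepted_with_utf8 = accepts_only_ascii(utf8_bytes, "utf-8")

# Encoding mismatch: the same UTF-8 bytes interpreted by a cp1252 consumer
# decode without error, but into two wrong characters.
mojibake = utf8_bytes.decode("cp1252")

# Several separate "default" encodings that need not agree with each other.
encodings = {
    "sys.getdefaultencoding()": sys.getdefaultencoding(),
    "sys.getfilesystemencoding()": sys.getfilesystemencoding(),
    "locale.getpreferredencoding(False)": locale.getpreferredencoding(False),
    "sys.stdout.encoding": getattr(sys.stdout, "encoding", None),
}

print(rejected_with_ascii, accepted_with_utf8, ascii(mojibake))
for name, value in sorted(encodings.items()):
    print(name, "=", value)
```

Neither failure raises an exception at the point where the default was changed, which is why an explicit `decode`/`encode` at the program's boundaries is easier to reason about than a global switch.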
