Setting the correct encoding when piping stdout in Python

迷失自我 2020-11-22 01:21

When piping the output of a Python program, the Python interpreter gets confused about encoding and sets it to None. This means a program like this:

# -*- co         


        
10 Answers
  •  一整个雨季
    2020-11-22 02:21

    Your code works when run in a script because Python encodes the output to whatever encoding your terminal application is using. If you are piping, no terminal encoding is available, so you must encode the output yourself.

    A rule of thumb is: Always use Unicode internally. Decode what you receive, and encode what you send.

    # -*- coding: utf-8 -*-
    print u"åäö".encode('utf-8')
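
    The snippet above uses Python 2 syntax. As a rough Python 3 counterpart, the sketch below wraps a binary stream in a text writer with an explicit encoding, so the output bytes do not depend on what the interpreter guesses for a pipe. The helper name make_utf8_writer and the in-memory fake_pipe are illustrative assumptions, not part of the original answer; in a real script the binary stream would be sys.stdout.buffer.

```python
import io


def make_utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8.

    Hypothetical helper for illustration only.
    """
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


# Stand-in for a piped stdout; a real script would pass sys.stdout.buffer.
fake_pipe = io.BytesIO()
writer = make_utf8_writer(fake_pipe)

writer.write("åäö\n")
writer.flush()  # push buffered text through to the underlying byte stream

encoded = fake_pipe.getvalue()
```

    Here encoded holds the UTF-8 bytes of the string regardless of the terminal or pipe the program is attached to.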
    

    Another didactic example is a Python program to convert between ISO-8859-1 and UTF-8, making everything uppercase in between.

    import sys
    for line in sys.stdin:
        # Decode what you receive:
        line = line.decode('iso8859-1')
    
        # Work with Unicode internally:
        line = line.upper()
    
        # Encode what you send:
        line = line.encode('utf-8')
        sys.stdout.write(line)
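
    The loop above relies on Python 2, where lines read from sys.stdin are byte strings. A minimal Python 3 sketch of the same decode, process, encode pattern might look like the following. The function name convert_upper and the in-memory streams are assumptions made for illustration; a real script would pass sys.stdin.buffer and sys.stdout.buffer instead.

```python
import io


def convert_upper(src, dst, in_enc="iso8859-1", out_enc="utf-8"):
    """Read byte lines from src, uppercase the text, write re-encoded bytes to dst.

    Hypothetical helper for illustration only.
    """
    for raw_line in src:
        text = raw_line.decode(in_enc)   # decode what you receive
        text = text.upper()              # work with Unicode internally
        dst.write(text.encode(out_enc))  # encode what you send


# In-memory stand-ins for sys.stdin.buffer and sys.stdout.buffer.
source = io.BytesIO("åäö\n".encode("iso8859-1"))
sink = io.BytesIO()

convert_upper(source, sink)
result = sink.getvalue()
```

    Working on the binary streams keeps both encodings explicit, so the result does not change depending on whether the program runs in a terminal or in a pipeline.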
    

    Setting the system default encoding is a bad idea, because some of the modules and libraries you use may rely on it being ASCII. Don't do it.
