Set encoding in Python 3 CGI scripts

醉话见心 2020-12-03 03:07

When writing a Python 3.1 CGI script, I run into horrible UnicodeDecodeErrors. However, when running the script on the command line, everything works.

7 answers
  •  攒了一身酷
    2020-12-03 03:47

    You shouldn't read your IO streams as strings for CGI/WSGI; they aren't Unicode strings, they're explicitly byte sequences.

    (Consider that Content-Length is measured in bytes and not characters; imagine trying to read a multipart/form-data binary file upload submission crunched into UTF-8-decoded strings, or return a binary file download...)
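
    To make the bytes-versus-characters point concrete, here is a small sketch. The sample string and the PNG signature bytes are illustrative choices, not something from the question.

```python
# Illustrative only: the sample text and the PNG signature are assumptions
# chosen to show why CGI bodies must be handled as bytes.
text = "naïve café"
encoded = text.encode("utf-8")

char_count = len(text)      # number of Unicode characters
byte_count = len(encoded)   # what Content-Length would have to report

# Arbitrary binary data (here, the first 8 bytes of a PNG file) is
# generally not valid UTF-8, so decoding an upload as text can fail.
png_magic = b"\x89PNG\r\n\x1a\n"
try:
    png_magic.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False
```

    The two counts differ because each accented letter takes two bytes in UTF-8, and the binary sample cannot be decoded at all.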

    So instead use sys.stdin.buffer and sys.stdout.buffer to get the raw byte streams for stdio, and read/write binary with them. It is up to the form-reading layer to convert those bytes into Unicode string parameters where appropriate using whichever encoding your web page has.
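
    A minimal sketch of that approach, assuming Python 3.2 or later, a UTF-8 page, and a urlencoded POST body. The helper names (read_body, parse_form, write_response, main) and the "name" form field are hypothetical, made up for illustration.

```python
import io
import os
import sys
from urllib.parse import parse_qs

ENCODING = "utf-8"  # assumption: the page and its forms use UTF-8


def read_body(stream, environ):
    """Read exactly CONTENT_LENGTH bytes of request body from a binary stream."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return stream.read(length) if length > 0 else b""


def parse_form(body):
    """Turn a urlencoded body (bytes) into a dict of Unicode string lists."""
    # A urlencoded body should be pure ASCII; the percent-escapes are then
    # decoded with the page encoding. Non-ASCII input raises here.
    return parse_qs(body.decode("ascii"), encoding=ENCODING)


def write_response(stream, text):
    """Encode the response once and write headers and body as bytes."""
    body = text.encode(ENCODING)
    header = (
        "Content-Type: text/plain; charset={}\r\n"
        "Content-Length: {}\r\n\r\n"
    ).format(ENCODING, len(body))  # length in bytes, not characters
    stream.write(header.encode("ascii"))
    stream.write(body)
    stream.flush()


def main(stdin=None, stdout=None, environ=None):
    # Default to the raw byte streams behind sys.stdin and sys.stdout.
    stdin = stdin if stdin is not None else sys.stdin.buffer
    stdout = stdout if stdout is not None else sys.stdout.buffer
    environ = environ if environ is not None else os.environ
    form = parse_form(read_body(stdin, environ))
    name = form.get("name", ["world"])[0]  # "name" is a hypothetical field
    write_response(stdout, "Hello, {}!\n".format(name))

# A real CGI script would call main() here at module level.
```

    Passing the streams in as parameters is only a convenience so the logic can be exercised with io.BytesIO objects; a multipart/form-data upload would need a different parser than parse_form.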

    Unfortunately the standard library CGI and WSGI interfaces don't get this right in Python 3.1: the relevant modules were crudely converted from the Python 2 originals using 2to3 and consequently there are a number of bugs that will end up in UnicodeError.

    The first version of Python 3 that is usable for web applications is 3.2. Using 3.0/3.1 is pretty much a waste of time. It took a lamentably long time to get this sorted out and to get PEP 3333 passed.
