Python 3 UnicodeEncodeError: 'ascii' codec can't encode characters

Posted by 爱⌒轻易说出口 on 2019-12-12 08:43:30

Question


I've just started to learn Python, but I have already run into trouble.
I have a simple script with just one command:

#!/usr/bin/env python3
print("Příliš žluťoučký kůň úpěl ďábelské ódy.") # Text in Czech 

When I try to run this script:

python3 hello.py 

I get this message:

Traceback (most recent call last):
  File "hello.py", line 2, in <module>
    print("P\u0159\xedli\u0161 \u017elu\u0165ou\u010dk\xfd k\u016fn \xfap\u011bl \u010f\xe1belsk\xe9 \xf3dy.")
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-2: ordinal not in range(128)

I am using Kubuntu 16.04 and Python 3.5.2. When I ran export PYTHONIOENCODING=utf-8, it worked, but only temporarily: the next time I opened bash, I got the same error.
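
The behaviour described above can be checked with a small sketch. The interpreter reads PYTHONIOENCODING once at startup, and an exported variable lives only in the shell that set it and in the processes started from that shell, so a newly opened bash session no longer has it. The helper name child_stdout_encoding below is hypothetical and only for illustration; it starts a fresh interpreter with the variable set and reports which encoding that interpreter chose for stdout.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env):
    """Start a fresh interpreter and return the encoding it picks for stdout.

    Hypothetical helper: extra_env is merged into a copy of the current
    environment, so only the child process sees the change.
    """
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stdout.strip()


# The variable affects the child only; the parent's environment is untouched.
print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Because the setting travels only through the environment, it has to be set again in every new session unless it is stored somewhere the session reads at login.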

According to https://docs.python.org/3/howto/unicode.html#the-string-type the default encoding for Python source code is UTF-8.
So the source file is saved in UTF-8 and Konsole is set to UTF-8, but I still get the error!
Even if I add

# -*- coding: utf-8 -*-

to the beginning of the file, nothing changes.

Another weird thing: when I run it with python instead of python3, it works. How can it work in Python 2.7.12 but not in 3.5.2?

Any ideas for solving this permanently? Thank you.


Answer 1:


Thanks to Mark Tolen and Alastair McCormack for suggesting where the problem might be. The problem really was in the locale settings.
When I ran locale, the output was:

LANG=C
LANGUAGE=
LC_CTYPE="C"
LC_NUMERIC=cs_CZ.UTF-8
LC_TIME=cs_CZ.UTF-8
LC_COLLATE=cs_CZ.UTF-8
LC_MONETARY=cs_CZ.UTF-8
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT=cs_CZ.UTF-8
LC_IDENTIFICATION="C"
LC_ALL=

This "C" locale is the default setting, and its charmap is plain ASCII. That is where the problem was: running locale charmap gave me ANSI_X3.4-1968, which is the formal name of ASCII and cannot represent non-English characters such as the Czech letters in my string.
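
To see from Python why this charmap fails, here is a minimal sketch. It relies only on the standard codecs module, which treats ANSI_X3.4-1968 as an alias of its ascii codec. The variable names are illustrative. Encoding the sentence from the question with that name reproduces the error from the traceback, while UTF-8 encodes it without complaint.

```python
import codecs

text = "Příliš žluťoučký kůň úpěl ďábelské ódy."

# Python resolves the charmap name reported by `locale charmap`
# to its built-in ASCII codec.
codec_name = codecs.lookup("ANSI_X3.4-1968").name
print(codec_name)

encode_error = None
try:
    text.encode("ANSI_X3.4-1968")
except UnicodeEncodeError as exc:
    # Keep a reference, because `exc` is cleared when the block ends.
    encode_error = exc
    print(exc)

# UTF-8 can represent every character in the string.
utf8_bytes = text.encode("utf-8")
print(len(text), "characters become", len(utf8_bytes), "bytes in UTF-8")
```

The text itself is deliberately not printed, so the sketch also runs in a terminal that is still stuck on the "C" locale.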
I fixed this using this Ubuntu documentation site.

I added these lines to /etc/default/locale:

LANGUAGE=cs_CZ.UTF-8
LC_ALL=cs_CZ.UTF-8

Then you have to restart your session (log out and back in) to apply these settings.

Running locale now returns this output:

LANG=C
LANGUAGE=cs
LC_CTYPE="cs_CZ.UTF-8"
LC_NUMERIC="cs_CZ.UTF-8"
LC_TIME="cs_CZ.UTF-8"
LC_COLLATE="cs_CZ.UTF-8"
LC_MONETARY="cs_CZ.UTF-8"
LC_MESSAGES="cs_CZ.UTF-8"
LC_PAPER="cs_CZ.UTF-8"
LC_NAME="cs_CZ.UTF-8"
LC_ADDRESS="cs_CZ.UTF-8"
LC_TELEPHONE="cs_CZ.UTF-8"
LC_MEASUREMENT="cs_CZ.UTF-8"
LC_IDENTIFICATION="cs_CZ.UTF-8"
LC_ALL=cs_CZ.UTF-8

and running locale charmap returns:

UTF-8
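
The same check can be made from inside Python. This is only a sketch using standard library calls; the function name describe_io_encoding is hypothetical. On a session where the locale fix has taken effect, one would expect the values to name UTF-8, but that is an expectation and not output captured from my machine.

```python
import locale
import sys


def describe_io_encoding():
    """Collect the encodings Python derived from the current environment."""
    return {
        # Encoding used by print() when writing to standard output.
        "stdout": sys.stdout.encoding,
        # Encoding the locale settings suggest for text data.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used to convert file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
    }


info = describe_io_encoding()
for name, value in sorted(info.items()):
    print("{}: {}".format(name, value))
```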


Source: https://stackoverflow.com/questions/41408791/python-3-unicodeencodeerror-ascii-codec-cant-encode-characters
