I failed to set the character encoding in the Python terminal on Windows. According to the official guide, it's a piece of cake:
# -*- coding: utf-8 -*-
It produces mojibake because '' is a bytestring literal in Python 2 (unless from __future__ import unicode_literals is used). You are printing UTF-8 bytes (the source code encoding) to a Windows console that uses some other character encoding (you see mojibake precisely because the two encodings differ):
>>> print(u'Русский'.encode('utf-8').decode('cp866'))
╨а╤Г╤Б╤Б╨║╨╕╨╣
The solution is to print Unicode instead as @JBernardo suggested:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
print(u'Русский')
It works if the console encoding supports Cyrillic letters, e.g., if it is cp866.
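To check ahead of time whether a given console encoding can represent a string, one option is simply to try encoding it. A minimal sketch, assuming a hypothetical helper named can_encode (it is not part of any library) and a fallback to ASCII when sys.stdout.encoding is unset:

```python
# -*- coding: utf-8 -*-
import sys


def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding.

    can_encode is a hypothetical helper, for illustration only.
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# sys.stdout.encoding may be None (e.g. when output is piped in Python 2).
console_encoding = getattr(sys.stdout, "encoding", None) or "ascii"

cyrillic_ok = can_encode(u"Русский", "cp866")   # cp866 contains Cyrillic letters
cyrillic_bad = can_encode(u"Русский", "cp437")  # cp437 does not
```

Calling can_encode(u'Русский', console_encoding) before printing tells you whether print(u'Русский') is likely to succeed on the current console.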
If you want to redirect the output to a file, you could use the PYTHONIOENCODING environment variable to set the character encoding used by Python for I/O:
Z:\> set PYTHONIOENCODING=utf-8
Z:\> python your_script.py > output.utf-8.txt
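If setting an environment variable is inconvenient, the script could also write the file itself with an explicit encoding. A minimal sketch, assuming a hypothetical helper write_text and a throwaway temporary file used only for the demonstration; io.open accepts the encoding argument in both Python 2 and Python 3:

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Write Unicode text to path using an explicit encoding.

    write_text is a hypothetical helper, for illustration only.
    """
    with io.open(path, "w", encoding=encoding) as f:
        f.write(text)


# Demonstration: write to a temporary file and read the raw bytes back.
fd, path = tempfile.mkstemp(suffix=".utf-8.txt")
os.close(fd)
try:
    write_text(path, u"Русский")
    with io.open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)
```

Here the file contents do not depend on the console code page or on PYTHONIOENCODING, because the encoding is chosen at the point where the file is opened.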
If you want to print Unicode characters that can't be represented using the console encoding (OEM code page), then you could install the win-unicode-console Python package:
Z:\> py -m pip install win_unicode_console
Z:\> py -m run your_script.py
In case anyone else lands on this page while searching: the easiest fix is to set the Windows terminal code page:
CHCP 65001
or, for PowerShell, start it with
powershell.exe -NoExit /c "chcp.com 65001"
(from "Is there a Windows command shell that will display Unicode characters?")
Update: See J.F. Sebastian's answer for a better explanation and a better solution.
# -*- coding: utf-8 -*-
sets the source file's encoding, not the output encoding.
You have to encode the string just before printing it with the exact same encoding that your terminal is using. In your case, I'm guessing that your code page is Cyrillic (cp866). Therefore,
print u'Русский'.encode("cp866")
You should use Unicode:
print u'Русский'
or switch to Python 3 (Unicode by default).
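In either version, the distinction that matters is between text and its encoded bytes. A small sketch of the difference, using only built-in string methods (the variable names are illustrative):

```python
# -*- coding: utf-8 -*-
text = u"Русский"            # Unicode text: 7 characters
data = text.encode("utf-8")  # encoded bytes: 2 bytes per Cyrillic letter

n_chars = len(text)
n_bytes = len(data)

# Decoding with the same encoding that produced the bytes restores the text.
round_trip = data.decode("utf-8")
```

Printing text lets Python pick the console encoding; printing data sends those exact bytes, which only looks right if the console happens to use the same encoding.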