In Python, strings in Japanese, Chinese, and Korean are not printed correctly. For example, "hello"
in Japanese, Korean, and Chinese is:
こん
What you see is the difference between printing a string and printing a list of strings.
Or more generally, the difference between an object's "informal" and "official" string representation (see documentation).
In the first case, the unicode string will be printed correctly, as you would expect, with the unicode characters.
In the second case, the items of the list will be printed using their representation and not their string value.
for line in f.readlines():
    print line
is the first (good) case, and
print f.readlines()
is the second case.
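The two cases can be reproduced without a file. The sketch below is written for Python 3 (the snippets in this thread are Python 2), and `lines` is a made-up stand-in for whatever f.readlines() would return. It checks that the text printed for a list is assembled from repr() of each item, and shows that joining the items first gives a plain string to print.

```python
# Hypothetical stand-in for what f.readlines() might return.
lines = ['안녕하세요\n', '你好\n']

# Text that print(lines) would show: built from repr() of every item.
list_text = str(lines)
from_reprs = '[' + ', '.join(repr(line) for line in lines) + ']'

# Joining the items gives one ordinary string, printed via str().
joined = ''.join(lines)

print(list_text == from_reprs)  # the list form is made of item reprs
print(joined, end='')           # each greeting on its own line
```

Joining is only one option; looping over the items and printing each one, as in the first case, works just as well.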
You can check the difference by this example:
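a = u'ð€œłĸªßð'
print a             # prints the characters themselves
print a.__repr__()  # prints the escaped representation, same as repr(a)
l = [a, a]
print l             # each item is shown via its repr()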
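The snippet above is Python 2, where repr() of a unicode string escapes every non-ASCII character. As a hedged aside for Python 3 readers: there repr() keeps printable non-ASCII characters, and the built-in ascii() produces the escaped form instead. A minimal sketch, assuming Python 3:

```python
a = 'ð€œłĸªßð'

plain = str(a)      # the characters themselves
shown = repr(a)     # Python 3: same characters, wrapped in quotes
escaped = ascii(a)  # backslash escapes, ASCII only

print(plain)
print(shown)
print(escaped)
print([a, a])       # list items are still shown with repr()
```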
This shows the difference between the special __str__() and __repr__() methods, which you can experiment with yourself:
class Person(object):
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

    def __repr__(self):
        return ''.format(self.name)

p = Person('Donald')
print p  # Prints 'Donald' using __str__
p        # On the command line, prints '' using __repr__
I.e., the value you see when simply typing an object name on the console is defined by __repr__, while what you see when you use print is defined by __str__.
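For a version whose __repr__ output is clearly visible, here is a minimal sketch for Python 3. The 'Person(...)' format is an illustrative choice, not taken from the answer above; it follows the common convention of making repr() look like the constructor call.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        # Informal form, used by print() and str().
        return self.name

    def __repr__(self):
        # Illustrative format (an assumption): mimic the constructor call.
        return 'Person({!r})'.format(self.name)


p = Person('Donald')
print(p)         # uses __str__
print(repr(p))   # uses __repr__, as the interactive prompt does
print([p, p])    # containers use __repr__ for their items
```

The last line ties back to the original problem: a list always shows its items through __repr__, whatever __str__ returns.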