I am dealing with unicode strings returned by the python-lastfm library.
I assume somewhere on the way, the library gets the encoding wrong and returns a unicode str
I stumble upon this bug myself while processing a file containing german words that I was unaware it has been encoded in UTF-8. The problem manifest itself when I start processing words and some of them would't show the decoding error.
# python
Python 2.7.12 (default, Aug 22 2019, 16:36:40)
>>> utf8_word = u"Gl\xfcck"
>>> print("Word read was: {}".format(utf8_word))
Traceback (most recent call last):
File "", line 1, in
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfc' in position 2: ordinal not in range(128)
I solve the error calling the encode method on the string:
>>> print("Word read was: {}".format(utf8_word.encode('utf-8')))
Word read was: Glück