I want to read a file that contains German characters, not only ASCII ones. I found that I can do it like this:
>>> import codecs
>
You need to know which character encoding the text is encoded in. If you don't know that beforehand, you can try guessing it with the chardet module. First install it:
$ pip install chardet
Then read the file in binary mode and pass the raw bytes to chardet.detect, for example:
>>> import chardet
>>> chardet.detect(open("file.txt", "rb").read())
{'confidence': 0.9690625, 'encoding': 'utf-8'}
Then open the file with the detected encoding so the bytes are decoded to Unicode as they are read:
>>> import codecs
>>> with codecs.open('file.txt', 'r', encoding='utf-8') as f:
...     lines = f.readlines()
...
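If installing chardet is not an option, a rougher alternative is to try a short list of likely encodings in order and keep the first one that decodes without error. The sketch below is only an illustration of that idea, not what chardet does internally: the function name read_lines and the default candidate list are assumptions made up for this example. latin-1 is placed last because it accepts any byte sequence, so it acts as a catch-all that may decode to the wrong characters if the real encoding is something else.

```python
def read_lines(path, candidates=("utf-8", "latin-1")):
    """Return (lines, encoding) for the first candidate that decodes cleanly.

    read_lines and the default candidates are hypothetical choices for this
    sketch; adjust the list to the encodings you actually expect.
    """
    # Read raw bytes first so the same data can be decoded several times.
    with open(path, "rb") as f:
        raw = f.read()
    for enc in candidates:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next candidate
        # splitlines() drops the line endings, unlike readlines().
        return text.splitlines(), enc
    raise ValueError("none of the candidate encodings could decode the file")
```

Calling read_lines('file.txt') returns the decoded lines together with the encoding that worked, so you can check which guess was used.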