I have a socket server that is supposed to receive UTF-8 valid characters from clients.
The problem is some clients (mainly hackers) are sending all the wrong kind of
This type of issue crops up for me now that I've moved to Python 3. I had no idea Python 2 was simply steam rolling any issues with file encoding.
I found this nice explanation of the differences and how to find a solution after none of the above worked for me.
http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html
In short, to make Python 3 behave as similarly as possible to Python 2 use:
with open(filename, encoding="latin-1") as datafile:
# work on datafile here
However, read the article, there is no one size fits all solution.