Difference between open and codecs.open in Python

前端 未结 8 527
一整个雨季
一整个雨季 2020-12-04 09:53

There are two ways to open a text file in Python:

f = open(filename)

And

import codecs
f = codecs.open(filename, encoding=\         


        
8条回答
  •  感动是毒
    2020-12-04 10:09

    Since Python 2.6, a good practice is to use io.open(), which also takes an encoding argument, like the now obsolete codecs.open(). In Python 3, io.open is an alias for the open() built-in. So io.open() works in Python 2.6 and all later versions, including Python 3.4. See docs: http://docs.python.org/3.4/library/io.html

    Now, for the original question: when reading text (including "plain text", HTML, XML and JSON) in Python 2 you should always use io.open() with an explicit encoding, or open() with an explicit encoding in Python 3. Doing so means you get correctly decoded Unicode, or get an error right off the bat, making it much easier to debug.

    Pure ASCII "plain text" is a myth from the distant past. Proper English text uses curly quotes, em-dashes, bullets, € (euro signs) and even diaeresis (¨). Don't be naïve! (And let's not forget the Façade design pattern!)

    Because pure ASCII is not a real option, open() without an explicit encoding is only useful to read binary files.

提交回复
热议问题