Unexpected behavior of universal newline mode with StringIO and csv modules

♀尐吖头ヾ submitted on 2019-12-12 11:50:33

Question


Consider the following (Python 3.2 under Windows):

>>> import io
>>> import csv
>>> output = io.StringIO()         # default parameter newline=None
>>> csvdata = [1, 'a', 'Whoa!\nNewlines!']
>>> writer = csv.writer(output, quoting=csv.QUOTE_NONNUMERIC)
>>> writer.writerow(csvdata)
25
>>> output.getvalue()
'1,"a","Whoa!\nNewlines!"\r\n'

Why is there a single \n - shouldn't it have been converted to \r\n since universal newlines mode is enabled?

The docs for the newline argument say: "With this enabled, on input, the lines endings \n, \r, or \r\n are translated to \n before being returned to the caller. Conversely, on output, \n is translated to the system default line separator, os.linesep."


Answer 1:


The "single" \n occurs as a data character inside the third field. Consequently that field is quoted so that a csv reader will treat it as part of the data. It is NOT a "line terminator" (should be called a row separator) or part thereof. To get a better appreciation of the quoting, remove the quoting=csv.QUOTE_NONNUMERIC.
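As a rough illustration of that point, the sketch below writes the same row with the default quoting (csv.QUOTE_MINIMAL) and then reads it back. The names row, buf, written and parsed are only for this example, and newline='' is passed to StringIO so that StringIO itself leaves every line ending alone.

```python
import csv
import io

row = [1, 'a', 'Whoa!\nNewlines!']

# newline='' keeps StringIO from translating any line endings.
buf = io.StringIO(newline='')
csv.writer(buf).writerow(row)   # default quoting is csv.QUOTE_MINIMAL
written = buf.getvalue()
print(repr(written))

# Read the text back: the quoted \n stays inside the third field
# instead of ending the row.
parsed = list(csv.reader(io.StringIO(written, newline='')))
print(parsed)
```

Only the third field is quoted now, because it is the only one that needs it, and the reader returns a single row of three fields (all as strings, since the default reader does no numeric conversion).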

The \r\n is produced because csv terminates rows with the dialect.lineterminator, whose default is \r\n. The csv writer emits that terminator itself, so the row ending does not come from any "universal newlines" translation; that setting plays no part here.
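A small sketch of where the row ending comes from, assuming a helper named write_row that exists only for this example: changing lineterminator changes the end of the row, while the \n inside the quoted field is untouched in both cases.

```python
import csv
import io

row = [1, 'a', 'Whoa!\nNewlines!']

def write_row(**fmtparams):
    """Write `row` once and return the resulting text (illustrative helper)."""
    buf = io.StringIO(newline='')   # no translation by StringIO itself
    csv.writer(buf, **fmtparams).writerow(row)
    return buf.getvalue()

default_out = write_row()                    # uses the dialect default
unix_out = write_row(lineterminator='\n')    # explicit row terminator

print(repr(csv.excel.lineterminator))
print(repr(default_out))
print(repr(unix_out))
```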

Update

The 2.7 and 3.2 docs for io.StringIO are virtually identical as far as the newline arg is concerned.

The newline argument works like that of TextIOWrapper. The default is to do no newline translation.

We'll examine the first sentence below. The second sentence is true for output, depending on your interpretation of "default" and "newline translation".

TextIOWrapper docs:

newline can be None, '', '\n', '\r', or '\r\n'. It controls the handling of line endings. If it is None, universal newlines is enabled. With this enabled, on input, the lines endings '\n', '\r', or '\r\n' are translated to '\n' before being returned to the caller. Conversely, on output, '\n' is translated to the system default line separator, os.linesep. If newline is any other of its legal values, that newline becomes the newline when the file is read and it is returned untranslated. On output, '\n' is converted to the newline.

Python 3.2 on Windows:

>>> from io import StringIO as S
>>> import os
>>> print(repr(os.linesep))
'\r\n'
>>> ss = [S()] + [S(newline=nl) for nl in (None, '', '\n', '\r', '\r\n')]
>>> for x, s in enumerate(ss):
...     m = s.write('foo\nbar\rzot\r\n')
...     v = s.getvalue()
...     print(x, m, len(v), repr(v))
...
0 13 13 'foo\nbar\rzot\r\n'
1 13 12 'foo\nbar\nzot\n'
2 13 13 'foo\nbar\rzot\r\n'
3 13 13 'foo\nbar\rzot\r\n'
4 13 13 'foo\rbar\rzot\r\r'
5 13 15 'foo\r\nbar\rzot\r\r\n'
>>>

Line 0 shows that the "default" that you get with no newline arg involves no translation of \n (or any other character). It is certainly NOT converting '\n' to os.linesep.

Line 1 shows that what you get with newline=None (should be the same as line 0, shouldn't it??) is in effect INPUT universal newlines translation -- bizarre!

Line 2: newline='' makes no change, like line 0. It is certainly NOT converting '\n' to ''.

Lines 3, 4, and 5: as the docs say, '\n' is converted to the value of the newline arg.
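If the goal in the original question is to have the embedded \n come out as \r\n as well, one possible approach (a sketch, not something the answer itself proposes) is to let StringIO(newline='\r\n') do the conversion and tell csv to end rows with a plain '\n'. Otherwise the default '\r\n' row terminator is translated too and becomes '\r\r\n', just as in line 5 above.

```python
import csv
import io

row = [1, 'a', 'Whoa!\nNewlines!']

# StringIO converts every '\n' it is given to '\r\n'.
# csv ends the row with '\n', which StringIO then converts as well.
buf = io.StringIO(newline='\r\n')
csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator='\n').writerow(row)
translated = buf.getvalue()

# Same StringIO setting, but with csv's default '\r\n' row terminator:
# the '\n' half of the terminator is converted again.
buf2 = io.StringIO(newline='\r\n')
csv.writer(buf2, quoting=csv.QUOTE_NONNUMERIC).writerow(row)
doubled = buf2.getvalue()

print(repr(translated))
print(repr(doubled))
```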

The equivalent Python 2.X code produces equivalent results with Python 2.7.2.

Update 2: For consistency with built-in open(), the default should be os.linesep, as documented. To get the no-translation-on-output behaviour, use newline=''. Note: the open() docs are much clearer. I'll submit a bug report tomorrow.
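For comparison with open(), here is a minimal sketch of the no-translation case on a real file, using a temporary file whose path is generated only for this example. With newline='' the bytes on disk are exactly what csv wrote, on any platform.

```python
import csv
import os
import tempfile

row = [1, 'a', 'Whoa!\nNewlines!']

# Create a throwaway file just for this demonstration.
fd, path = tempfile.mkstemp(suffix='.csv')
os.close(fd)
try:
    # newline='' disables newline translation on output.
    with open(path, 'w', newline='') as f:
        csv.writer(f, quoting=csv.QUOTE_NONNUMERIC).writerow(row)
    # Read the raw bytes back to see what actually reached the disk.
    with open(path, 'rb') as f:
        raw = f.read()
finally:
    os.remove(path)

print(raw)
```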




Answer 2:


From the docs for StringIO:

The newline argument works like that of TextIOWrapper. The default is to do no newline translation.

So StringIO is not doing any newline translation normally. That default makes sense - StringIO isn't writing to disk, so it doesn't need to translate to the platform-specific newlines.

As John pointed out, the csv module handles row endings itself (it writes the dialect's lineterminator after each row), but it leaves newlines inside strings alone.



Source: https://stackoverflow.com/questions/9157623/unexpected-behavior-of-universal-newline-mode-with-stringio-and-csv-modules
