Adding a line-terminator in pandas ends up adding another \r

故里飘歌 2020-12-21 23:56

I am able to load a CSV file into a pandas DataFrame without problems using the pandas defaults:

df = pd.read_csv(file)

>>> df
   distance  recession_velocity
0          


        
2 Answers
  • 2020-12-21 23:58

    In text mode, Python's file objects automatically translate \r\n to \n. When you give read_csv a path, it does its own file handling and sees the raw \r\n instead. If you pass lineterminator="\n", it splits on that one character only, so the \r is left at the end of every line.

    If you don't pass the lineterminator parameter at all, it will guess the line-ending style. You can also pass in a file object instead of a path. This may slow things down a bit, but it gives you the same translation behaviour that you see with a plain text-mode read.
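A minimal sketch of the three cases described above, assuming a small made-up CSV with Windows line endings (the file name example.csv and the sample rows are only for illustration). It compares a path with an explicit lineterminator="\n", a path with no lineterminator, and a text-mode file object:

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data with Windows (\r\n) line endings.
raw = b"distance,recession_velocity\r\n0.032,170\r\n0.034,290\r\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "wb") as fh:
        fh.write(raw)

    # 1. Path plus explicit "\n": pandas splits on "\n" only,
    #    so the "\r" stays attached to the last field of each line.
    explicit = pd.read_csv(path, lineterminator="\n")

    # 2. Path with no lineterminator: pandas works out the line endings itself.
    default = pd.read_csv(path)

    # 3. Text-mode file object: Python translates \r\n to \n
    #    before pandas ever sees the data.
    with open(path, "r") as fh:
        via_file = pd.read_csv(fh, lineterminator="\n")

print(list(explicit.columns))
print(list(default.columns))
print(list(via_file.columns))
```

In the first case the stray \r should show up at the end of the last column name; the other two should give clean names.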

  • 2020-12-22 00:02

    To follow up on @filmor's answer: to see what is actually in the file, without Python's newline translation, open it in binary mode. For example:

    >>> open('example.csv','r+b').read()
    b'distance,recession_velocity\r\n# not a row,\r\n0.032,170\r\n0.034,290\r\n0.214,-130\r\n0.263,-70\r\n0.275,-185\r\n0.275,-220\r\n0.4,200\r\n0.5,290\r\n0.5,270\r\n0.6,200\r\n0.8,300\r\n0.9,-30\r\n0.9,650\r\n0.9,150\r\n0.9,500\r\n1,920\r\n1.1,450\r\n1.1,500\r\n1.4,500\r\n1.7,960\r\n2,500\r\n2,850\r\n2,800\r\n2,1090\r\n# Total,527'
    

    Here you can see that the line separator is \r\n, even though without binary mode it shows up as only \n. However, pandas doesn't yet support multi-character line terminators (lineterminator has to be a single character), so passing lineterminator="\r\n" introduces another issue.
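One possible workaround, sketched under the assumption that the file is small enough to hold in memory: collapse \r\n to \n yourself and hand pandas the cleaned bytes. The sample data is made up for illustration, and the try/except only records whether the default parser rejects a two-character terminator:

```python
import io

import pandas as pd

# Hypothetical sample data with Windows (\r\n) line endings.
raw = b"distance,recession_velocity\r\n0.032,170\r\n0.034,290\r\n"

# A two-character terminator is not accepted by the default (C) parser.
try:
    pd.read_csv(io.BytesIO(raw), lineterminator="\r\n")
    rejected = False
except ValueError:
    rejected = True

# Workaround: normalise the line endings first, so that a
# single-character terminator is enough.
normalised = raw.replace(b"\r\n", b"\n")
df = pd.read_csv(io.BytesIO(normalised), lineterminator="\n")

print(rejected)
print(list(df.columns))
```

Simply leaving out lineterminator is usually the easier fix; normalising by hand is mainly useful when the parameter has to be set explicitly for some other reason.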
