Using pandas to read text file with leading whitespace gives a NaN column

Submitted by ⅰ亾dé卋堺 on 2021-02-05 20:35:23

Question


I am using pandas.read_csv to read a whitespace-delimited file. Every line in the file starts with a variable number of whitespace characters (the numbers are right-aligned). When I read this file, I get an extra leading column of NaN. Why does this happen, and what is the best way to prevent it?

Example:

Text file:

  9.0  3.3 4.0
 32.3 44.3 5.1
  7.2  1.1 0.9

Command:

import pandas as pd
pd.read_csv("test.txt",delim_whitespace=True,header=None)

Output:

    0     1     2    3
0 NaN   9.0   3.3  4.0
1 NaN  32.3  44.3  5.1
2 NaN   7.2   1.1  0.9

Answer 1:


FWIW I tend to use sep=r"\s+" instead of delim_whitespace=True, and it doesn't suffer from the same problem:

>>> pd.read_csv("wspace.csv", header=None, delim_whitespace=True)
    0     1     2    3
0 NaN   9.0   3.3  4.0
1 NaN  32.3  44.3  5.1
2 NaN   7.2   1.1  0.9
>>> pd.read_csv("wspace.csv", header=None, sep=r"\s+")
      0     1    2
0   9.0   3.3  4.0
1  32.3  44.3  5.1
2   7.2   1.1  0.9
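
For anyone who wants to try this without creating a file, here is a minimal self-contained sketch. It assumes the sample data from the question is held in a string (the names raw, cleaned, df and df_stripped are only illustrative), and it also shows a fallback that strips each line by hand before parsing, in case you are stuck on a version where the leading whitespace still produces an extra column. Note that recent pandas releases deprecate delim_whitespace in favor of sep=r"\s+", so the second form is the one to prefer anyway.

```python
import io

import pandas as pd

# Sample data from the question: right-aligned numbers, so every line
# starts with a variable amount of whitespace.
raw = "  9.0  3.3 4.0\n 32.3 44.3 5.1\n  7.2  1.1 0.9\n"

# Split on runs of whitespace using a regex separator.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", header=None)
print(df)

# Fallback (an assumption about what you might need, not something from the
# original answer): strip each line yourself, then parse the cleaned text.
cleaned = "\n".join(line.strip() for line in raw.splitlines())
df_stripped = pd.read_csv(io.StringIO(cleaned), sep=r"\s+", header=None)
print(df_stripped)
```

Both frames should come out with three numeric columns and no leading NaN column.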


Source: https://stackoverflow.com/questions/16022094/using-pandas-to-read-text-file-with-leading-whitespace-gives-a-nan-column
