How to replace a value with NaN in pandas?

无人共我 2020-12-01 06:57

I am new to pandas and am trying to load a CSV file into a DataFrame. Missing values in my data are represented as ?, and I am trying to replace them with the standard missing value, NaN.

6 answers
  •  死守一世寂寞
    2020-12-01 07:24

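    You can replace it in just that column using replace. Note that replace returns a new object rather than modifying the data in place, so assign the result back:

    df['workclass'] = df['workclass'].replace('?', np.nan)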
    

    or for the whole df:

    df = df.replace('?', np.nan)
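
A minimal sketch of the column-level replacement, assuming a small made-up DataFrame (the values and the `age` column are illustrative, not taken from the asker's file). It shows that the result of `replace` has to be assigned back for the change to stick:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"age": [54, 39], "workclass": ["?", "Private"]})

# replace returns a new Series, so assign it back to the column.
df["workclass"] = df["workclass"].replace("?", np.nan)

# True marks the cell that used to hold "?".
print(df["workclass"].isna().tolist())
```

This only matches cells that are exactly `?`; a cell such as `' ?'` with a leading space is left untouched.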
    

    UPDATE

    OK, I figured out your problem. By default, if you don't pass a separator, read_csv uses a comma ',' as the separator.

    Your data and in particular one example where you have a problematic line:

    54, ?, 180211, Some-college, 10, Married-civ-spouse, ?, Husband, Asian-Pac-Islander, Male, 0, 0, 60, South, >50K
    

    The separator in that line is actually a comma followed by a space. When you passed na_values=['?'], nothing matched because every value after the first has a leading space, which is easy to overlook.

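    If you change your line to this (a regex separator needs the Python parsing engine, and the raw string avoids an invalid-escape warning):

    rawfile = pd.read_csv(filename, header=None, names=DataLabels, sep=r',\s', na_values=["?"], engine='python')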
    

    then you should find that it all works:

    27      54               NaN  180211  Some-college             10 
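
The effect can be reproduced on the single problematic line quoted above. This is a sketch under a few assumptions: the line is read from an in-memory string instead of the asker's file, no column names are passed (so columns are numbered 0 to 14), and `skipinitialspace=True` is shown as an alternative that was not part of the original answer:

```python
import io

import pandas as pd

# The problematic line from the answer, held in memory for the demo.
raw = (
    "54, ?, 180211, Some-college, 10, Married-civ-spouse, ?, Husband, "
    "Asian-Pac-Islander, Male, 0, 0, 60, South, >50K\n"
)

# Default separator ",": the field is " ?" (leading space), so it is not treated as NA.
naive = pd.read_csv(io.StringIO(raw), header=None, na_values=["?"])

# Regex separator ", " (comma plus one whitespace character), as in the answer.
fixed = pd.read_csv(
    io.StringIO(raw), header=None, sep=r",\s", engine="python", na_values=["?"]
)

# Alternative: keep "," as the separator and skip the space that follows it.
alt = pd.read_csv(
    io.StringIO(raw), header=None, skipinitialspace=True, na_values=["?"]
)

print(repr(naive.iloc[0, 1]))   # still the string ' ?'
print(pd.isna(fixed.iloc[0, 1]), pd.isna(alt.iloc[0, 1]))
```

The `skipinitialspace=True` variant stays on the default C parser, which may matter for large files.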
    
