Is there a simple way to change a column of yes/no to 1/0 in a Pandas dataframe?

孤街醉人 submitted on 2019-11-28 04:43:45

method 1

sample.housing.eq('yes').mul(1)

method 2

pd.Series(np.where(sample.housing.values == 'yes', 1, 0),
          sample.index)

method 3

sample.housing.map(dict(yes=1, no=0))

method 4

pd.Series(map(lambda x: dict(yes=1, no=0)[x],
              sample.housing.values.tolist()), sample.index)

method 5

pd.Series(np.searchsorted(['no', 'yes'], sample.housing.values), sample.index)

All yield

0    0
1    0
2    1
3    0
4    0
5    0
6    0
7    0
8    1
9    1
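
To check that these methods agree, here is a minimal sketch. The `sample` frame is a hypothetical reconstruction chosen to match the output above; the original data is not shown. The `results` and `expected` names are also illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample, chosen so that it reproduces the output shown above.
sample = pd.DataFrame(
    {"housing": ["no", "no", "yes", "no", "no", "no", "no", "no", "yes", "yes"]}
)

results = {
    # method 1: boolean comparison, then multiply by 1 to get integers
    "eq_mul": sample.housing.eq("yes").mul(1),
    # method 2: vectorised if/else on the underlying array
    "np_where": pd.Series(
        np.where(sample.housing.values == "yes", 1, 0), sample.index
    ),
    # method 3: dictionary lookup per element
    "map_dict": sample.housing.map(dict(yes=1, no=0)),
    # method 5: position of each value in the sorted list ['no', 'yes']
    "searchsorted": pd.Series(
        np.searchsorted(["no", "yes"], sample.housing.values), sample.index
    ),
}

expected = [0, 0, 1, 0, 0, 0, 0, 0, 1, 1]
all_agree = all(r.tolist() == expected for r in results.values())
print(all_agree)
```

Note that method 5 relies on 'no' sorting before 'yes', so it only works when the lookup list is in sorted order.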

timing (given sample)

timing (long sample)
sample = pd.DataFrame(dict(housing=np.random.choice(('yes', 'no'), size=100000)))
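
Outside IPython, the comparison could be reproduced roughly as follows with the standard `timeit` module. This is only a sketch: the `candidates` and `timings` names are illustrative, only three of the methods are included, and the numbers will depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Long sample, built the same way as in the answer.
sample = pd.DataFrame(dict(housing=np.random.choice(("yes", "no"), size=100000)))

# Illustrative selection of the methods above, wrapped as zero-argument callables.
candidates = {
    "eq_mul": lambda: sample.housing.eq("yes").mul(1),
    "np_where": lambda: pd.Series(
        np.where(sample.housing.values == "yes", 1, 0), sample.index
    ),
    "map_dict": lambda: sample.housing.map(dict(yes=1, no=0)),
}

# Best of 3 repeats, 5 calls each, reported as seconds per call.
timings = {
    name: min(timeit.repeat(fn, number=5, repeat=3)) / 5
    for name, fn in candidates.items()
}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```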

Try this:

sampleDF['housing'] = sampleDF['housing'].map({'yes': 1, 'no': 0})

Alternatively, compare the original column against 'yes':

# produces True/False
sampleDF['housing'] = sampleDF['housing'] == 'yes'

The above returns True/False values, which behave like 1/0 respectively: booleans support sum and similar aggregations. If you really need 1/0 values, you can use the following.

housing_map = {'yes': 1, 'no': 0}
sampleDF['housing'] = sampleDF['housing'].map(housing_map)

Using apply with a lambda, timed in IPython:

%%timeit
sampleDF['housing'] = sampleDF['housing'].apply(lambda x: 0 if x == 'no' else 1)

1.84 ms ± 56.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This replaces 'no' with 0 and every other value (here, 'yes') with 1 in the specified df column.
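
The dictionary `map` and the `apply` lambda differ when the column contains something other than 'yes' or 'no'. A small sketch, assuming a made-up extra value 'unknown' that does not appear in the question:

```python
import pandas as pd

# "unknown" is a made-up value used only to show the difference.
housing = pd.Series(["yes", "no", "unknown"])

# map with a dict: values missing from the dict become NaN
mapped = housing.map({"yes": 1, "no": 0})

# the lambda: everything that is not 'no' becomes 1
applied = housing.apply(lambda x: 0 if x == "no" else 1)

print(mapped.isna().tolist())
print(applied.tolist())
```

If unexpected values are possible, the NaN from `map` at least makes them visible, whereas the lambda silently counts them as 1.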

Generic way:

import pandas as pd
string_data = string_data.astype('category')
numbers_data = string_data.cat.codes

reference: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.astype.html
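
As a rough illustration of how the codes are assigned, here is the same approach on made-up data. When pandas infers the categories from strings it sorts them, so 'no' gets code 0 and 'yes' gets code 1; the `labels` name is just for illustration.

```python
import pandas as pd

string_data = pd.Series(["no", "yes", "yes", "no"])  # made-up data
categories = string_data.astype("category")

# Integer code of each element, in the order of the inferred categories.
codes = categories.cat.codes

# The label that each code stands for, in code order.
labels = list(categories.cat.categories)

print(labels)
print(codes.tolist())
```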

You can convert a series from Boolean to integer explicitly:

sampleDF['housing'] = sampleDF['housing'].eq('yes').astype(int)
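
A quick sketch on made-up data, showing that the boolean column and its integer version carry the same information:

```python
import pandas as pd

sampleDF = pd.DataFrame({"housing": ["no", "yes", "yes", "no"]})  # made-up data

is_yes = sampleDF["housing"].eq("yes")  # boolean Series
as_int = is_yes.astype(int)             # explicit 1/0 integers

# Summing either form counts the 'yes' rows.
n_yes_bool = int(is_yes.sum())
n_yes_int = int(as_int.sum())

print(as_int.tolist(), n_yes_bool == n_yes_int)
```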

Use sklearn's LabelEncoder

from sklearn.preprocessing import LabelEncoder

lb = LabelEncoder() 
sampleDF['housing'] = lb.fit_transform(sampleDF['housing'])


Try the following:

sampleDF['housing'] = sampleDF['housing'].str.lower().replace({'yes': 1, 'no': 0})

The easy way to do that is to use pandas as below:

housing = pd.get_dummies(sampleDF['housing'], drop_first=True)

After that, drop this field from the main df:

sampleDF.drop('housing', axis=1, inplace=True)

Now merge the new one into your df:

sampleDF = pd.concat([sampleDF, housing], axis=1)
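
Put together on made-up data, the three steps might look like the sketch below. Two things here are assumptions rather than part of the answer: `dtype=int` is passed so the dummy column holds 1/0 rather than booleans, and the resulting column, which `get_dummies` names 'yes', is renamed back to 'housing'. The 'age' column is hypothetical.

```python
import pandas as pd

# Made-up frame; 'age' is a hypothetical extra column.
sampleDF = pd.DataFrame({"housing": ["no", "yes", "no"], "age": [30, 41, 25]})

# drop_first=True drops the 'no' column and keeps a single 'yes' indicator.
housing = pd.get_dummies(sampleDF["housing"], drop_first=True, dtype=int)

# Drop the original column and attach the indicator column.
sampleDF = pd.concat([sampleDF.drop(columns="housing"), housing], axis=1)

# Optional: give the indicator its original name back.
sampleDF = sampleDF.rename(columns={"yes": "housing"})

print(sampleDF)
```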

A simple and intuitive way to convert the whole dataframe to 0's and 1's might be:

sampleDF = sampleDF.replace(to_replace = "yes", value = 1)
sampleDF = sampleDF.replace(to_replace = "no", value = 0)