Question
I'm trying to create a new column in a DataFrame that labels domesticated animals with a 1. I'm using a for loop, but the loop only picks up the last item in the pets list. dog, cat, and gerbil should all be assigned a 1 in the domesticated column. Does anyone have a fix for this, or a better approach?
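import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'creature': ['dog', 'cat', 'gerbil', 'mouse', 'donkey']
    })
pets = ['dog', 'cat', 'gerbil']
# Each iteration reassigns the whole column, so only the last pet ends up flagged
for pet in pets:
    df['domesticated'] = np.where(df['creature']==pet, 1, 0)
df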
Answer 1:
You are setting every non-gerbil entry to 0 in the last loop iteration. When pet is gerbil, all entries that are not equal to gerbil are set to 0, including dog and cat, which overwrites the 1s assigned in earlier iterations. You should check all values in pets at once instead. Try this:
df['domesticated'] = df['creature'].apply(lambda x: 1 if x in pets else 0)
If you want to stick with np.where:
df['domesticated'] = np.where(df['creature'].isin(pets), 1, 0)
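To make the overwrite visible, here is a minimal sketch that records the column after each iteration of the original loop and then compares it with the isin version. The names snapshots, loop_result, and isin_result are only illustrative and are not part of the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'creature': ['dog', 'cat', 'gerbil', 'mouse', 'donkey']})
pets = ['dog', 'cat', 'gerbil']

# Record what the column looks like after each pass of the original loop.
# 'snapshots' is a hypothetical helper dict used only for this illustration.
snapshots = {}
for pet in pets:
    df['domesticated'] = np.where(df['creature'] == pet, 1, 0)
    snapshots[pet] = df['domesticated'].tolist()

# Only the final assignment survives, so only the last pet is flagged.
loop_result = df['domesticated'].tolist()

# Testing membership in the whole list at once flags every pet.
df['domesticated'] = np.where(df['creature'].isin(pets), 1, 0)
isin_result = df['domesticated'].tolist()

for pet, column in snapshots.items():
    print(pet, column)
print('loop:', loop_result)
print('isin:', isin_result)
```

Each snapshot has exactly one 1 in it, which shows that every pass replaces the previous column rather than adding to it.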
Answer 2:
The problem is that every loop iteration resets your results.
df['domesticated'] = df['creature'].isin(pets).astype(int)
   creature  domesticated
0       dog             1
1       cat             1
2    gerbil             1
3     mouse             0
4    donkey             0
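If you would rather keep the for loop, one possible fix is to accumulate the matches instead of reassigning the column on every pass. This is only a sketch: the mask variable is an assumption introduced here, and isin remains the shorter option.

```python
import pandas as pd

df = pd.DataFrame({'creature': ['dog', 'cat', 'gerbil', 'mouse', 'donkey']})
pets = ['dog', 'cat', 'gerbil']

# Start with an all-False mask aligned to the DataFrame's index.
# 'mask' is a hypothetical name used only for this illustration.
mask = pd.Series(False, index=df.index)

# OR each comparison into the mask so earlier matches are kept.
for pet in pets:
    mask |= df['creature'] == pet

# Convert the boolean mask to 1/0 once, after the loop has finished.
df['domesticated'] = mask.astype(int)
print(df)
```

The difference from the original loop is that the column is written once at the end, so no iteration can undo the work of an earlier one.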
Source: https://stackoverflow.com/questions/55271517/for-loop-using-np-where