Question
I have a DataFrame df like this:
Name City
0 sri chennai
1 pedhci pune
2 bahra pune
There is a duplicate value in the City column.
I tried:
df["City"].drop_duplicates()
but it returns only that column, not the whole DataFrame.
My desired output is:
output_df
Name City
0 sri chennai
1 pedhci pune
Answer 1:
You can use:
df2 = df.drop_duplicates(subset='City')
if you want to store the result in a new DataFrame, or:
df.drop_duplicates(subset='City', inplace=True)
if you want to update df in place.
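As a minimal sketch of the difference between the two forms, assuming the sample data from the question is rebuilt by hand (the variable name result is only for illustration): the first call leaves df untouched and returns a new DataFrame, while the call with inplace=True modifies df and returns None.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Name": ["sri", "pedhci", "bahra"],
                   "City": ["chennai", "pune", "pune"]})

# Non-mutating form: df keeps all three rows, df2 holds the deduplicated copy.
df2 = df.drop_duplicates(subset="City")
rows_before = len(df)

# In-place form: df itself is modified and the call returns None,
# so do not assign its result back to df.
result = df.drop_duplicates(subset="City", inplace=True)
```

Because the in-place call returns None, writing df = df.drop_duplicates(subset='City', inplace=True) would replace df with None.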
This produces:
>>> df
City Name
0 chennai sri
1 pune pedhci
2 pune bahra
>>> df.drop_duplicates(subset='City')
City Name
0 chennai sri
1 pune pedhci
This only takes duplicates in City into account; duplicates in Name are ignored. By default the first row for each City value is kept.
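Which of the duplicated rows survives is controlled by the keep parameter of drop_duplicates. The following is a small sketch, again assuming the question's sample data; the variable names are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Name": ["sri", "pedhci", "bahra"],
                   "City": ["chennai", "pune", "pune"]})

# keep="first" is the default: the first row for each City is kept.
keep_first = df.drop_duplicates(subset="City")

# keep="last": the last row for each City is kept instead.
keep_last = df.drop_duplicates(subset="City", keep="last")

# keep=False: every row whose City appears more than once is dropped.
keep_none = df.drop_duplicates(subset="City", keep=False)

print(keep_first)
print(keep_last)
print(keep_none)
```

The original row index is preserved in each result; chain .reset_index(drop=True) if a fresh 0..n-1 index is needed.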
Source: https://stackoverflow.com/questions/45522478/how-to-remove-entire-column-if-a-particular-row-has-duplicate-values-in-a-datafr