Question
I have a DataFrame as follows:
   Name   Age
0  Tom    20
1  nick   21
2
3  krish  19
4  jack   18
5
6  jill   26
7  nick
The desired output is:
   Name   Age
0  Tom    20
1  nick   21
3  krish  19
4  jack   18
6  jill   26
7  nick
The index should not be changed, and if possible I would prefer not to convert the empty strings to NaN. A row should be removed only if all of its columns contain the empty string ''.
Answer 1:
You can do:
# df.eq('') compares every cell of `df` to `''`
# .all(axis=1) checks whether all cells in a row are True
# ~ is the negation operator
mask = ~df.eq('').all(axis=1)
# equivalently, with `ne` for `not equal`:
# mask = df.ne('').any(axis=1)
# mask is a boolean Series of the same length as `df`
# this is called boolean indexing, similar to numpy's:
# it keeps only the rows where the mask is `True`
df = df[mask]
Or in one line:
df = df[~df.eq('').all(axis=1)]
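For reference, here is a small self-contained sketch of the boolean-mask approach. The DataFrame is reconstructed from the question's printout, so the exact dtypes are an assumption; the names `keep` and `result` are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption about the
# original frame); rows 2 and 5 consist entirely of empty strings.
df = pd.DataFrame({
    "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
    "Age": [20, 21, "", 19, 18, "", 26, ""],
})

# True for every row that has at least one cell different from ''.
keep = df.ne("").any(axis=1)
result = df[keep]

print(result.index.tolist())       # [0, 1, 3, 4, 6, 7]
print(repr(result.loc[7, "Age"]))  # ''
```

The original index labels survive because boolean indexing only filters rows, and the empty string in row 7 is left as it is.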
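Answer 2:
If the empty cells were NaN we could simply use dropna, so we can first replace the empty strings with NaN and then drop the rows that have no non-NaN value. Note that any remaining empty strings show up as NaN in the result: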
df.mask(df.eq('')).dropna(thresh=1)
Out[151]:
   Name   Age
0  Tom    20
1  nick   21
3  krish  19
4  jack   18
6  jill   26
7  nick   NaN
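If the NaN in row 7 is unwanted, one possible variation (not part of the answer above, just a sketch) is to use the NaN-converted copy only to decide which rows survive, and then select those rows from the original frame. The sample data and the names `as_nan`, `kept_index`, and `result2` are assumptions for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (an assumption).
df = pd.DataFrame({
    "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
    "Age": [20, 21, "", 19, 18, "", 26, ""],
})

# Empty strings become NaN only in this temporary copy.
as_nan = df.mask(df.eq(""))

# how="all" drops a row only when every cell is NaN,
# which has the same effect here as thresh=1.
kept_index = as_nan.dropna(how="all").index

# Select from the original frame so the remaining '' cells stay untouched.
result2 = df.loc[kept_index]

print(result2.index.tolist())       # [0, 1, 3, 4, 6, 7]
print(repr(result2.loc[7, "Age"]))  # ''
```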
Answer 3:
Empty strings are falsy (they evaluate to False), so removing rows that contain only empty strings amounts to keeping the rows in which at least one field is truthy. Be aware that 0 and False are falsy as well, so this shortcut is only safe when the data contains no such values:
df[df.any(axis=1)]
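or, with the axis passed positionally (this works only in older pandas versions; pandas 2.0 and later accept `axis` only as a keyword for `any`):
df[df.any(1)]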
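The truthiness shortcut treats every falsy value like an empty string. The sketch below uses a hypothetical frame (not the one from the question) in which one row holds an empty name and a genuine zero, to show where it diverges from an explicit comparison against ''.

```python
import pandas as pd

# Hypothetical data: row 1 has an empty name but a real value of 0,
# row 2 is entirely empty strings.
df = pd.DataFrame({"Name": ["Tom", "", ""], "Score": [20, 0, ""]})

# Truthiness test: '' and 0 are both falsy, so row 1 is dropped too.
truthy = df[df.any(axis=1)]

# Explicit comparison: only cells equal to '' count as empty.
explicit = df[df.ne("").any(axis=1)]

print(truthy.index.tolist())    # [0]
print(explicit.index.tolist())  # [0, 1]
```

When zeros or False values can occur in the data, the explicit comparison is the safer choice.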
Source: https://stackoverflow.com/questions/61964116/delete-rows-from-pandas-dataframe-if-all-its-columns-have-empty-string