Question
I want to treat non-aligned or missing (NaN, Inf, -Inf) values as 1s or 0s.
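import numpy as np
import pandas as pd

# df1 and df2 share only column "y" and index labels 'b'..'e'
df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5],
                    "y": [3, 4, 5, 6, 7]},
                   index=['a', 'b', 'c', 'd', 'e'])
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5],
                    "z": [3, 4, 5, 6, 7]},
                   index=['b', 'c', 'd', 'e', 'f'])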
Multiplying the two frames gives the following:
df1 * df2
     x     y    z
a  NaN   NaN  NaN
b  NaN   4.0  NaN
c  NaN   NaN  NaN
d  NaN  18.0  NaN
e  NaN  28.0  NaN
f  NaN   NaN  NaN
I want to ignore NaNs and also treat non-aligned values as 1s in the left DataFrame, the right DataFrame, or both.
E.g.
Case 1: Replace missing or misaligned values in df1 with 1
df1 * df2
     x     y    z
a    1     3  NaN
b    2   4.0  NaN
c    3     5  NaN
d    4  18.0  NaN
e    5  28.0  NaN
f  NaN   NaN  NaN
Case 2: Replace missing or misaligned values in df2 with 1
df1 * df2
     x     y    z
a  NaN   NaN  NaN
b  NaN   4.0    3
c  NaN   NaN    4
d  NaN  18.0    5
e  NaN  28.0    6
f  NaN     5    7
Case 3: Replace any missing or misaligned value with 1 if there is a value in the other DF.
df1 * df2
     x     y    z
a    1     3  NaN
b    2   4.0    3
c    3     5    4
d    4  18.0    5
e    5  28.0    6
f  NaN     5    7
In the case of addition, I want to treat the missing or misaligned values as 0s.
Answer 1:
I think you need DataFrame.mul with fillna or combine_first for cases 1 and 2:
print(df1.mul(df2).fillna(df1))
     x     y    z
a  1.0   3.0  NaN
b  2.0   4.0  NaN
c  3.0   5.0  NaN
d  4.0  18.0  NaN
e  5.0  28.0  NaN
f  NaN   NaN  NaN
print(df1.mul(df2).combine_first(df1))
     x     y    z
a  1.0   3.0  NaN
b  2.0   4.0  NaN
c  3.0   5.0  NaN
d  4.0  18.0  NaN
e  5.0  28.0  NaN
f  NaN   NaN  NaN
print(df1.mul(df2).fillna(df2))
     x     y    z
a  NaN   NaN  NaN
b  NaN   4.0  3.0
c  NaN   NaN  4.0
d  NaN  18.0  5.0
e  NaN  28.0  6.0
f  NaN   5.0  7.0
print(df1.mul(df2).combine_first(df2))
     x     y    z
a  NaN   NaN  NaN
b  NaN   4.0  3.0
c  NaN   NaN  4.0
d  NaN  18.0  5.0
e  NaN  28.0  6.0
f  NaN   5.0  7.0
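A short sketch of why fillna and combine_first give the same tables here, assuming the same df1 and df2 as in the question. The product already carries the union of both frames' labels, so both methods fill the same cells. On frames whose labels differ, fillna keeps the caller's labels while combine_first takes the union. The names product, by_fillna and by_combine are only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [3, 4, 5, 6, 7]},
                   index=["a", "b", "c", "d", "e"])
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5], "z": [3, 4, 5, 6, 7]},
                   index=["b", "c", "d", "e", "f"])

# The product is already indexed by the union of rows and columns.
product = df1.mul(df2)

# On the product, both methods fill NaN cells from df1 in the same way.
by_fillna = product.fillna(df1)
by_combine = product.combine_first(df1)

# On frames with different labels, the resulting shapes differ.
print(df1.fillna(df2).shape)         # (5, 2): keeps df1's labels only
print(df1.combine_first(df2).shape)  # (6, 3): union of both label sets
```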
Solution with fill_value=1 in DataFrame.mul for the case 3 output:
print(df1.mul(df2, fill_value=1))
     x     y    z
a  1.0   3.0  NaN
b  2.0   4.0  3.0
c  3.0   5.0  4.0
d  4.0  18.0  5.0
e  5.0  28.0  6.0
f  NaN   5.0  7.0
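The addition case from the question is not shown above. A minimal sketch, assuming the same fill_value idea is what is wanted: DataFrame.add accepts fill_value just as DataFrame.mul does, so 0 takes the role that 1 plays for multiplication. The name total is only for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [3, 4, 5, 6, 7]},
                   index=["a", "b", "c", "d", "e"])
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5], "z": [3, 4, 5, 6, 7]},
                   index=["b", "c", "d", "e", "f"])

# Missing or non-aligned cells count as 0 when the other frame has a value.
# Cells missing in both frames stay NaN.
total = df1.add(df2, fill_value=0)
print(total)
```

As with multiplication, a cell that has no value in either frame (for example row f of column x) is still NaN in the result.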
Answer 2:
Case 1: Replace missing or misaligned values in df1 with 1
>>> df1.reindex(index=df1.index.union(df2.index),
...             columns=df1.columns.union(df2.columns)).fillna(1)
   x  y  z
a  1  3  1
b  2  4  1
c  3  5  1
d  4  6  1
e  5  7  1
f  1  1  1
Append .mul(df2) to the snippet above if desired.
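A sketch of the full chain for case 1, assuming the same df1 and df2 as in the question. The names idx, cols, df1_filled and result are only for illustration. Note that only df1 is filled here, so any cell that df2 lacks (all of column x, row a, and the NaN at row c of y) still comes out as NaN in the product.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [3, 4, 5, 6, 7]},
                   index=["a", "b", "c", "d", "e"])
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5], "z": [3, 4, 5, 6, 7]},
                   index=["b", "c", "d", "e", "f"])

# Expand df1 to the union of both frames' labels, then fill the gaps with 1.
idx = df1.index.union(df2.index)
cols = df1.columns.union(df2.columns)
df1_filled = df1.reindex(index=idx, columns=cols).fillna(1)

# Multiply by the untouched df2; cells missing from df2 remain NaN.
result = df1_filled.mul(df2)
print(result)
```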
Case 2: Replace missing or misaligned values in df2 with 1
>>> df2.reindex(index=df2.index.union(df1.index),
...             columns=df2.columns.union(df1.columns)).fillna(1)
   x  y  z
a  1  1  1
b  1  1  3
c  1  1  4
d  1  3  5
e  1  4  6
f  1  5  7
Append .mul(df1) to the snippet above if desired.
Case 3: Replace any missing or misaligned value with 1 if there is a value in the other DF.
>>> df1.mul(df2).combine_first(df1).combine_first(df2)
     x   y    z
a    1   3  NaN
b    2   4    3
c    3   5    4
d    4  18    5
e    5  28    6
f  NaN   5    7
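The question also lists Inf and -Inf as missing values, but pandas does not treat them as NaN by default, so fill_value and fillna leave them alone. One possible sketch, assuming it is acceptable to convert infinities to NaN before multiplying. The frames left and right and the helper inf_to_nan are hypothetical and not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frames containing infinities.
left = pd.DataFrame({"y": [2.0, np.inf, 4.0]}, index=["a", "b", "c"])
right = pd.DataFrame({"y": [3.0, 5.0, -np.inf]}, index=["a", "b", "c"])


def inf_to_nan(df):
    """Return a copy of df with +Inf and -Inf replaced by NaN."""
    return df.replace([np.inf, -np.inf], np.nan)


# After the conversion, fill_value=1 treats the former infinities as 1.
product = inf_to_nan(left).mul(inf_to_nan(right), fill_value=1)
print(product)
```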
Source: https://stackoverflow.com/questions/45533971/how-to-treat-nan-or-non-aligned-values-as-1s-or-0s-in-multiplying-pandas-datafra