How to treat NaN or non-aligned values as 1s or 0s when multiplying pandas DataFrames

Submitted by 旧街凉风 on 2019-12-13 13:22:03

Question


I want to treat non-aligned or missing (NaN, Inf, -Inf) values as 1s or 0s.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5],
    "y": [3, 4, 5, 6, 7]},
    index=['a', 'b', 'c', 'd', 'e'])

# df2 shares only column "y" and rows b-e with df1
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5],
    "z": [3, 4, 5, 6, 7]},
    index=['b', 'c', 'd', 'e', 'f'])

Multiplying the two frames gives the following:

df1 * df2
    x     y   z
a NaN   NaN NaN
b NaN   4.0 NaN
c NaN   NaN NaN
d NaN  18.0 NaN
e NaN  28.0 NaN
f NaN   NaN NaN

I want to ignore NaNs and also treat non-aligned values as 1s in the left DataFrame, the right DataFrame, or both.

E.g.

Case 1: Replace missing or misaligned value in df1 with 1

df1 * df2
    x     y   z
a   1     3 NaN
b   2   4.0 NaN
c   3     5 NaN
d   4  18.0 NaN
e   5  28.0 NaN
f NaN   NaN NaN

Case 2: Replace missing or misaligned value in df2 with 1

df1 * df2
    x     y   z
a NaN   NaN NaN
b NaN   4.0   3
c NaN   NaN   4
d NaN  18.0   5
e NaN  28.0   6
f NaN     5   7

Case 3: Replace any missing or misaligned value with 1 if there is a value in the other DF.

df1 * df2
    x     y   z
a   1     3 NaN
b   2   4.0   3
c   3     5   4
d   4  18.0   5
e   5  28.0   6
f NaN     5   7

In the case of addition, I want to treat the missing or misaligned values as 0s.


Answer 1:


I think you need DataFrame.mul followed by fillna or combine_first for cases 1 and 2:

print (df1.mul(df2).fillna(df1))
     x     y   z
a  1.0   3.0 NaN
b  2.0   4.0 NaN
c  3.0   5.0 NaN
d  4.0  18.0 NaN
e  5.0  28.0 NaN
f  NaN   NaN NaN

print (df1.mul(df2).combine_first(df1))
     x     y   z
a  1.0   3.0 NaN
b  2.0   4.0 NaN
c  3.0   5.0 NaN
d  4.0  18.0 NaN
e  5.0  28.0 NaN
f  NaN   NaN NaN

print (df1.mul(df2).fillna(df2))
    x     y    z
a NaN   NaN  NaN
b NaN   4.0  3.0
c NaN   NaN  4.0
d NaN  18.0  5.0
e NaN  28.0  6.0
f NaN   5.0  7.0

print (df1.mul(df2).combine_first(df2))
    x     y    z
a NaN   NaN  NaN
b NaN   4.0  3.0
c NaN   NaN  4.0
d NaN  18.0  5.0
e NaN  28.0  6.0
f NaN   5.0  7.0

For the case 3 output, pass fill_value=1 to DataFrame.mul:

print (df1.mul(df2, fill_value=1))
     x     y    z
a  1.0   3.0  NaN
b  2.0   4.0  3.0
c  3.0   5.0  4.0
d  4.0  18.0  5.0
e  5.0  28.0  6.0
f  NaN   5.0  7.0
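
The question also asks about addition with missing or misaligned values treated as 0s. A minimal sketch, assuming the same sample frames as in the question: DataFrame.add accepts the same fill_value parameter as DataFrame.mul, so passing fill_value=0 should cover that case. The name total is only illustrative.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [3, 4, 5, 6, 7]},
                   index=list("abcde"))
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5], "z": [3, 4, 5, 6, 7]},
                   index=list("bcdef"))

# A cell missing on one side only is treated as 0 before adding.
# A cell missing on both sides (for example row f, column x) stays NaN.
total = df1.add(df2, fill_value=0)
print(total)
```

As with multiplication, fill_value only applies when exactly one of the two operands has a value at a given position.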



Answer 2:


Case 1: Replace missing or misaligned value in df1 with 1

>>> df1.reindex(index=df1.index.union(df2.index), 
                columns=df1.columns.union(df2.columns)).fillna(1)
   x  y  z
a  1  3  1
b  2  4  1
c  3  5  1
d  4  6  1
e  5  7  1
f  1  1  1

Append .mul(df2) to the snippet above if desired.

Case 2: Replace missing or misaligned value in df2 with 1

>>> df2.reindex(index=df2.index.union(df1.index), 
                columns=df2.columns.union(df1.columns)).fillna(1)
   x  y  z
a  1  1  1
b  1  1  3
c  1  1  4
d  1  3  5
e  1  4  6
f  1  5  7

Append .mul(df1) to the snippet above if desired.
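
To see what the two reindex snippets give once the multiplication is appended, here is a sketch using the sample frames from the question. The names filled_df1, filled_df2, prod_a and prod_b are illustrative and not part of the answer. One thing worth checking against the question: with this data, filling df1 and multiplying by df2 seems to reproduce the table listed under Case 2 in the question, while filling df2 and multiplying by df1 reproduces the one under Case 1.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [3, 4, 5, 6, 7]},
                   index=list("abcde"))
df2 = pd.DataFrame({"y": [1, np.nan, 3, 4, 5], "z": [3, 4, 5, 6, 7]},
                   index=list("bcdef"))

# Shared row and column labels of both frames.
rows = df1.index.union(df2.index)
cols = df1.columns.union(df2.columns)

# Gaps in df1 become 1, then multiply by the untouched df2.
filled_df1 = df1.reindex(index=rows, columns=cols).fillna(1)
prod_a = filled_df1.mul(df2)

# Gaps in df2 become 1, then multiply by the untouched df1.
filled_df2 = df2.reindex(index=rows, columns=cols).fillna(1)
prod_b = filled_df2.mul(df1)

print(prod_a)
print(prod_b)
```

Any cell that is missing in the unfilled frame still comes out as NaN, because only one side has been padded with 1s.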

Case 3: Replace any missing or misaligned value with 1 if there is a value in the other DF.

>>> df1.mul(df2).combine_first(df1).combine_first(df2)
    x   y   z
a   1   3 NaN
b   2   4   3
c   3   5   4
d   4  18   5
e   5  28   6
f NaN   5   7
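
The question also lists Inf and -Inf as values to treat as missing, which none of the snippets above handle, since pandas does not count infinities as NaN here. One possible sketch, assuming it is acceptable to convert infinities to NaN first: the helper as_missing and the small frames left and right are hypothetical and only serve as illustration.

```python
import numpy as np
import pandas as pd

def as_missing(df):
    """Return a copy of df with +Inf and -Inf replaced by NaN."""
    return df.replace([np.inf, -np.inf], np.nan)

left = pd.DataFrame({"y": [2.0, np.inf, 4.0]}, index=["a", "b", "c"])
right = pd.DataFrame({"y": [10.0, 3.0, -np.inf]}, index=["a", "b", "c"])

# After the conversion, fill_value=1 treats the former infinities
# like any other missing value.
product = as_missing(left).mul(as_missing(right), fill_value=1)
print(product)
```

The same conversion could be applied before add with fill_value=0 for the addition case.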


Source: https://stackoverflow.com/questions/45533971/how-to-treat-nan-or-non-aligned-values-as-1s-or-0s-in-multiplying-pandas-datafra
