Compute Euclidean distance between rows of two pandas dataframes

Submitted by 二次信任 on 2019-12-17 20:46:58

Question


I have two pandas dataframes, d1 and d2, that look like this:

d1 looks like:

  output   value1   value2   value3
    1           100     103      87
    1           201     97.5     88.9
    1           144     54       85

d2 looks like:

 output   value1   value2   value3
    0           100     103      87
    0           201     97.5     88.9
    0           144     54       85
    0           100     103      87
    0           201     97.5     88.9
    0           144     54       85

The column output has a value of 1 for all rows in d1 and 0 for all rows in d2; it is a grouping variable. I need to find the Euclidean distance between each row of d1 and each row of d2 (not within d1 or within d2). If d1 has m rows and d2 has n rows, the distance matrix will have m rows and n columns.


Answer 1:


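Use scipy.spatial.distance.cdist, which computes the distance between every pair of rows drawn from two inputs:

import pandas as pd
from scipy.spatial.distance import cdist

# iloc[:, 1:] skips the first column (output), which is only a grouping label
ary = cdist(d1.iloc[:, 1:], d2.iloc[:, 1:], metric='euclidean')

pd.DataFrame(ary)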
Out[1274]: 
            0           1          2           3           4          5
0    0.000000  101.167485  65.886266    0.000000  101.167485  65.886266
1  101.167485    0.000000  71.808495  101.167485    0.000000  71.808495
2   65.886266   71.808495   0.000000   65.886266   71.808495   0.000000
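
For anyone who wants to reproduce the result end to end, here is a minimal sketch that rebuilds the sample frames from the question, keeps the original row labels on the distance matrix, and cross-checks cdist against a plain NumPy broadcasting computation. The column name value3, the features list, and the variable names dist and manual are assumptions made for illustration; they are not part of the original question or answer.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Sample data copied from the question.
# "value3" is an assumed name for the third value column.
cols = ["output", "value1", "value2", "value3"]
rows = [[100, 103, 87], [201, 97.5, 88.9], [144, 54, 85]]
d1 = pd.DataFrame([[1] + r for r in rows], columns=cols)       # m = 3 rows
d2 = pd.DataFrame([[0] + r for r in rows * 2], columns=cols)   # n = 6 rows

# Select the numeric feature columns by name instead of by position.
features = ["value1", "value2", "value3"]

# m x n matrix: entry (i, j) is the distance from d1 row i to d2 row j.
dist = pd.DataFrame(
    cdist(d1[features], d2[features], metric="euclidean"),
    index=d1.index,
    columns=d2.index,
)

# Cross-check with NumPy broadcasting: (m, 1, k) - (1, n, k) -> (m, n, k)
a = d1[features].to_numpy(dtype=float)
b = d2[features].to_numpy(dtype=float)
manual = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))

print(dist.round(6))
print(np.allclose(dist.to_numpy(), manual))
```

Passing index=d1.index and columns=d2.index is optional, but it makes it easier to map a distance back to the rows it came from when the frames do not use a default 0..n-1 index. The broadcasting version builds an intermediate m x n x k array, so for large inputs cdist is likely the more memory-friendly choice.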


Source: https://stackoverflow.com/questions/47782104/compute-euclidean-distance-between-rows-of-two-pandas-dataframes
