Python, Pandas: Calculate Spearman Correlation on 4000 observations efficiently

Posted by 旧巷老猫 on 2020-01-14 15:05:08

Question


I have a DataFrame with 2000 rows and 4000 columns (observations). I want to calculate the Spearman correlation row-wise. Currently I'm using:

df.T.corr(method="spearman")

It takes a very long time (20 minutes and still not finished).

Is there a more efficient module?

Can I preprocess the DataFrame to speed things up?

UPDATE: Using scipy.stats.spearmanr is 20x faster

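import pandas as pd
import scipy.stats as scp  # assumed alias; the original snippet does not show its imports
SCC, pval = scp.spearmanr(df, axis=1)
SCC = pd.DataFrame(SCC, index=df.index, columns=df.index)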
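
On the preprocessing question: Spearman correlation is the Pearson correlation of the ranks, so one option is to rank each row once and then hand the ranks to NumPy. The following is only a minimal sketch under the assumption that the data contains no NaN values; the helper name spearman_rows and the small random test frame are made up for illustration, and whether it beats scipy.stats.spearmanr on the real 2000 x 4000 frame would need to be timed.

```python
import numpy as np
import pandas as pd
from scipy import stats


def spearman_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Row-wise Spearman correlation, computed as Pearson correlation of ranks.

    Hypothetical helper for illustration; assumes df contains no NaN values.
    """
    # Rank the values within each row; ties receive the average rank.
    ranks = df.rank(axis=1).to_numpy()
    # np.corrcoef treats each row as one variable by default.
    corr = np.corrcoef(ranks)
    return pd.DataFrame(corr, index=df.index, columns=df.index)


# Small random frame (5 rows, 40 observations) to compare the three approaches.
rng = np.random.default_rng(0)
small = pd.DataFrame(rng.normal(size=(5, 40)))

fast = spearman_rows(small)
reference = small.T.corr(method="spearman")
scipy_result, _ = stats.spearmanr(small.to_numpy(), axis=1)

print(np.allclose(fast.to_numpy(), reference.to_numpy()))
print(np.allclose(fast.to_numpy(), scipy_result))
```

On the small frame all three approaches should agree up to floating-point error, which makes it easy to check the shortcut before running it on the full data.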

Source: https://stackoverflow.com/questions/41088418/python-pandas-calculate-spearman-correlation-on-4000-observations-efficiently
