Question
I have two time series, stored in columns A and B.
I am computing rolling moving averages of column A over several window lengths, for example 5, 10, 15 and 20.
I want to assign a weight to each of these average columns so that the weighted sum (the sumproduct of the weights and the average columns) has maximum correlation with column B. In other words, how can I implement an Excel-style optimization in Python?
Please have a look at the sample code and suggest the way forward.
import pandas as pd
import numpy as np
dates = pd.date_range('20130101', periods=100)
df = pd.DataFrame(np.random.randn(100, 2), index=dates, columns=list('AB'))
df['sma_5']=df['A'].rolling(5).mean()
df['sma_10']=df['A'].rolling(10).mean()
df['sma_15']=df['A'].rolling(15).mean()
df['sma_20']=df['A'].rolling(20).mean()
w=[0.25,0.25,0.25,0.25]
df['B_friend'] = w[0]*df['sma_5'] + w[1]*df['sma_10'] + w[2]*df['sma_15'] + w[3]*df['sma_20']
I need to optimize the weights w to maximize this correlation:
df['B'].corr(df['B_friend'])
Thanks in advance.
Answer 1:
The scipy.optimize.minimize function looks like what you need: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize
The code would look something like this:
import pandas as pd
import numpy as np
import scipy.optimize as opt
dates = pd.date_range('20130101', periods=100)
df = pd.DataFrame(np.random.randn(100, 2), index=dates, columns=list('AB'))
df['sma_5']=df['A'].rolling(5).mean()
df['sma_10']=df['A'].rolling(10).mean()
df['sma_15']=df['A'].rolling(15).mean()
df['sma_20']=df['A'].rolling(20).mean()
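def fun(x):
    w = x
    B_friend = w[0]*df['sma_5'] + w[1]*df['sma_10'] + w[2]*df['sma_15'] + w[3]*df['sma_20']
    # -np.abs(corr) instead of just corr is used
    # in order to turn a maximization problem into a
    # minimization problem. Note that this maximizes the
    # absolute correlation, so the result may be negatively correlated.
    # Series.corr skips the NaN rows produced by the rolling warm-up.
    return -np.abs(df['B'].corr(B_friend))

w = [0.25, 0.25, 0.25, 0.25]  # initial guess
res = opt.minimize(fun, w)
print(res.x)     # optimized weights
print(-res.fun)  # absolute correlation reached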
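As a cross-check on the numerical minimizer above, here is a minimal sketch of a closed-form alternative. It relies on a standard result: among all linear combinations of a set of columns, the ordinary least-squares fit of B on those columns (with an intercept) has the highest correlation with B. The fixed seed, the names `windows`, `sma_cols`, `clean`, `w_ols` and `best_corr`, and the choice to drop the NaN warm-up rows are assumptions made for this example and are not part of the original question or answer.

```python
import numpy as np
import pandas as pd

# Same synthetic setup as above; the seed is an assumption added for reproducibility.
rng = np.random.default_rng(0)
dates = pd.date_range('20130101', periods=100)
df = pd.DataFrame(rng.standard_normal((100, 2)), index=dates, columns=list('AB'))

windows = [5, 10, 15, 20]
sma_cols = [f'sma_{n}' for n in windows]
for n, col in zip(windows, sma_cols):
    df[col] = df['A'].rolling(n).mean()

# Drop the warm-up rows where the longest moving average is still NaN.
clean = df.dropna(subset=sma_cols + ['B'])
X_sma = clean[sma_cols].to_numpy()
y = clean['B'].to_numpy()

# Least-squares fit of B on the averages plus an intercept column.
X = np.column_stack([X_sma, np.ones(len(clean))])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
w_ols = coef[:-1]  # slopes act as the weights; the intercept does not affect correlation

# Correlation between B and the weighted sum of the averages.
best_corr = np.corrcoef(y, X_sma @ w_ols)[0, 1]

print('weights:', w_ols)
print('correlation:', best_corr)
```

Correlation does not change when all weights are multiplied by the same positive constant, so the weights are only determined up to scale. If they should sum to 1, as in a typical Excel Solver setup, they can be rescaled afterwards provided their sum is positive. On purely random data like this sample, the best achievable correlation reflects only chance structure.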
Source: https://stackoverflow.com/questions/54686052/calculating-optimized-weights-to-maximize-correlation