I'm coming to Python from R and trying to reproduce a number of things that I'm used to doing in R using Python. The Matrix library for R has a very nifty function called nearPD() which finds the nearest positive (semi-)definite matrix.
This is perhaps a trivial extension of DomPazz's answer that handles both correlation and covariance matrices. It also terminates early, returning the input unchanged if it is already positive semi-definite, which helps when you are processing a large number of matrices.
import numpy as np

def near_psd(x, epsilon=0):
    '''
    Calculates the nearest positive semi-definite matrix for a correlation/covariance matrix

    Parameters
    ----------
    x : array_like
        Covariance/correlation matrix
    epsilon : float
        Eigenvalue limit (usually set to zero to ensure positive semi-definiteness)

    Returns
    -------
    near_cov : array_like
        Closest positive semi-definite covariance/correlation matrix

    Notes
    -----
    Document source
    http://www.quarchome.org/correlationmatrix.pdf
    '''
    # Early termination: the matrix is already positive (semi-)definite
    if min(np.linalg.eigvals(x)) > epsilon:
        return x

    # Remove the scaling factors of the covariance matrix, i.e. work on the correlation matrix
    n = x.shape[0]
    var_list = np.sqrt(np.diag(x))
    y = x / np.outer(var_list, var_list)

    # Nearest correlation matrix: clip the eigenvalues at epsilon and rescale the
    # rows of the eigenvector matrix so the reconstructed matrix has a unit diagonal
    eigval, eigvec = np.linalg.eig(y)
    val = np.maximum(eigval, epsilon)
    t = 1 / ((eigvec * eigvec) @ val)
    T = np.diag(np.sqrt(t))
    B = T @ eigvec @ np.diag(np.sqrt(val))
    near_corr = B @ B.T

    # Reapply the scaling factors to get back a covariance matrix
    near_cov = near_corr * np.outer(var_list, var_list)
    return near_cov
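
As a quick sanity check, here is a minimal usage sketch (the 3x3 matrix below is made-up example data, not something from the question): start from a symmetric matrix with a negative eigenvalue, repair it with near_psd, and confirm that the eigenvalues have been clipped.

import numpy as np

# Made-up symmetric matrix with one negative eigenvalue (so it is not a valid correlation matrix)
bad_corr = np.array([[1.0, 0.9, 0.7],
                     [0.9, 1.0, 0.3],
                     [0.7, 0.3, 1.0]])
print(np.linalg.eigvalsh(bad_corr))   # smallest eigenvalue is negative

fixed = near_psd(bad_corr)
print(np.linalg.eigvalsh(fixed))      # eigenvalues are now >= 0 (up to floating-point error)
print(np.round(fixed, 4))             # stays close to the original matrix, with a unit diagonal

Since the input is symmetric, np.linalg.eigh (and eigvalsh) could also be used inside near_psd instead of eig; it is intended for symmetric matrices and always returns real output.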