Covariance matrix from np.polyfit() has negative diagonal?

Submitted by 社会主义新天地 on 2019-12-12 09:27:23

Question


Problem: the cov=True option of np.polyfit() produces a covariance matrix whose diagonal contains nonsensical negative values.

UPDATE: after playing with this some more, I am really starting to suspect a bug in numpy. Is that possible? Deleting any pair of 13 values from the dataset will fix the problem.

I am using np.polyfit() to calculate the slope and intercept coefficients of a dataset. A plot of the values produces a nearly (but not perfectly) linear graph. I am attempting to get the standard deviation of these coefficients with np.sqrt(np.diag(cov)); however, this fails because the diagonal contains negative values.

It should be mathematically impossible to produce a covariance matrix with a negative diagonal. What is numpy doing wrong?

Here is a snippet that reproduces the problem:

import numpy as np

x = [1476728821.797, 1476728821.904, 1476728821.911, 1476728821.920, 1476728822.031, 1476728822.039,
     1476728822.047, 1476728822.153, 1476728822.162, 1476728822.171, 1476728822.280, 1476728822.289,
     1476728822.297, 1476728822.407, 1476728822.416, 1476728822.423, 1476728822.530, 1476728822.539,
     1476728822.547, 1476728822.657, 1476728822.666, 1476728822.674, 1476728822.759, 1476728822.788,
     1476728822.797, 1476728822.805, 1476728822.915, 1476728822.923, 1476728822.931, 1476728823.038,
     1476728823.047, 1476728823.054, 1476728823.165, 1476728823.175, 1476728823.182, 1476728823.292,
     1476728823.300, 1476728823.308, 1476728823.415, 1476728823.424, 1476728823.432, 1476728823.551,
     1476728823.559, 1476728823.567, 1476728823.678, 1476728823.689, 1476728823.697, 1476728823.808,
     1476728823.828, 1476728823.837, 1476728823.947, 1476728823.956, 1476728823.964, 1476728824.074,
     1476728824.083, 1476728824.091, 1476728824.201, 1476728824.209, 1476728824.217, 1476728824.324,
     1476728824.333, 1476728824.341, 1476728824.451, 1476728824.460, 1476728824.468, 1476728824.579,
     1476728824.590, 1476728824.598, 1476728824.721, 1476728824.730, 1476728824.788]

y = [6309927, 6310105, 6310116, 6310125, 6310299, 6310317, 6310326, 6310501, 6310513, 6310523, 6310688,
     6310703, 6310712, 6310875, 6310891, 6310900, 6311058, 6311069, 6311079, 6311243, 6311261, 6311272,
     6311414, 6311463, 6311479, 6311490, 6311665, 6311683, 6311692, 6311857, 6311867, 6311877, 6312037,
     6312054, 6312065, 6312230, 6312248, 6312257, 6312430, 6312442, 6312455, 6312646, 6312665, 6312675,
     6312860, 6312879, 6312894, 6313071, 6313103, 6313117, 6313287, 6313304, 6313315, 6313489, 6313505,
     6313518, 6313675, 6313692, 6313701, 6313875, 6313888, 6313898, 6314076, 6314093, 6314104, 6314285,
     6314306, 6314321, 6314526, 6314541, 6314638]

z, cov = np.polyfit(np.asarray(x), np.asarray(y), 1, cov=True)

std = np.sqrt(np.diag(cov))

print(z)
print(cov)
print(std)

Answer 1:


It looks like it's related to your x values: they have a total range of about 3, with an offset of about 1.5 billion.

In your code

np.asarray(x)

converts the x values into an ndarray of float64. While float64 is enough to represent the x values themselves correctly, it might not be enough to carry out the computations required to get the covariance matrix.

np.asarray(x, dtype=np.float128)

would solve the problem, but polyfit can't work with float128 :(

TypeError: array type float128 is unsupported in linalg
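To get a feel for why float64 runs out of precision here, the sketch below looks at the design matrix A of a degree-1 fit, with columns [x, 1]. The covariance of a least-squares fit is proportional to the inverse of A.T @ A, so the conditioning of A matters a lot. The x values are a synthetic stand-in with the same offset and range as in the question (an assumption, not the original data), and np.polyfit applies its own column scaling internally, so treat this as an illustration rather than a trace of what polyfit does.

```python
import numpy as np

# Synthetic stand-in for the question's x values (an assumption, not the
# original data): 71 points spanning about 3 units, offset by ~1.5 billion.
x = 1476728821.797 + np.linspace(0.0, 3.0, 71)

# float64 resolves x itself to well under a millisecond ...
print(np.spacing(x[0]))
# ... but x**2, which shows up when forming A.T @ A, only in steps of 256.
print(np.spacing(x[0] ** 2))

# Design matrices of a degree-1 fit, columns [x, 1].
A_raw = np.vander(x, 2)
A_centered = np.vander(x - x.mean(), 2)

cond_raw = np.linalg.cond(A_raw)
cond_centered = np.linalg.cond(A_centered)
print(cond_raw)        # huge: the two columns are nearly parallel
print(cond_centered)   # close to 1: the two columns are orthogonal
```

Forming A.T @ A roughly squares the condition number of A, which is why the raw x values push the inversion past what float64 can resolve while the centered ones do not.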

As a workaround, you can subtract the offset from x and then use polyfit. This produces a covariance matrix with a positive diagonal:

x1 = np.asarray(x) - np.mean(x)  # center x around zero
z1, cov1 = np.polyfit(x1, np.asarray(y), 1, cov=True)
std1 = np.sqrt(np.diag(cov1))

print(z1)    # prints: array([  1.56607841e+03,   6.31224162e+06])
print(cov1)  # prints: array([[  4.56066546e+00,  -2.90980285e-07],
             #                [ -2.90980285e-07,   3.36480951e+00]])
print(std1)  # prints: array([ 2.13557146,  1.83434171])

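Because x was only shifted, not scaled, the slope and its standard deviation carry over unchanged. The intercept, however, now refers to x = np.mean(x) rather than x = 0, so you'll have to convert it back accordingly.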
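A minimal sketch of that conversion, using synthetic data with the same offset and range as in the question (the data, the noise level and all variable names are assumptions made for illustration):

```python
import numpy as np

# Synthetic data (an assumption, not the question's dataset).
rng = np.random.default_rng(0)
x = 1476728821.797 + np.linspace(0.0, 3.0, 71)
true_slope = 1566.0
y = 6309927.0 + true_slope * (x - x[0]) + rng.normal(scale=10.0, size=x.size)

# Fit against centered x, as in the workaround above.
x_mean = x.mean()
z1, cov1 = np.polyfit(x - x_mean, y, 1, cov=True)
slope, intercept_at_mean = z1
std_slope, std_intercept_at_mean = np.sqrt(np.diag(cov1))

# Shifting x does not change the slope; only the intercept moves.
intercept_at_zero = intercept_at_mean - slope * x_mean

# Both forms should predict the same y values.
pred_centered = slope * (x - x_mean) + intercept_at_mean
pred_original = slope * x + intercept_at_zero
print(slope, std_slope)
print(intercept_at_zero)
print(np.max(np.abs(pred_centered - pred_original)))
```

Note that the uncertainty of the intercept at x = 0 is far larger than std_intercept_at_mean, since x = 0 lies about 1.5 billion units away from the data. For most purposes it is more useful to report the intercept and its standard deviation at the mean of x.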



Source: https://stackoverflow.com/questions/40095325/covariance-matrix-from-np-polyfit-has-negative-diagonal
