Why does my p-value equal 0 and my statistic equal 1 when I use the KS test in Python?

我的未来我决定 submitted on 2021-01-29 10:05:22

Question


Thanks in advance to anyone who takes a look.

My code is:

import numpy as np
from scipy.stats import kstest
data=[31001, 38502, 40842, 40852, 43007, 47228, 48320, 50500, 54545, 57437, 60126, 65556, 71215, 78460, 81299, 96851, 106472, 108398, 118495, 130832, 141678, 155703, 180689, 218032, 222238, 239553, 250895, 274025, 298231, 330228, 330910, 352058, 362993, 369690, 382487, 397270, 414179, 454013, 504993, 518475, 531767, 551032, 782483, 913658, 1432195, 1712510, 2726323, 2777535, 3996759, 13608152]
x=np.array(data)
test_sta=kstest(x, 'norm')
print(test_sta)

The result of kstest is KstestResult(statistic=1.0, pvalue=0.0). Is there something wrong with the code, or is the data simply not normal at all?


Answer 1:


I've not used this before, but I think you're testing whether your data is standard normal (i.e. mean=0, variance=1). Your values are nowhere near that range, which is why the test rejects so emphatically.
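To see where a statistic of exactly 1 comes from, here is a minimal sketch. The three-value `sample` is just a subset of the question's data picked for illustration. The KS statistic is the largest gap between the empirical CDF and the reference CDF, and the standard normal CDF already rounds to 1.0 for values this large:

```python
import numpy as np
from scipy.stats import kstest, norm

# A few values taken from the question's data, for illustration only.
sample = np.array([31001.0, 38502.0, 40842.0])

# The standard normal CDF rounds to exactly 1.0 for values this large.
cdf_at_min = norm.cdf(sample.min())
print(cdf_at_min)

# Just below the smallest observation the empirical CDF is still 0,
# so the largest gap between the two CDFs (the KS statistic) is 1.
result = kstest(sample, 'norm')
print(result.statistic, result.pvalue)
```

So the code runs as intended; it simply compares the data against a distribution on a completely different scale.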

Plotting a histogram shows it to be much closer to log-normal. I'd therefore do:

x = np.log(data)
x -= np.mean(x)
x /= np.std(x)
kstest(x, 'norm')

which gives me a test statistic of 0.095 and a p-value of 0.75, so we can't reject the hypothesis that the data is log-normal.
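The manual centering and scaling can also be written with the `args` parameter of `kstest`, which forwards extra arguments (here the location and scale) to the reference distribution. A minimal sketch using the question's data; the variable names `log_x`, `manual` and `via_args` are my own:

```python
import numpy as np
from scipy.stats import kstest

data = [31001, 38502, 40842, 40852, 43007, 47228, 48320, 50500, 54545,
        57437, 60126, 65556, 71215, 78460, 81299, 96851, 106472, 108398,
        118495, 130832, 141678, 155703, 180689, 218032, 222238, 239553,
        250895, 274025, 298231, 330228, 330910, 352058, 362993, 369690,
        382487, 397270, 414179, 454013, 504993, 518475, 531767, 551032,
        782483, 913658, 1432195, 1712510, 2726323, 2777535, 3996759,
        13608152]
log_x = np.log(data)

# Standardize by hand, then compare against N(0, 1).
z = (log_x - log_x.mean()) / log_x.std()
manual = kstest(z, 'norm')

# Equivalent: leave the values alone and pass loc and scale via `args`.
via_args = kstest(log_x, 'norm', args=(log_x.mean(), log_x.std()))

print(manual)
print(via_args)
```

Both calls should report the same statistic up to floating-point rounding, since they compare the same empirical CDF against the same fitted normal.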

A good way to check this sort of thing is to generate some random data (from a known distribution) and see what the test gives you back. For example:

kstest(np.random.normal(size=100), 'norm')

usually gives me large p-values (when the null hypothesis is true they are spread uniformly between 0 and 1), while:

kstest(np.random.normal(loc=13, size=100), 'norm')

gives me p-values near 0.
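A single run only gives one p-value, so it can help to repeat the experiment many times and look at the spread. The sketch below does that with a fixed seed; the trial count of 500 and the names `p_null` and `p_shifted` are arbitrary choices of mine:

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n_trials = 500

# Samples that really are standard normal: the null hypothesis holds.
p_null = np.array([kstest(rng.normal(size=100), 'norm').pvalue
                   for _ in range(n_trials)])

# Samples with mean 13: the null hypothesis is badly wrong.
p_shifted = np.array([kstest(rng.normal(loc=13, size=100), 'norm').pvalue
                      for _ in range(n_trials)])

# When the null holds, roughly 5% of p-values fall below 0.05 and the
# rest are spread over (0, 1) rather than clustering near 1.
print("null:    fraction below 0.05 =", np.mean(p_null < 0.05))
print("null:    median p-value      =", np.median(p_null))
print("shifted: largest p-value     =", p_shifted.max())
```

If the test is behaving, the first fraction should sit near 0.05, the median near 0.5, and every p-value for the shifted samples should be essentially zero.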

A log-normal distribution just means that the data is normally distributed after log transforming. If you really want to test against a normal distribution, you'd just not log transform the data, e.g.:

x = np.array(data, dtype=float)
x -= np.mean(x)
x /= np.std(x)
kstest(x, 'norm')

which gives me a p-value of 7e-7, indicating that we can reliably reject the hypothesis that it's normally distributed.
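One caveat worth keeping in mind: the p-value reported by `kstest` assumes the reference distribution was fixed in advance. When the mean and standard deviation are estimated from the same sample, as in the standardizing step, the reported p-values tend to come out too large. A rough way to account for this is to simulate the whole standardize-then-test procedure on genuinely normal samples. The sketch below does so; `standardized_ks` and `mc_pvalue` are hypothetical helpers written for this example, not SciPy functions, and the 1000 simulations are an arbitrary choice:

```python
import numpy as np
from scipy.stats import kstest

def standardized_ks(x):
    """KS statistic against N(0, 1) after centering and scaling x."""
    z = (x - x.mean()) / x.std()
    return kstest(z, 'norm').statistic

def mc_pvalue(x, n_sim=1000, seed=0):
    """Simulated p-value for the standardize-then-test procedure.

    Hypothetical helper: draws normal samples of the same size, applies
    the same standardization to each, and counts how often the simulated
    statistic is at least as large as the observed one.
    """
    rng = np.random.default_rng(seed)
    observed = standardized_ks(x)
    simulated = np.array([standardized_ks(rng.normal(size=len(x)))
                          for _ in range(n_sim)])
    # The +1 terms keep the estimate strictly above zero.
    return (np.sum(simulated >= observed) + 1) / (n_sim + 1)

data = [31001, 38502, 40842, 40852, 43007, 47228, 48320, 50500, 54545,
        57437, 60126, 65556, 71215, 78460, 81299, 96851, 106472, 108398,
        118495, 130832, 141678, 155703, 180689, 218032, 222238, 239553,
        250895, 274025, 298231, 330228, 330910, 352058, 362993, 369690,
        382487, 397270, 414179, 454013, 504993, 518475, 531767, 551032,
        782483, 913658, 1432195, 1712510, 2726323, 2777535, 3996759,
        13608152]
raw = np.array(data, dtype=float)
log_x = np.log(raw)

p_raw = mc_pvalue(raw)
p_log = mc_pvalue(log_x)
print("raw data, simulated p-value:", p_raw)
print("log data, simulated p-value:", p_log)
```

I would expect the conclusions to stay the same (normality rejected for the raw data, not rejected for the logged data), just with a less optimistic p-value for the log-normal fit.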



Source: https://stackoverflow.com/questions/59022661/why-did-my-p-value-equals-0-and-statistic-equals-1-when-i-use-ks-test-in-python
