Question
So I have lists of floats, like [1.33, 2.555, 3.2134, 4.123123] etc. Those lists are mean frequencies of something. How do I prove that two lists are different? I thought about calculating a p-value. Is there a function to do that? I looked through the scipy documentation, but couldn't figure out what to use.
Can anyone please advise?
Answer 1:
Let's say you have several lists of floats, stored in a dict like this:
>>> data = {
... 'a': [0.9, 1.0, 1.1, 1.2],
... 'b': [0.8, 0.9, 1.0, 1.1],
... 'c': [4.9, 5.0, 5.1, 5.2],
... }
Clearly, a is very similar to b, but both are different from c.
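As a quick sanity check before running any test, you can look at the mean and standard deviation of each list. This is only a minimal sketch using the standard library; the name `summary` is made up here for illustration.

```python
from statistics import mean, stdev

data = {
    'a': [0.9, 1.0, 1.1, 1.2],
    'b': [0.8, 0.9, 1.0, 1.1],
    'c': [4.9, 5.0, 5.1, 5.2],
}

# Map each list name to (sample mean, sample standard deviation).
summary = {name: (mean(values), stdev(values)) for name, values in data.items()}

for name, (m, s) in summary.items():
    print(f"{name}: mean={m:.3f}, sd={s:.3f}")
```

The means of a and b sit within one standard deviation of each other, while c is far away from both, which is what the tests below quantify.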
There are two kinds of comparisons you may want to do.
- Pairwise: Is a similar to b? Is a similar to c? Is b similar to c?
- Combined: Are a, b and c drawn from the same group? (This is generally a better question)
The former can be achieved using independent t-tests as follows:
>>> from itertools import combinations
>>> from scipy.stats import ttest_ind
>>> for list1, list2 in combinations(data.keys(), 2):
...     t, p = ttest_ind(data[list1], data[list2])
...     print(list1, list2, '%.3g' % p)
...
a b 0.315
a c 9.46e-09
b c 8.16e-09
This provides the relevant p-values. They imply that a and c are
different and b and c are different, but the test finds no significant difference between a and b (p is about 0.32).
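For the original question (just two lists), the same test can be wrapped in a small helper. This is a sketch, not a definitive recipe: `lists_differ` and the `alpha=0.05` threshold are assumptions made for illustration, and `equal_var=False` switches `ttest_ind` to Welch's t-test, which may be the safer choice if the two lists could have different variances.

```python
from scipy.stats import ttest_ind


def lists_differ(x, y, alpha=0.05, equal_var=True):
    """Run a two-sided independent t-test on x and y.

    Returns (p_value, significant), where significant is True
    when p_value falls below the chosen alpha (hypothetical default).
    """
    result = ttest_ind(x, y, equal_var=equal_var)
    p_value = float(result.pvalue)
    return p_value, p_value < alpha


a = [0.9, 1.0, 1.1, 1.2]
b = [0.8, 0.9, 1.0, 1.1]
c = [4.9, 5.0, 5.1, 5.2]

p_ab, sig_ab = lists_differ(a, b)
p_ac, sig_ac = lists_differ(a, c)
# Welch's variant: does not assume equal variances.
p_ac_welch, sig_ac_welch = lists_differ(a, c, equal_var=False)

print(p_ab, sig_ab)
print(p_ac, sig_ac)
print(p_ac_welch, sig_ac_welch)
```

Keep in mind that a large p-value does not prove two lists are the same; it only means the test could not tell them apart with the data given.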
The latter can be achieved using the one-way ANOVA as follows:
>>> from scipy.stats import f_oneway
>>> F, p = f_oneway(*data.values())
>>> p
7.959305946160327e-12
The p-value indicates that a, b, and c are unlikely to be from the same population.
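To see what the one-way ANOVA is doing, here is a rough sketch that computes the F statistic by hand from the between-group and within-group sums of squares and compares it with `f_oneway`. The variable names (`ss_between`, `ss_within`, and so on) are my own, and it assumes numpy is available alongside scipy.

```python
import numpy as np
from scipy.stats import f, f_oneway

groups = [
    np.array([0.9, 1.0, 1.1, 1.2]),  # a
    np.array([0.8, 0.9, 1.0, 1.1]),  # b
    np.array([4.9, 5.0, 5.1, 5.2]),  # c
]

k = len(groups)                        # number of groups
n_total = sum(len(g) for g in groups)  # total number of observations
grand_mean = np.concatenate(groups).mean()

# Variation of the group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of the observations around their own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F is the ratio of the two mean squares.
f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
# Upper tail of the F distribution with (k - 1, n_total - k) degrees of freedom.
p_value = f.sf(f_stat, k - 1, n_total - k)

f_ref, p_ref = f_oneway(*groups)
print(f_stat, p_value)
print(f_ref, p_ref)
```

Both lines should print essentially the same pair of numbers. The F statistic is large here because the group means are far apart compared with the spread inside each group.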
Source: https://stackoverflow.com/questions/29561360/how-to-calculate-p-value-for-two-lists-of-floats