How to use Python to separate two Gaussian curves?

最后都变了 - submitted on 2021-02-19 01:34:45

Question


I measured the fluorescence intensity of thousands of particles and made a histogram, which shows two adjacent Gaussian curves. How can I use Python or one of its packages to separate them into two Gaussian curves and make two new plots?

Thank you.


Answer 1:


Basically, you need to infer the parameters (means, standard deviations and weights) of your Gaussian mixture. I will generate a similar dataset for illustration.

Generating mixtures with known parameters

from itertools import starmap

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import mlab
sns.set(color_codes=True)
# inline plots in jupyter notebook
%matplotlib inline


# generate synthetic data from a mixture of two Gaussians with equal weights
# the solution below readily generalises to more components 
nsamples = 10000
means = [30, 120]
sds = [10, 50]
weights = [0.5, 0.5]
draws = np.random.multinomial(nsamples, weights)
samples = np.concatenate(
    list(starmap(np.random.normal, zip(means, sds, draws)))
)

Plot the distribution

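# distplot is deprecated in recent seaborn releases; histplot is its replacement
sns.histplot(samples, kde=True, stat="density")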

Infer parameters

from sklearn.mixture import GaussianMixture

mixture = GaussianMixture(n_components=2).fit(samples.reshape(-1, 1))
means_hat = mixture.means_.flatten()
weights_hat = mixture.weights_.flatten()
sds_hat = np.sqrt(mixture.covariances_).flatten()

print(mixture.converged_)
print(means_hat)
print(sds_hat)
print(weights_hat)

We get:

True
[ 122.57524745   29.97741112]
[ 48.18013893  10.44561398]
[ 0.48559771  0.51440229]

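You can tweak GaussianMixture's hyperparameters to improve the fit, but this looks good enough. Note that the order of the fitted components is arbitrary: here the first component is the wider Gaussian (mean near 120). Now we can plot each component (I'm only plotting the first one):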

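from scipy.stats import norm  # mlab.normpdf is gone from recent matplotlib releases
mu1_h, sd1_h = means_hat[0], sds_hat[0]
# evaluate the fitted density over +/- 3 standard deviations around the mean
x_axis = np.linspace(mu1_h-3*sd1_h, mu1_h+3*sd1_h, 1000)
plt.plot(x_axis, norm.pdf(x_axis, mu1_h, sd1_h))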
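
To draw both curves and actually split the measurements, the fitted mixture can be reused in two ways: each component's density, scaled by its weight, can be overlaid on the histogram, and each sample can be assigned to its most probable component. What follows is a minimal sketch rather than part of the original answer. It regenerates synthetic data with the same parameters, and the helper names `fit_two_gaussians` and `plot_components` are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from sklearn.mixture import GaussianMixture


def fit_two_gaussians(samples, random_state=0):
    """Fit a 2-component 1-D mixture; return the model and parameters sorted by mean."""
    model = GaussianMixture(n_components=2, random_state=random_state)
    model.fit(samples.reshape(-1, 1))
    # the component order is arbitrary, so sort by mean for a stable result
    order = np.argsort(model.means_.ravel())
    means = model.means_.ravel()[order]
    sds = np.sqrt(model.covariances_.ravel())[order]
    weights = model.weights_[order]
    return model, means, sds, weights


def plot_components(samples, means, sds, weights):
    """Draw the histogram and each weighted Gaussian component on one axis."""
    fig, ax = plt.subplots()
    ax.hist(samples, bins=100, density=True, alpha=0.3, color="gray")
    x = np.linspace(samples.min(), samples.max(), 1000)
    for mu, sd, w in zip(means, sds, weights):
        # scale each density by its mixture weight so the curves sum to the mixture
        ax.plot(x, w * norm.pdf(x, mu, sd), label=f"mean={mu:.1f}, sd={sd:.1f}")
    ax.legend()
    return fig


# synthetic data with the same parameters as in the answer (for illustration only)
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(30, 10, 5000), rng.normal(120, 50, 5000)])

model, means, sds, weights = fit_two_gaussians(samples)
fig = plot_components(samples, means, sds, weights)

# hard assignment: label each sample with its most probable component
labels = model.predict(samples.reshape(-1, 1))
group_a, group_b = samples[labels == 0], samples[labels == 1]
```

The two groups can then be plotted separately, for example with `plt.hist(group_a)` and `plt.hist(group_b)`. Where the curves overlap, a hard assignment is only a best guess; `model.predict_proba` returns the per-component probabilities if a soft assignment is preferable.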

P.S.

As a side note, it seems like you are dealing with constrained data, and your observations are pretty close to the left constraint (zero). While Gaussians might approximate your data well enough, you should tread carefully, because a Gaussian has unbounded support and therefore assigns some probability to values below zero.
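
One possible way to respect the lower bound is to fit the mixture to the logarithm of the intensities, which corresponds to a mixture of log-normal distributions. This is a different model from the one used above, shown only as a sketch: the `intensities` array is hypothetical synthetic data, and whether a log-normal shape suits real fluorescence measurements is an assumption that should be checked against the histogram.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# hypothetical strictly positive intensities drawn from two log-normal populations
intensities = np.concatenate([
    rng.lognormal(mean=np.log(30), sigma=0.3, size=5000),
    rng.lognormal(mean=np.log(120), sigma=0.4, size=5000),
])

# fit an ordinary Gaussian mixture on the log scale
log_x = np.log(intensities).reshape(-1, 1)
mixture = GaussianMixture(n_components=2, random_state=0).fit(log_x)

# sort components by mean, since the fitted order is arbitrary
order = np.argsort(mixture.means_.ravel())
log_means = mixture.means_.ravel()[order]
log_sds = np.sqrt(mixture.covariances_.ravel())[order]

# exp(mean of log) is the median of each component on the original scale
medians = np.exp(log_means)
```

Because the fit happens on the log scale, every component places zero probability on negative intensities.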



Source: https://stackoverflow.com/questions/51318981/how-to-use-python-to-separate-two-gaussian-curves
