Generate random numbers with a given (numerical) distribution

我寻月下人不归 2020-11-22 11:18

I have a file with some probabilities for different values e.g.:

1 0.1
2 0.05
3 0.05
4 0.2
5 0.4
6 0.2

I would like to generate random numb

13 answers
  •  青春惊慌失措
    2020-11-22 11:54

    I wrote a solution for drawing random samples from a custom continuous distribution.

    I needed this for a similar use-case to yours (i.e. generating random dates with a given probability distribution).

    You just need the function random_custDist and the line samples=random_custDist(x0,x1,custDist=custDist,size=1000). The rest is decoration ^^.

    import numpy as np
    
    #function
    def random_custDist(x0,x1,custDist,size=None, nControl=10**6):
        #generate a list of `size` random samples obeying the distribution custDist
        #proposes random samples between x0 and x1 and accepts each proposal with probability custDist(x)
        #custDist does not need to be normalized, but its values must lie in [0,1]
        #Best performance for max_{x in [x0,x1]} custDist(x) = 1
        #nControl caps the number of proposals, so fewer than `size` samples may be returned
        samples=[]
        nLoop=0
        while len(samples)<size and nLoop<nControl:
            x=np.random.uniform(low=x0,high=x1)
            prop=custDist(x)
            assert prop>=0 and prop<=1
            if np.random.uniform(low=0,high=1) <=prop:
                samples += [x]
            nLoop+=1
        return samples
    
    #call
    x0=2007
    x1=2019
    def custDist(x):
        if x<2010:
            return .3
        else:
            return (np.exp(x-2008)-1)/(np.exp(2019-2007)-1)
    samples=random_custDist(x0,x1,custDist=custDist,size=1000)
    print(samples)
    
    #plot
    import matplotlib.pyplot as plt
    #hist
    bins=np.linspace(x0,x1,int(x1-x0+1))
    hist=np.histogram(samples, bins )[0]
    hist=hist/np.sum(hist)
    plt.bar( (bins[:-1]+bins[1:])/2, hist, width=.96, label='sample distribution')
    #dist
    grid=np.linspace(x0,x1,100)
    discCustDist=np.array([custDist(x) for x in grid]) #discrete version
    discCustDist*=1/(grid[1]-grid[0])/np.sum(discCustDist)
    plt.plot(grid,discCustDist,label='custom distribution (custDist)', color='C1', linewidth=4)
    #decoration
    plt.legend(loc=3,bbox_to_anchor=(1,0))
    plt.show()
    

    The performance of this solution can certainly be improved, but I prefer readability.
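
If speed matters more than readability, one option is to propose and accept samples in batches with NumPy instead of one at a time. The following is only a sketch of that idea, not part of the answer's code: the names sample_by_rejection, cust_dist, batch and max_batches are made up for illustration, and it assumes the density function accepts a NumPy array and returns values in [0, 1].

```python
import numpy as np

def sample_by_rejection(x0, x1, density, size, batch=10_000, max_batches=1_000, rng=None):
    """Draw `size` samples from [x0, x1] by rejection sampling, one batch at a time.

    `density` (hypothetical name) must accept a NumPy array and return
    acceptance probabilities in [0, 1]; it does not need to be normalized.
    """
    rng = np.random.default_rng() if rng is None else rng
    accepted = []
    n_accepted = 0
    for _ in range(max_batches):
        x = rng.uniform(x0, x1, size=batch)           # proposals
        p = density(x)                                # acceptance probabilities
        if np.any((p < 0) | (p > 1)):
            raise ValueError("density must return values in [0, 1]")
        keep = x[rng.uniform(0.0, 1.0, size=batch) <= p]
        accepted.append(keep)
        n_accepted += keep.size
        if n_accepted >= size:
            break
    else:
        # Loop ended without collecting enough samples.
        raise RuntimeError("too few samples accepted; density may be too small")
    return np.concatenate(accepted)[:size]

def cust_dist(x):
    # Array-friendly version of the piecewise density used in the answer.
    x = np.asarray(x, dtype=float)
    tail = (np.exp(x - 2008) - 1) / (np.exp(2019 - 2007) - 1)
    return np.where(x < 2010, 0.3, tail)

samples = sample_by_rejection(2007, 2019, cust_dist, size=1000,
                              rng=np.random.default_rng(0))
print(samples[:5])
```

The accepted samples follow the same distribution as in the loop version. The batch size only trades memory for fewer Python-level iterations.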
