Question
First time posting here, so if I make a mistake with something let me know and I'd be more than happy to fix it!
Given N events, each of which has an individual probability (from 0 to 100%) of occurring, I'd like to determine the probability of 0 to N of those events occurring together.
For example, if I have events 1, 2, 3, ..., N (E1, E2, E3, ..., EN), where the individual probability of a specific event occurring is as follows:
- E1 = 30% probability of occurring
- E2 = 40% probability of occurring
- E3 = 50% probability of occurring
- ...
- EN = x% probability of occurring
I'd like to know the probability of having:
- none of these events occurring
- any 1 of these events occurring
- any 2 of these events occurring
- any 3 of these events occurring
- ...
- all N of these events occurring
I understand that the probability of 0 events occurring is (1-E1)(1-E2)...(1-EN) and that the probability of all N events occurring is E1*E2*...*EN. However, I do not know how to calculate the other possibilities (1 to N-1 events occurring).
I have been looking for some recursive algorithm (binomial compound distribution) that could solve this but I have not found any explicit formula that does this. Wondering if any of you guys could help!
Thanks in advance!
EDIT: The events are indeed independent.
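For small N, the distribution being asked for can be checked by brute force: enumerate every combination of events occurring or not, multiply the matching probabilities, and add the result to the bucket for that count. The Python sketch below assumes independence as stated in the edit; the function name `count_distribution_bruteforce` is made up for illustration. It visits all 2^N outcomes, so it is only a sanity check, not a practical solution.

```python
from itertools import product

def count_distribution_bruteforce(probs):
    """Return a list d where d[k] is the probability that exactly k events occur.

    Assumes the events are independent. Enumerates all 2**N outcomes,
    so it is only practical for small N.
    """
    n = len(probs)
    dist = [0.0] * (n + 1)
    for outcome in product([0, 1], repeat=n):
        # Probability of this exact pattern of occurred / not occurred.
        weight = 1.0
        for occurred, p in zip(outcome, probs):
            weight *= p if occurred else (1.0 - p)
        # sum(outcome) is the number of events that occurred in this pattern.
        dist[sum(outcome)] += weight
    return dist

dist = count_distribution_bruteforce([0.3, 0.4, 0.5])
```

With the three example probabilities from the question, this gives roughly 0.21, 0.44, 0.29 and 0.06 for 0, 1, 2 and 3 events occurring.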
Answer 1:
Sounds like the Poisson binomial distribution (see the Wikipedia article).
There's an explicit recursive formula, but beware of numerical stability.
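One explicit recursion is the one listed in the Wikipedia article on the Poisson binomial distribution; that this is the formula meant here is an assumption. It reads Pr(K=0) = (1-p_1)(1-p_2)...(1-p_N), and for k > 0, Pr(K=k) = (1/k) * sum over i=1..k of (-1)^(i-1) * Pr(K=k-i) * T(i), with T(i) = sum over j of (p_j/(1-p_j))^i. A minimal Python sketch follows; the function name is hypothetical.

```python
def poisson_binomial_recursive(probs):
    """Return a list pr where pr[k] is the probability of exactly k events.

    Uses the recursion
        Pr(K=0) = prod_j (1 - p_j)
        Pr(K=k) = (1/k) * sum_{i=1..k} (-1)**(i-1) * Pr(K=k-i) * T(i)
        T(i)    = sum_j (p_j / (1 - p_j))**i
    Assumes independent events and every p_j < 1 (the odds p/(1-p) must exist).
    """
    if any(p >= 1.0 for p in probs):
        raise ValueError("the recursion divides by (1 - p), so every p must be < 1")
    n = len(probs)
    odds = [p / (1.0 - p) for p in probs]
    # T[i] for i = 1..n; index 0 is unused padding.
    T = [0.0] + [sum(o ** i for o in odds) for i in range(1, n + 1)]
    pr = [0.0] * (n + 1)
    pr[0] = 1.0
    for p in probs:
        pr[0] *= (1.0 - p)
    for k in range(1, n + 1):
        total = 0.0
        sign = 1.0  # (-1)**(i-1), starting at +1 for i = 1
        for i in range(1, k + 1):
            total += sign * pr[k - i] * T[i]
            sign = -sign
        pr[k] = total / k
    return pr
```

The stability warning shows up directly in this form: the terms alternate in sign, so they can cancel badly for large N, and the odds p/(1-p) blow up as any probability approaches 1.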
Answer 2:
Something like the following recursive program should work.
    function p = probability_vector(probabilities)
        n = numel(probabilities);
        if n == 0
            % No events can happen.
            p = 1;
        elseif n == 1
            % 0 or 1 events can happen.
            p = [1 - probabilities(1), probabilities(1)];
        else
            % Split the events in half, solve each half recursively,
            % and combine the two count distributions by convolution.
            half = ceil(n/2);
            p_half1 = probability_vector(probabilities(1:half));
            p_half2 = probability_vector(probabilities(half+1:end));
            p = conv(p_half1, p_half2);
        end
    end
If p is the vector returned by probability_vector, then p(i+1) is the probability of exactly i of the events happening.
See http://matlabtricks.com/post-3/the-basics-of-convolution for an explanation of the magic conv operator that does the meat of the work.
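To make the convolution step less magic, here is the same divide-and-conquer idea as a Python sketch with the convolution written out as explicit loops. The helper `convolve` is a hand-written stand-in for MATLAB's conv, not a library call. If a[i] is the probability of i events in one group and b[j] the probability of j events in another independent group, then entry i+j of the result collects every way of getting i+j events in total.

```python
def convolve(a, b):
    """Full discrete convolution of two non-empty sequences (pure Python)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            # i events from the first group and j from the second give i + j in total.
            out[i + j] += x * y
    return out

def probability_vector(probs):
    """Return p where p[k] is the probability of exactly k events (0-based)."""
    if len(probs) == 0:
        return [1.0]                        # no events can happen
    if len(probs) == 1:
        return [1.0 - probs[0], probs[0]]   # 0 or 1 events can happen
    half = (len(probs) + 1) // 2            # same split as ceil(n/2)
    return convolve(probability_vector(probs[:half]),
                    probability_vector(probs[half:]))

two_events = convolve([0.7, 0.3], [0.6, 0.4])   # events with p = 0.3 and p = 0.4
three_events = probability_vector([0.3, 0.4, 0.5])
```

Here `two_events` comes out as roughly [0.42, 0.46, 0.12]: the middle entry is 0.7*0.4 + 0.3*0.6. Python lists are 0-based, so entry k is the probability of k events, whereas the 1-based version above uses index k+1.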
Answer 3:
You need to compute your own version of Pascal's triangle, with probabilities (instead of counts) in each location. Row 0 will be the single figure 1.00; row 1 consists of two values, P(E1) and 1-P(E1). Below that, in row k, each position is P(Ek)[above-right entry] + (1-P(Ek))[above-left entry]. I recommend a lower-triangular matrix for this, something like:
1.00
0.30 0.70
0.12 0.46 0.42 # These are 0.3*0.4 | 0.3*0.6 + 0.7*0.4 | 0.7*0.6
0.06 0.29 0.44 0.21 # 0.12*0.5 | 0.12*0.5 + 0.46*0.5 | ...
See how it works? In array / matrix notation for a matrix M, given event probabilities in vector P, this looks something like
M[k, i] = P[k] * M[k-1, i] +
          (1-P[k]) * M[k-1, i-1]
The above is a nice recursive definition. Note that my earlier "above-right" reference in the lower-matrix notation is simply a row above; above-left is exactly that: row k-1, column i-1.
When you're done, the bottom row of the matrix will hold the probabilities of getting N, N-1, N-2, ..., 0 of the events. If you want these probabilities in the opposite order, simply swap the coefficients P[k] and 1-P[k].
Does that get you moving toward a solution?
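The triangle can be built row by row in a few lines. The Python sketch below follows the layout of the worked example (column 0 holds "all events so far occur"); the function name `probability_triangle` is hypothetical, and entries that fall outside the previous row are treated as 0.

```python
def probability_triangle(P):
    """Build the rows of the probability triangle for event probabilities P.

    rows[k][i] is the probability that exactly i of the first k events
    do NOT occur (so k - i of them do), matching the worked example.
    Assumes the events are independent.
    """
    rows = [[1.0]]
    for p in P:
        prev = rows[-1]
        row = []
        for i in range(len(prev) + 1):
            above = prev[i] if i < len(prev) else 0.0    # M[k-1, i]
            above_left = prev[i - 1] if i > 0 else 0.0   # M[k-1, i-1]
            # Event k occurs (column unchanged) or does not occur (column shifts right).
            row.append(p * above + (1.0 - p) * above_left)
        rows.append(row)
    return rows

rows = probability_triangle([0.3, 0.4, 0.5])
bottom = rows[-1]              # probabilities of 3, 2, 1, 0 events
ascending = bottom[::-1]       # probabilities of 0, 1, 2, 3 events
```

Only the previous row is needed to compute the next one, so if the intermediate rows are not of interest, a single array of length N+1 is enough.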
Answer 4:
After tons of research and some help from the answers here, I've come up with the following code:
    function [ prob_numSites ] = probability_activationSite( prob_distribution_site )
        N = length(prob_distribution_site); % number of events
        notProb = 1 - prob_distribution_site; % probability of each event not occurring
        syms x; % create symbolic variable (requires the Symbolic Math Toolbox)
        prob_number = 1; % running product, initialized to 1
        for i = 1:N
            prob_number = prob_number*(prob_distribution_site(i)*x + notProb(i));
        end
        prob_number_polynomial = expand(prob_number); % expand the product into a polynomial in x
        % With the 'All' option, coeffs keeps zero coefficients and orders them from the
        % highest power of x to the lowest (first = N events occurring, last = 0 events).
        % Without it, the order is lowest to highest and zero coefficients are dropped.
        revProb_numSites = coeffs(prob_number_polynomial, x, 'All');
        prob_numSites = double(fliplr(revProb_numSites)); % reverse to 0..N order, convert to numeric
    end
This takes a vector of the individual event probabilities and returns an array of the probabilities of 0 to N events occurring.
(This answer helped a lot).
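The same polynomial idea also works without symbolic math: keep the coefficients in a plain list and multiply in one factor (1 - p) + p*x at a time. The Python sketch below is one assumed way to do it, with made-up function names. Coefficients are stored in ascending order, so entry k is the coefficient of x^k, that is, the probability of exactly k events.

```python
def multiply_by_linear(coeffs, p):
    """Multiply a polynomial (ascending coefficients) by ((1 - p) + p*x)."""
    out = [0.0] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        out[k] += c * (1.0 - p)   # event does not occur: power of x unchanged
        out[k + 1] += c * p       # event occurs: power of x goes up by one
    return out

def count_distribution_poly(probs):
    """Return d where d[k] is the probability of exactly k independent events."""
    coeffs = [1.0]                # the constant polynomial 1 (no events yet)
    for p in probs:
        coeffs = multiply_by_linear(coeffs, p)
    return coeffs

dist = count_distribution_poly([0.3, 0.4, 0.5])
```

Because the list always has N+1 entries, probabilities of exactly 0 or 1 do not shorten the result, and the work is on the order of N^2 multiplications.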
Source: https://stackoverflow.com/questions/37261642/how-to-determine-probability-of-0-to-n-events-occurring-given-probability-of-eac