How to determine probability of 0 to N events occurring given probability of each of those N events?

Question

First time posting here, so if I make a mistake with something let me know and I'd be more than happy to fix it!

Given N events, each of which has an individual probability (from 0 to 100%) of occurring, I'd like to determine the probability of 0 to N of those events occurring together.

For example, if I have events 1, 2, 3, ..., N (E1, E2, E3, ..., EN) where the individual probability of a specific event occurring is as follows:

E1 = 30% probability of occurring
E2 = 40% probability of occurring
E3 = 50% probability of occurring
...
EN = x% probability of occurring

I'd like to know the probability of having:

none of these events occurring
any 1 of these events occurring
any 2 of these events occurring
any 3 of these events occurring
...
all N of these events occurring

I understand that the probability of 0 events occurring is (1-E1)(1-E2)...(1-EN) and that the probability of all N events occurring is E1*E2*...*EN. However, I do not know how to calculate the other possibilities (1 to N-1 events occurring).

I have been looking for some recursive algorithm (binomial compound distribution) that could solve this but I have not found any explicit formula that does this. Wondering if any of you guys could help!

Thanks in advance!

EDIT: The events are indeed independent.

Answer 1:

Sounds like the Poisson binomial distribution (see the Wikipedia article on it).

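There's an explicit recursive formula, but beware of numerical stability. With K the number of events that occur and p_j the probability of event j:

$$
\Pr(K = k) =
\begin{cases}
\prod_{j=1}^{N} (1 - p_j), & k = 0 \\
\frac{1}{k} \sum_{i=1}^{k} (-1)^{i-1} \, \Pr(K = k - i) \, T(i), & k > 0
\end{cases}
$$

where

$$
T(i) = \sum_{j=1}^{N} \left( \frac{p_j}{1 - p_j} \right)^{i}
$$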
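A minimal Python sketch of this recursion, assuming the form given in the Wikipedia article on the Poisson binomial distribution (the formula is restated in the docstring). The function name `poisson_binomial_recursive` is made up for illustration, and the sketch assumes every probability is strictly below 1, because the recursion divides by 1 - p_j.

```python
from math import prod  # requires Python 3.8+


def poisson_binomial_recursive(p):
    """Return [Pr(K=0), ..., Pr(K=n)] for independent events with probabilities p.

    Pr(K=0) = prod_j (1 - p_j)
    Pr(K=k) = (1/k) * sum_{i=1..k} (-1)**(i-1) * Pr(K=k-i) * T(i),  for k > 0
    T(i)    = sum_j (p_j / (1 - p_j))**i
    """
    if any(q >= 1 for q in p):
        raise ValueError("this recursion needs every probability to be < 1")
    n = len(p)
    odds = [q / (1 - q) for q in p]
    # T[i] for i = 1..n; index 0 is unused padding so indices match the formula.
    T = [0.0] + [sum(o ** i for o in odds) for i in range(1, n + 1)]
    pmf = [prod(1 - q for q in p)]
    for k in range(1, n + 1):
        # Alternating sum: this is where precision can be lost for large n.
        s = sum((-1) ** (i - 1) * pmf[k - i] * T[i] for i in range(1, k + 1))
        pmf.append(s / k)
    return pmf


print(poisson_binomial_recursive([0.3, 0.4, 0.5]))
```

For the 30%, 40%, 50% example from the question, this should give roughly 0.21, 0.44, 0.29, 0.06 for 0, 1, 2, and 3 events. Because the sum alternates in sign, rounding errors can grow with N, which is the stability concern mentioned above.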

Answer 2:

Something like the following recursive program should work.

function result = probability_vector(probabilities)
    n = length(probabilities);
    if n == 0
        % No events can happen.
        result = 1;
    elseif n == 1
        % 0 or 1 events can happen.
        result = [1 - probabilities(1), probabilities(1)];
    else
        % Split the events in half, solve each half, then combine.
        half = ceil(n/2);
        result_half1 = probability_vector(probabilities(1:half));
        result_half2 = probability_vector(probabilities(half+1:end));
        % Convolution adds up all ways to split i events between the halves.
        result = conv(result_half1, result_half2);
    end
end

If p is the vector returned by this function, then p(i+1) is the probability of exactly i of the events happening (MATLAB indices start at 1).

See http://matlabtricks.com/post-3/the-basics-of-convolution for an explanation of the magic conv operator that does the meat of the work.
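The same divide-and-conquer idea can be sketched in plain Python. This is an illustration under the assumption that the events are independent; the helper `convolve` is written out by hand here so the sketch has no dependencies, and it is meant to produce the same result as MATLAB's `conv` for two vectors.

```python
def convolve(a, b):
    """Discrete convolution: out[k] = sum of a[i] * b[j] over all i + j == k."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def probability_vector(probabilities):
    """Return a list whose entry i is the probability of exactly i events."""
    n = len(probabilities)
    if n == 0:
        return [1.0]  # no events: "0 events occur" with certainty
    if n == 1:
        return [1.0 - probabilities[0], probabilities[0]]
    half = n // 2
    left = probability_vector(probabilities[:half])
    right = probability_vector(probabilities[half:])
    # i events overall = j events in the left half and i - j in the right half.
    return convolve(left, right)


print(probability_vector([0.3, 0.4, 0.5]))
```

Unlike the MATLAB version, the Python list is indexed from 0, so entry i (not i+1) holds the probability of exactly i events.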

Answer 3:

You need to compute your own version of Pascal's triangle, with probabilities (instead of counts) in each location. Row 0 will be the single figure 1.00; row 1 consists of two values, P(E1) and 1-P(E1). Below that, in row k, each position is P(Ek) * [above-right entry] + (1-P(Ek)) * [above-left entry]. I recommend a lower-triangular matrix for this, something like:

1.00
0.30 0.70
0.12 0.46 0.42  # These are 0.3*0.4 | 0.3*0.6 + 0.7*0.4 | 0.7*0.6
0.06 0.29 0.44 0.21  # 0.12*0.5 | 0.12*0.5 + 0.46*0.5 | ...

See how it works? In array / matrix notation for a matrix M, given event probabilities in vector P, this looks something like

M[k, i] = P[k] * M[k-1, i] +
          (1-P[k]) * M[k-1, i-1]

The above is a nice recursive definition. Note that my earlier "above-right" reference is, in the lower-triangular layout, simply the entry directly above (row k-1, column i); above-left is exactly that: row k-1, column i-1. Entries that fall outside the triangle count as 0.

When you're done, the bottom row of the matrix will be the probabilities of getting N, N-1, N-2, ..., 0 of the events. If you want these probabilities in the opposite order, then simply switch the coefficients P[k] and 1-P[k].

Does that get you moving toward a solution?
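The triangle described above can be sketched in Python as follows. This is a minimal illustration, assuming independent events; the function name `probability_triangle` is made up here, and entries outside the triangle are treated as 0.

```python
def probability_triangle(P):
    """Build the probability triangle row by row.

    rows[k][i] is the probability that exactly k - i of the first k events
    occur, so each row runs from "all k events" down to "no events".
    """
    rows = [[1.0]]  # row 0: with no events, "0 events occur" is certain
    for p in P:
        prev = rows[-1]
        row = []
        for i in range(len(prev) + 1):
            above = prev[i] if i < len(prev) else 0.0    # row k-1, column i
            above_left = prev[i - 1] if i > 0 else 0.0   # row k-1, column i-1
            row.append(p * above + (1 - p) * above_left)
        rows.append(row)
    return rows


for row in probability_triangle([0.3, 0.4, 0.5]):
    print(" ".join(f"{v:.2f}" for v in row))
```

With the probabilities 0.3, 0.4, 0.5 this should print the same four rows as the table above. Only the previous row is needed at each step, so keeping a single row instead of the whole triangle would also work when just the final distribution is wanted.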

Answer 4:

After tons of research and some help from the answers here, I've come up with the following code:

function [ prob_numSites ] = probability_activationSite( prob_distribution_site )

N = length(prob_distribution_site); % number of events
notProb = 1 - prob_distribution_site; % find probability of no occurrence
syms x; % create symbolic variable
prob_number = 1; % initializing prob_number to 1

for i = 1:N
    prob_number = prob_number*(prob_distribution_site(i)*x + notProb(i));
end

prob_number_polynomial = expand(prob_number); % expands the function into a polynomial
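% With the 'All' option, coeffs returns every coefficient (zeros included),
% ordered from the highest power of x to the lowest, i.e. the first
% coefficient is N events occurring and the last is 0 events occurring.
% Without 'All', coeffs orders from lowest to highest and drops zeros.
revProb_numSites = coeffs(prob_number_polynomial, x, 'All');
prob_numSites = fliplr(revProb_numSites); % reverses order so index 1 is 0 events, index N+1 is N events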

This takes a vector of the individual event probabilities and returns an array of the probabilities of 0 to N events occurring.
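Why the polynomial coefficients are the wanted probabilities can be written out as a generating function. The following is a short worked derivation, with K denoting the number of events that occur (notation introduced here, not in the code) and the 30%, 40%, 50% probabilities from the question as the example:

$$
G(x) = \prod_{i=1}^{N} \bigl( p_i x + (1 - p_i) \bigr) = \sum_{k=0}^{N} \Pr(K = k) \, x^{k}
$$

$$
(0.3x + 0.7)(0.4x + 0.6)(0.5x + 0.5) = 0.06x^{3} + 0.29x^{2} + 0.44x + 0.21
$$

Each factor contributes a power of x when its event occurs and a constant when it does not, so the coefficient of x^k collects every combination in which exactly k events occur. In the example, the probabilities of 3, 2, 1, and 0 events are 0.06, 0.29, 0.44, and 0.21.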

(This answer helped a lot).

Source: https://stackoverflow.com/questions/37261642/how-to-determine-probability-of-0-to-n-events-occurring-given-probability-of-eac

Tags

algorithm

matlab

recursion

probability