Efficient Matlab implementation of Multinomial Coefficient

问题

I want to calculate the multinomial coefficient:

where it is satisifed n=n0+n1+n2

The Matlab implementation of this operator can be easily done in the function:

function N = nchooseks(k1,k2,k3)
    N = factorial(k1+k2+k3)/(factorial(k1)*factorial(k2)*factorial(k3)); 
end

However, when the index is larger than 170, the factorial would be infinite which would generate NaN in some cases, e.g. 180!/(175! 3! 2!) -> Inf/Inf-> NaN.

In other posts, they have solved this overflow issue for C and Python.

In the case of C: "you can make lists out of all the factorials, then find the prime factorization of all the numbers in the lists, then cancel all the numbers on the top with those on the bottom, until the numbers are completely reduced".
In the case of Python: "make use of the fact that factorial(n) = gamma(n+1), use the logarithm of the gamma function and use additions instead of multiplications, subtractions instead of divisions".

The first solution seems extremely slow, so I have tried the second option:

function N = nchooseks(k1,k2,k3)
    N = 10^(log_gamma(k1+k2+k3)-(log_gamma(k1)+log_gamma(k2)+log_gamma(k3))); 
end
function y = log_gamma(x),  y = log10(gamma(x+1));  end

I compare the original and log_gamma implementation with the following code:

% Calculate
N=100; err = zeros(N,N,N);
for n1=1:N,
    for n2=1:N,
        for n3=1:N,
            N1 = factorial(n1+n2+n3)/(factorial(n1)*factorial(n2)*factorial(n3)); 
            N2 = 10^(log10(gamma(n1+n2+n3+1))-(log10(gamma(n1+1))+log10(gamma(n2+1))+log10(gamma(n3+1)))); 
            err(n1,n2,n3) = abs(N1-N2); 
        end
    end
end
% Plot histogram of errors
err_ = err(~isnan(err));
[nelements,centers] = hist(err_(:),1e2);
figure; bar(centers,nelements./numel(err_(:)));

However, the results are slightly different for some cases, as presented in the following histogram.

Thus, should I assume that my implementation is correct or the numerical error does not justify the number divergence?

回答1:

Why not use this? It's fast and doesn't suffer from overflow:

N = prod([1:n]./[1:n0 1:n1 1:n2]);

回答2:

Sorry to resurrect an old post, but for future searchers, you should almost certainly just write your multinomial coefficient as a product of binomial coefficients and use a built-in method to compute binomial coefficients (or write your own, either using Pascal's triangle or another method). The relevant formula appears in the first paragraph of the Wikipedia section on multinomial coefficients. (I'd write it here, but there doesn't seem to be a way to render LaTeX.)

Another benefit of this approach is that it's as good as you can possibly get about overflow since the factors are all integers. There's no intrinsic need to divide when computing multinomial coefficients.

回答3:

using the tip provided by @jemidiah,

and here is the code

function c = multicoeff (k), 
    c = 1; 
    for i=1:length(k), 
      c = c* bincoeff(sum(k(1:i)),k(i)); 
    end; 
end

and some usage examples:

octave:88> multicoeff([2 2 2])
ans =  90
octave:89> factorial(6)/(factorial(2)*factorial(2)*factorial(2))
ans =  90
octave:90> multicoeff([5 4 3])
ans =  27720
octave:91> factorial(12)/(factorial(5)*factorial(4)*factorial(3))
ans =  27720

来源：https://stackoverflow.com/questions/22301115/efficient-matlab-implementation-of-multinomial-coefficient

标签

matlab

overflow

factorial

multinomial