probability

Computationally simple pseudo-Gaussian distribution with varying mean and standard deviation?

情到浓时终转凉″ submitted on 2019-12-06 05:49:17
This picture from Wikipedia has a nice example of the sort of functions I'd ideally like to generate: Right now I'm using the Irwin-Hall distribution, which is more or less a polynomial approximation of the Gaussian distribution...basically, you use a uniform random number generator, iterate it x times, and take the average. The more iterations, the more it looks like a Gaussian distribution. It's pretty nice; however, I'd like to be able to have one where I can vary the mean. For example, let's say I wanted a number in the range 0 to 10, but around 7. Like, the mean (if I repeated this
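A minimal sketch in Python of one way to get an adjustable mean (the function name, the clamp-to-range step, and the default iteration count are illustrative choices, not from the question): average several uniforms, shift the average onto the target mean, and clamp to the target range.

```python
import random

def pseudo_gaussian(mean, lo, hi, iterations=12):
    """Irwin-Hall-style pseudo-Gaussian: average several uniforms, then shift
    the result so it is centred on `mean` and clamp it to [lo, hi].
    Illustrative sketch only; the clamp slightly distorts the tails."""
    # Average of `iterations` uniforms on [0, 1): approximately N(0.5, 1/(12*iterations))
    avg = sum(random.random() for _ in range(iterations)) / iterations
    # Centre on the requested mean, scaled to the width of the target range
    value = mean + (avg - 0.5) * (hi - lo)
    return max(lo, min(hi, value))

# Example: values between 0 and 10, clustered around 7
samples = [pseudo_gaussian(7, 0, 10) for _ in range(10000)]
print(sum(samples) / len(samples))  # close to 7
```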

Probability heatmap in ggplot

亡梦爱人 submitted on 2019-12-06 05:19:22
Question: I asked this question a year ago and got code for this "probability heatmap":

```r
numbet <- 32
numtri <- 1e5
prob=5/6

# Fill a matrix
xcum <- matrix(NA, nrow=numtri, ncol=numbet+1)
for (i in 1:numtri) {
  x <- sample(c(0,1), numbet, prob=c(prob, 1-prob), replace = TRUE)
  xcum[i, ] <- c(i, cumsum(x)/cumsum(1:numbet))
}
colnames(xcum) <- c("trial", paste("bet", 1:numbet, sep=""))
mxcum <- reshape(data.frame(xcum), varying=1+1:numbet, idvar="trial",
                 v.names="outcome", direction="long", timevar="bet")
```
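For comparison, a rough NumPy translation of what the R loop computes per trial (the translation and variable choices are mine; the question itself is about the R code above):

```python
import numpy as np

numbet, numtri, prob = 32, 10**5, 5/6

rng = np.random.default_rng()
# Each row is one trial of `numbet` bets, 1 with probability 1 - prob,
# matching the R sample() call above.
x = rng.choice([0, 1], size=(numtri, numbet), p=[prob, 1 - prob])
# Same per-row quantity the R code stores: cumulative wins divided by cumsum(1:numbet)
xcum = np.cumsum(x, axis=1) / np.cumsum(np.arange(1, numbet + 1))
print(xcum.shape)  # (numtri, numbet)
```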

How to determine probability of words?

亡梦爱人 submitted on 2019-12-06 05:19:06
Question: I have two documents. Doc1 is in the below format:

```
TOPIC: 0 5892.0
site 0.0371690427699
Internet 0.0261371350984
online 0.0229124236253
web 0.0218940936864
say 0.0159538357094

TOPIC: 1 12366.0
web 0.150331554262
site 0.0517548115801
say 0.0451237263464
Internet 0.0153647096879
online 0.0135856380398
```

...and so on till Topic 99 in the same pattern. And Doc2 is in the format:

```
0 0.566667 0 0.0333333 0 0 0 0.133333 ..........
```

and so on... There are 100 values in total, one value for each topic. Now
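The excerpt cuts off before the actual question, but if the goal is the probability of a word in a document, the usual way to combine the two files is to weight each topic's word probability by that document's topic proportion. A minimal Python sketch under that assumption (file parsing omitted; the dictionaries and names are illustrative, not the asker's data structures):

```python
# Assumes Doc1 gives P(word | topic) and Doc2 gives P(topic | document).
word_given_topic = {
    0: {"site": 0.0371690427699, "Internet": 0.0261371350984, "online": 0.0229124236253},
    1: {"web": 0.150331554262, "site": 0.0517548115801, "say": 0.0451237263464},
    # ... topics 2..99
}
topic_given_doc = [0.0, 0.566667, 0.0, 0.0333333]  # one weight per topic, from Doc2

def word_probability(word):
    # P(word | doc) = sum over topics of P(word | topic) * P(topic | doc)
    return sum(topic_given_doc[t] * word_given_topic.get(t, {}).get(word, 0.0)
               for t in range(len(topic_given_doc)))

print(word_probability("site"))
```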

Selecting nodes with probability proportional to trust

本小妞迷上赌 submitted on 2019-12-06 04:30:49
Does anyone know of an algorithm or data structure for selecting items with probability proportional to some attached value? In other words: http://en.wikipedia.org/wiki/Sampling_%28statistics%29#Probability_proportional_to_size_sampling The context here is a decentralized reputation system, and the attached value is therefore the value of trust one user has in another. In this system all nodes start either as friends, which are completely trusted, or as unknowns, which are completely untrusted. This isn't useful by itself in a large P2P network because there will be
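One standard way to select proportionally to an attached value (not necessarily what the asker ended up using) is roulette-wheel selection over a prefix sum of the weights. A minimal Python sketch:

```python
import bisect
import random

def weighted_pick(items, weights):
    """Pick one item with probability proportional to its weight
    (roulette-wheel selection over a prefix sum): O(n) to build, O(log n) per pick."""
    prefix, total = [], 0.0
    for w in weights:
        total += w
        prefix.append(total)
    r = random.uniform(0, total)
    return items[bisect.bisect_left(prefix, r)]

# Example: node "b" is trusted twice as much as "a"
nodes = ["a", "b", "c"]
trust = [1.0, 2.0, 0.5]
print(weighted_pick(nodes, trust))
```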

Generating random numbers with weighted probabilities in python

一个人想着一个人 submitted on 2019-12-06 03:53:36
Question: Given a positive integer array a, the goal is to generate 5 random numbers based on the weight they have in the array. For example:

a = [2,3,4,4,4,4,4,6,7,8,9]

In this case the number 4 appears 5 times, so it should have a probability of 5/11 of being selected. No numbers should be repeated.

Answer 1: Given a, an array of positive integers, you'll first need to compute the frequency of each integer. For example, using bincount:

>>> a = [2,3,4,4,4,4,4,4,5,6,7,8,9,4,9,2,3,6,3,1
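The answer is cut off above; a sketch of one way the idea can be completed with NumPy (this continuation is mine, not necessarily the original answer's): weight each distinct value by its frequency, then draw 5 distinct values with those weights.

```python
import numpy as np

a = [2, 3, 4, 4, 4, 4, 4, 6, 7, 8, 9]

values, counts = np.unique(a, return_counts=True)  # distinct values and their frequencies
probs = counts / counts.sum()                      # e.g. P(4) = 5/11 here
# replace=False enforces "no numbers should be repeated"
picks = np.random.choice(values, size=5, replace=False, p=probs)
print(picks)
```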

generating poisson variables in c++

守給你的承諾、 submitted on 2019-12-06 03:27:27
Question: I implemented this function to generate a Poisson random variable:

```cpp
typedef long unsigned int luint;

luint poisson(luint lambda) {
    double L = exp(-double(lambda));
    luint k = 0;
    double p = 1;
    do {
        k++;
        p *= mrand.rand();
    } while (p > L);
    return (k - 1);
}
```

where mrand is the MersenneTwister random number generator. I find that, as I increase lambda, the resulting distribution is wrong, with a mean that saturates at around 750. Is it due to numerical approximations or did I make any
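The saturation around 750 is consistent with exp(-lambda) underflowing to zero once lambda exceeds roughly 745, after which the loop only stops when the running product of uniforms itself underflows, which takes about 745 multiplications on average. A common workaround is to run Knuth's algorithm in log space; a Python sketch of that change (the question's code is C++, so this is only an illustration of the idea):

```python
import math
import random

def poisson_log_space(lam):
    """Knuth's algorithm rewritten in log space so nothing underflows:
    instead of multiplying uniforms until the product drops below exp(-lam),
    sum -log(uniform) until the sum exceeds lam."""
    k, s = 0, 0.0
    while True:
        s += -math.log(random.random() or 1e-300)  # guard against log(0)
        if s > lam:
            return k
        k += 1

# Sanity check: the sample mean should track lambda even for large lambda
lam = 5000
samples = [poisson_log_space(lam) for _ in range(1000)]
print(sum(samples) / len(samples))
```

For very large lambda the loop is still O(lambda) per draw, so a library generator (e.g. C++11's std::poisson_distribution) or a normal approximation is usually a better fit.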

Select random row from MySQL (with probability)

倾然丶 夕夏残阳落幕 submitted on 2019-12-06 01:44:32
Question: I have a MySQL table with a column called cur_odds, which holds the probability that the row should be selected. How do I make a query that actually selects the rows at approximately that frequency when you run, for example, 100 queries? I tried the following, but a row with a probability of 0.35 ends up getting selected around 60-70% of the time.

```sql
SELECT * FROM table ORDER BY RAND()*cur_odds DESC
```

All the values of cur_odds in the table add up to 1
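Ordering by RAND()*cur_odds returns the row whose weight-scaled uniform draw happens to be largest, which is not the same as selecting each row with probability cur_odds. A common alternative is to draw one uniform number and walk the cumulative weights; a Python sketch of that selection rule (the question itself is about MySQL, so this only illustrates the logic, with made-up rows):

```python
import random

# Illustrative rows: (row_id, cur_odds); the weights sum to 1 as in the question.
rows = [(1, 0.35), (2, 0.15), (3, 0.50)]

def pick_row(rows):
    """Cumulative-threshold selection: draw one uniform in [0, 1) and return
    the first row whose running total of cur_odds exceeds it."""
    r = random.random()
    running = 0.0
    for row_id, odds in rows:
        running += odds
        if r < running:
            return row_id
    return rows[-1][0]  # guard against floating-point round-off

# Over many picks, row 1 should come up about 35% of the time
counts = {rid: 0 for rid, _ in rows}
for _ in range(10000):
    counts[pick_row(rows)] += 1
print(counts)
```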

How can I find out how many rows of a matrix satisfy a rather complicated criterion (in R)?

岁酱吖の submitted on 2019-12-05 22:00:43
As an example, here is a way to get a matrix of all possible outcomes of rolling 4 (fair) dice.

```r
z <- as.matrix(expand.grid(c(1:6), c(1:6), c(1:6), c(1:6)))
```

As you may already have understood, I'm trying to work out a question that was closed, though, in my opinion, it's a challenging one. I used counting techniques to solve it (I mean by hand) and I finally arrived at a count of outcomes with a subset summing to 5 equal to 1083 out of 1296. That result is consistent with the answers provided to that question before it was closed. I was wondering how that subset of outcomes (say z1,
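A brute-force check of the hand count is straightforward; a Python sketch (the question itself is in R, and "some non-empty subset of the four values sums to 5" is my reading of the criterion):

```python
from itertools import product, combinations

def has_subset_summing_to(roll, target=5):
    # True if any non-empty subset of the dice values sums to `target`
    return any(sum(c) == target
               for r in range(1, len(roll) + 1)
               for c in combinations(roll, r))

outcomes = list(product(range(1, 7), repeat=4))   # 1296 ordered rolls, like expand.grid
matching = [roll for roll in outcomes if has_subset_summing_to(roll)]
print(len(outcomes), len(matching))  # 1296 total; compare the second number with 1083
```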

Ruby - Picking an element in an array with 50% chance for a[0], 25% chance for a[1]

大城市里の小女人 submitted on 2019-12-05 20:08:39
Nothing too complicated, basically I just want to pick an element from the array as if I were making coin tosses for each index and choosing the index when I first get a head. Also, no heads means I choose the last bin. I came up with the following and was wondering if there was a better/more efficient way of doing this.

```ruby
def coin_toss(size)
  random_number = rand(2**size)
  if random_number == 0
    return size-1
  else
    return (0..size-1).detect { |n| random_number[n] == 1 }
  end
end
```

First guess...pick a random number between 1 and 2**size, find the log base 2 of that, and pick the number that many
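For reference, the same "coin toss per index" behaviour written directly in Python (this mirrors the description above, not the asker's Ruby implementation):

```python
import random
from collections import Counter

def coin_toss_index(size):
    """Pick an index with P(0)=1/2, P(1)=1/4, ... and give the last index the
    leftover probability (the 'no heads' case)."""
    for i in range(size - 1):
        if random.random() < 0.5:  # "head" on the coin for index i
            return i
    return size - 1                # no heads: fall through to the last bin

# Quick check of the frequencies for a 4-element array
print(Counter(coin_toss_index(4) for _ in range(100000)))
```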

naive classifier matlab

好久不见. submitted on 2019-12-05 19:52:07
When testing the naive classifier in MATLAB I get different results even though I trained and tested on the same sample data. I was wondering if my code is correct and if someone could help explain why this is?

```matlab
%% dimensionality reduction
columns = 6
[U,S,V] = svds(fulldata, columns);

%% randomly select dataset
rows = 1000;
columns = 6;

%# pick random rows
indX = randperm( size(fulldata,1) );
indX = indX(1:rows)';

%# pick random columns
%indY = randperm( size(fulldata,2) );
indY = indY(1:columns);

%# filter data
data = U(indX,indY);

%% apply normalization method to every cell
data = zscore(data);
```