probability | 易学教程

Optimal Algorithm for Winning Hangman

Question: In the game Hangman, is a greedy letter-frequency algorithm equivalent to a best-chance-of-winning algorithm? Is there ever a case where it is worth sacrificing some of your remaining lives for a better chance of guessing the correct answer? Further clarification of the problem: the word to be guessed is taken from a known dictionary. You are given N lives, and thus have to maximise the probability of guessing all the letters in the word
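As a point of reference for the question, here is a minimal sketch of the greedy letter-frequency strategy it describes. The function names `greedy_guess` and `filter_candidates` are hypothetical, and the sketch only picks the letter present in the most remaining dictionary words; it does not try to answer whether that choice also maximises the chance of winning.

```python
from collections import Counter


def greedy_guess(candidates, guessed):
    """Return the unguessed letter that occurs in the most candidate words."""
    counts = Counter()
    for word in candidates:
        # Count each letter once per word: what matters is whether a guess hits.
        counts.update(set(word) - set(guessed))
    if not counts:
        return None
    # Break ties alphabetically so the result is deterministic.
    return min(counts, key=lambda letter: (-counts[letter], letter))


def filter_candidates(candidates, letter, positions):
    """Keep only words in which `letter` occurs at exactly the given positions."""
    wanted = set(positions)
    return [
        word for word in candidates
        if {i for i, ch in enumerate(word) if ch == letter} == wanted
    ]
```

After each guess, the revealed positions (an empty list for a miss) are passed to `filter_candidates`, and the next guess is computed from the words that remain.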

scikit-learn return value of LogisticRegression.predict_proba

Question: What exactly does the LogisticRegression.predict_proba function return? In my example I get a result like this: [[ 4.65761066e-03 9.95342389e-01] [ 9.75851270e-01 2.41487300e-02] [ 9.99983374e-01 1.66258341e-05]] From other calculations using the sigmoid function, I know that the second column contains probabilities. The documentation says that the first column is n_samples, but that can't be, because my samples are reviews, which are texts and not numbers. The documentation also says that
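For a binary problem, predict_proba returns an array of shape (n_samples, n_classes): one row per sample and one column per class, in the order of the fitted model's classes_ attribute, so each row sums to 1. The sketch below reproduces that layout in plain Python without scikit-learn; `binary_proba_rows` and its `scores` argument (the linear decision values) are hypothetical names used only for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_proba_rows(scores):
    """Build rows [P(class 0), P(class 1)] from linear decision values."""
    rows = []
    for z in scores:
        p1 = sigmoid(z)               # probability of the second class
        rows.append([1.0 - p1, p1])   # first column is the complement
    return rows
```

Under this reading, the first column in the question's output is the probability of the first class, not a sample count, which matches the fact that each printed row adds up to 1.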

Returning a random value from array with probability proportional to its value

I have an array like $keywords = array('apple'=>10,'orange'=>2,'grape'=>12); I want to randomly pick one of the keys from the array. However, the probability of picking an element should be proportional to its value. Add all values (10+2+12 is 24); get a random number in the range [0, 24), and pick the corresponding element depending on whether the number lies in [0, 10), [10, 12), or [12, 24). I'd do it like this: $probabilities = array('apple'=>50, 'orange'=>20, 'banana'=>10); function random_probability($probabilities) { $rand = rand(0, array
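The interval method described above (sum the weights, draw a number in [0, total), then find the interval it falls into) can be sketched in Python as follows. The names `pick_by_threshold` and `weighted_pick` are hypothetical; this is an illustration of the idea, not a completion of the truncated PHP function.

```python
import random


def pick_by_threshold(weights, r):
    """Return the key whose cumulative-weight interval contains r."""
    cumulative = 0
    for key, weight in weights.items():
        cumulative += weight
        if r < cumulative:
            return key
    raise ValueError("r must lie in [0, sum of weights)")


def weighted_pick(weights, rng=random):
    """Pick a key with probability proportional to its weight."""
    total = sum(weights.values())
    # rng.random() is in [0, 1), so the scaled value is in [0, total).
    return pick_by_threshold(weights, rng.random() * total)
```

With {'apple': 10, 'orange': 2, 'grape': 12}, values in [0, 10) select apple, [10, 12) select orange, and [12, 24) select grape. In Python specifically, the standard library's random.choices accepts a weights argument and covers the same use case.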

Uniform distribution from a fractal Perlin noise function in C#

My Perlin noise function (which adds up 6 octaves of 3D simplex at 0.75 persistence) generates a 2D array of doubles. These numbers each come out normalized to [-1, 1], with mean at 0. I clamp them to avoid exceptions, which I think are due to floating-point accuracy issues, but I am fairly sure my scaling factor is good enough for restricting the noise output to exactly this range in the ideal case. Anyway, those are just details. The point is, here is a 256-by-256 array of noise: The histogram with a normal fit looks like this: MATLAB's lillietest is a function which applies the
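One generic way to get a flat histogram from bell-shaped samples is a rank transform (empirical CDF), sketched below in Python. This is an assumption about what would suit the question, not the approach taken in the original thread, which is truncated here. The function name `equalize` is hypothetical, and the 2D array is assumed to have been flattened to a list first.

```python
def equalize(values):
    """Map each value to (0, 1) by its rank, giving a flat output histogram."""
    n = len(values)
    # Indices of the inputs, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    for rank, index in enumerate(order):
        # Centre each rank in its 1/n-wide bin.
        result[index] = (rank + 0.5) / n
    return result
```

The transform preserves the ordering of the samples but discards their original spacing, so it fits cases where only the relative height of the noise matters.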

What is O value for naive random selection from finite set?

Question: This question on getting random values from a finite set got me thinking... It's fairly common for people to want to retrieve X unique values from a set of Y values. For example, I may want to deal a hand from a deck of cards. I want 5 cards, and I want them all to be unique. Now, I can do this naively, by picking a random card 5 times and trying again each time I get a duplicate, until I have 5 cards. This isn't so great, however, for large numbers of values from large sets. If I wanted 999,999
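The cost of the naive retry approach can be made concrete. When i unique values have already been found, a draw succeeds with probability (Y - i)/Y, so the expected number of draws to collect X unique values is the sum of Y/(Y - i) for i from 0 to X - 1. The Python sketch below computes that sum and simulates the naive method; the function names are hypothetical.

```python
import random


def expected_draws(x, y):
    """Expected draws for the naive retry method to get x unique values of y."""
    if not 0 <= x <= y:
        raise ValueError("need 0 <= x <= y")
    return sum(y / (y - i) for i in range(x))


def naive_unique_sample(x, y, rng=random):
    """Draw with replacement until x distinct values in range(y) are seen."""
    chosen = set()
    draws = 0
    while len(chosen) < x:
        chosen.add(rng.randrange(y))  # a duplicate leaves the set unchanged
        draws += 1
    return chosen, draws
```

For 5 cards from 52 the expectation is only slightly above 5 draws, but as X approaches Y the last few values dominate the cost. In Python, random.sample(range(y), x) returns x unique values without the retry loop.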

What is the probability of collision with a 6 digit random alphanumeric code?

Question: I'm using the following Perl code to generate random alphanumeric strings (uppercase letters and numbers only) to use as unique identifiers for records in my MySQL database. The database is likely to stay under 1,000,000 rows, but the absolute realistic maximum would be around 3,000,000. Do I have a dangerous chance of 2 records having the same random code, or is it likely to happen an insignificantly small number of times? I know very little about probability (if that isn't already
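The numbers in the question can be checked with a birthday-problem estimate. The Python sketch below assumes every code is drawn uniformly and independently from the 36^6 possible strings (26 uppercase letters plus 10 digits, length 6); whether the Perl generator in the question behaves that way is not shown here. The function name `collision_stats` is hypothetical.

```python
import math


def collision_stats(n_records, alphabet_size=36, length=6):
    """Return (code space size, expected colliding pairs, P(any collision))."""
    space = alphabet_size ** length
    # Each of the n*(n-1)/2 pairs collides with probability 1/space.
    expected_pairs = n_records * (n_records - 1) / (2 * space)
    # Standard birthday approximation: 1 - exp(-expected_pairs).
    p_any = -math.expm1(-expected_pairs)
    return space, expected_pairs, p_any
```

Under these assumptions the code space holds 2,176,782,336 values, and 1,000,000 records give roughly 230 colliding pairs on average, so at least one collision is practically certain. A unique index plus a retry on conflict, or a longer code, would be needed to rely on uniqueness.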

Generate Random Boolean Probability

Question: I only know how to generate a random boolean value (true/false). The default probability is 50:50. But how can I generate a true/false value with my own probability? Let's say it returns true with a probability of 40:60 or 20:80 etc... Answer 1: Well, one way is Random.Next(100) < 20, using the integer value returned by Next to force your own probability; Next(100) returns an integer from 0 to 99, so the comparison is true 20% of the time. I can't speak to the true 'randomness' of this method though. More detailed example: Random gen = new Random(); int prob = gen.Next
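The same idea can be sketched in Python by comparing a uniform draw in [0, 1) against the desired probability. This is an illustration rather than a translation of the truncated C# example, and the function name `biased_bool` is hypothetical.

```python
import random


def biased_bool(p_true, rng=random):
    """Return True with probability p_true (a float between 0 and 1)."""
    if not 0.0 <= p_true <= 1.0:
        raise ValueError("p_true must be between 0 and 1")
    # rng.random() is uniform on [0, 1), so this is True with probability p_true.
    return rng.random() < p_true
```

A 40:60 split corresponds to biased_bool(0.4) and a 20:80 split to biased_bool(0.2). Using a strict less-than keeps the edge cases exact: 0.0 never returns True and 1.0 always does.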

Probability of Outcomes Algorithm

Question: I have a probability problem which I need to simulate in a reasonable amount of time. In simplified form, I have 30 unfair coins, each with a different known probability. I then want to ask things like "what is the probability that exactly 12 will be heads?", or "what is the probability that AT LEAST 5 will be tails?". I know basic probability theory, so I know I can enumerate all (30 choose x) possibilities, but that's not particularly scalable. The worst case (30 choose 15) has over 150
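Enumeration can be avoided with a dynamic program over the coins: keep the distribution of the head count so far and fold in one coin at a time, which takes on the order of n^2 steps for n coins. The Python sketch below assumes the coins are independent; the function names are hypothetical.

```python
def heads_distribution(probs):
    """Return dist where dist[k] is the probability of exactly k heads."""
    dist = [1.0]  # with zero coins, zero heads is certain
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # this coin lands tails
            nxt[k + 1] += mass * p       # this coin lands heads
        dist = nxt
    return dist


def prob_exactly_heads(probs, k):
    """Probability that exactly k coins show heads."""
    return heads_distribution(probs)[k]


def prob_at_least_tails(probs, t):
    """Probability of at least t tails, i.e. at most len(probs) - t heads."""
    n = len(probs)
    if t > n:
        return 0.0
    return sum(heads_distribution(probs)[: n - t + 1])
```

For the question's case, prob_exactly_heads(probs, 12) and prob_at_least_tails(probs, 5) with a list of 30 probabilities each need only a few hundred multiply-adds.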

Generate random numbers distributed by Zipf

Question: The Zipf probability distribution is often used to model file size distributions or item access distributions in P2P systems (e.g. "Web Caching and Zipf-like Distributions: Evidence and Implications"), but neither Boost nor the GSL (GNU Scientific Library) provides an implementation to generate random numbers using this distribution. I have not found a (trustworthy) implementation using the common search engines. How can random numbers that are distributed according to the Zipf distribution
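One straightforward approach, assuming a finite number of ranks n and an exponent s so that rank k has probability proportional to 1/k^s, is inverse-CDF sampling: precompute the cumulative probabilities once and binary-search them for each draw. The Python sketch below illustrates this; it is not taken from the original thread, and `make_zipf_sampler` is a hypothetical name.

```python
import bisect
import itertools
import random


def make_zipf_sampler(n, s, rng=random):
    """Return a function that samples ranks 1..n with P(k) proportional to 1/k**s."""
    weights = [1.0 / (k ** s) for k in range(1, n + 1)]
    total = sum(weights)
    cdf = list(itertools.accumulate(w / total for w in weights))
    cdf[-1] = 1.0  # guard against rounding so every draw maps to a rank

    def sample():
        # First index whose cumulative probability exceeds the uniform draw.
        return bisect.bisect_right(cdf, rng.random()) + 1

    return sample
```

Setup costs O(n) time and memory and each draw costs O(log n), which is usually acceptable when the number of items is known in advance. For very large n, a rejection-based method avoids storing the table.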

Estimating a probability given other probabilities from a prior

I have a bunch of data coming in (calls to an automated call center) about whether or not a person buys a particular product: 1 for buy, 0 for not buy. I want to use this data to estimate the probability that a person will buy a particular product, but the problem is that I may need to do it with relatively little historical data about how many people bought or didn't buy that product. A friend recommended that with Bayesian probability you can "help" your probability estimate by coming up with a "prior probability distribution"; essentially, this is information about what you expect to see,
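A common way to encode such a prior for buy/no-buy data is a Beta distribution, which is an assumption here since the excerpt does not name one. With a Beta(a, b) prior, observing some buys out of a number of calls gives a Beta posterior whose mean is (a + buys) / (a + b + calls). In the Python sketch below, `posterior_mean` and its parameter names are hypothetical; the two prior parameters can be read as "pseudo-observations" that express what you expect before seeing data.

```python
def posterior_mean(prior_buys, prior_no_buys, buys, calls):
    """Posterior mean buy probability under a Beta(prior_buys, prior_no_buys) prior."""
    if not 0 <= buys <= calls:
        raise ValueError("need 0 <= buys <= calls")
    alpha = prior_buys + buys                 # prior plus observed buys
    beta = prior_no_buys + (calls - buys)     # prior plus observed non-buys
    return alpha / (alpha + beta)
```

With a prior of 2 buys and 8 non-buys, the estimate starts at 0.2 with no data and moves toward the observed rate as calls accumulate, so a handful of early results cannot swing it to 0 or 1.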