numpy-random

Randomly selecting from Pandas groups with equal probability — unexpected behavior

这一生的挚爱 submitted on 2021-02-10 04:57:02
Question: I have 12 unique groups that I am trying to randomly sample from, each with a different number of observations. I want to randomly sample from the entire population (dataframe), with each group having the same probability of being selected. The simplest example of this would be a dataframe with 2 groups:

      groups  probability
    0      a         0.25
    1      a         0.25
    2      b         0.50

Using np.random.choice(df['groups'], p=df['probability'], size=100), each iteration will now have a 50% chance of selecting group a and a 50%
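
The probability column in the example can be derived rather than typed by hand: each row gets 1 / (number of groups x size of its group), so every group's rows sum to the same total. The sketch below is one possible way to do that, not the asker's code. It works on a plain NumPy array of labels (for a dataframe, something like df['groups'].to_numpy() would be the input), and the helper name equal_group_weights is hypothetical.

```python
import numpy as np

def equal_group_weights(groups):
    """Per-row probabilities that give every group the same total mass.

    Hypothetical helper for illustration; not part of NumPy or pandas.
    """
    groups = np.asarray(groups)
    # inverse maps each row to the index of its group,
    # counts holds the number of rows in each group
    _, inverse, counts = np.unique(groups, return_inverse=True, return_counts=True)
    n_groups = counts.size
    return 1.0 / (n_groups * counts[inverse])

groups = np.array(["a", "a", "b"])
weights = equal_group_weights(groups)  # matches the 0.25 / 0.25 / 0.5 column above

rng = np.random.default_rng(0)
draws = rng.choice(groups, size=10_000, p=weights)
share_a = np.mean(draws == "a")  # should be close to 0.5
```

With these weights the chance of landing in a group no longer depends on how many observations the group has, which is the behavior the question asks for.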

Randomly selecting from Pandas groups with equal probability — unexpected behavior

好久不见. submitted on 2021-02-10 04:45:42

Split a list into n randomly sized chunks

不羁岁月 submitted on 2021-01-28 21:46:03
Question: I am trying to split a list into n sublists where the size of each sublist is random (with at least one entry; assume P>I ). I used the numpy.split function, which works fine but does not satisfy my randomness condition. You may ask which distribution the randomness should follow; I think it should not matter. I checked several posts, but they were not equivalent to mine because they split the list into almost equally sized chunks. If this is a duplicate, let me know. Here is my approach: import numpy as
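
One way to meet the "random sizes, at least one entry each" condition is to pick n - 1 distinct cut positions strictly inside the list and slice at them. The sketch below is an illustration under that assumption and is not the asker's approach, which is cut off above. The function name random_chunks is hypothetical.

```python
import numpy as np

def random_chunks(items, n, rng=None):
    """Split items into n consecutive, non-empty chunks of random size.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    if not 1 <= n <= len(items):
        raise ValueError("n must be between 1 and len(items)")
    # Choose n - 1 distinct cut positions from 1 .. len(items) - 1.
    # Distinct interior cuts guarantee that no chunk is empty.
    cuts = np.sort(rng.permutation(len(items) - 1)[: n - 1] + 1)
    bounds = [0, *cuts.tolist(), len(items)]
    return [items[start:stop] for start, stop in zip(bounds[:-1], bounds[1:])]

chunks = random_chunks(list(range(10)), 4, np.random.default_rng(0))
```

Every set of cut positions is equally likely here, so every composition of the list into n non-empty parts is equally likely. If a different size distribution is wanted, only the line that draws the cuts would need to change.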

What is the difference between numpy.random's Generator class and np.random methods?

徘徊边缘 submitted on 2020-12-23 02:51:50
Question: I have been using numpy's random functionality for a while, by calling functions such as np.random.choice() or np.random.randint(). I just found out that it is possible to create a default_rng object, or other Generator objects:

    from numpy.random import default_rng
    gen = default_rng()
    random_number = gen.integers(10)

So far I would have always used np.random.randint(10) instead, and I am wondering what the difference between the two approaches is. The only benefit I can think of would be keeping track
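
A short side-by-side sketch of the two styles may help. The np.random.* functions share one hidden global state, while default_rng() returns a Generator object that carries its own state and can be passed around explicitly. The seed values and variable names below are arbitrary choices for illustration.

```python
import numpy as np
from numpy.random import default_rng

# Legacy style: np.random.seed() sets one global state that every
# np.random.* call in the process reads from and advances.
np.random.seed(42)
legacy_value = np.random.randint(10)  # integer in [0, 10)

# Generator style: each object owns its state, so two generators built
# from the same seed produce the same stream independently of each other.
rng_a = default_rng(42)
rng_b = default_rng(42)
first = rng_a.integers(10, size=5)

# Drawing from the global state in between does not disturb rng_b.
np.random.randint(10, size=100)
second = rng_b.integers(10, size=5)

streams_match = np.array_equal(first, second)
```

Like np.random.randint, Generator.integers excludes the upper bound by default, so gen.integers(10) and np.random.randint(10) cover the same range even though they do not produce the same numbers for a given seed.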

What is the difference between numpy.random's Generator class and np.random methods?

一个人想着一个人 submitted on 2020-12-23 02:48:12

What is the difference between numpy.random's Generator class and np.random methods?

情到浓时终转凉″ submitted on 2020-12-23 02:47:33

1D Wasserstein distance in Python

痴心易碎 submitted on 2020-12-13 05:25:50
Question: The formula below is a special case of the Wasserstein distance/optimal transport when the source and target distributions, x and y (also called marginal distributions), are 1D, that is, are vectors.

[formula not shown]

Here F^{-1} are the inverse cumulative distribution functions (quantile functions) of the marginals u and v, derived from real data called x and y, both generated from the normal distribution:

    import numpy as np
    from numpy.random import randn
    import scipy.stats as ss

    n = 100
    x = randn(n
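
The formula itself is missing from this excerpt. Assuming it is the usual quantile form, W_p(u, v) = (integral over q in [0, 1] of |F_u^{-1}(q) - F_v^{-1}(q)|^p dq)^(1/p), the empirical version for two samples of equal size reduces to comparing sorted values, because the sorted sample is the empirical quantile function. The sketch below follows that assumption. The function name wasserstein_1d, the default p = 1, and the shift of 2.0 are illustrative choices, not taken from the question.

```python
import numpy as np

def wasserstein_1d(x, y, p=1):
    """Empirical p-Wasserstein distance between two equal-sized 1D samples.

    Hypothetical helper; assumes the quantile-function form of the distance.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("this sketch assumes equal-sized 1D samples")
    # The i-th sorted value of each sample is its empirical quantile at the
    # same level, so the integral becomes a mean over matched order statistics.
    return np.mean(np.abs(x - y) ** p) ** (1 / p)

rng = np.random.default_rng(0)
n = 100
x = rng.standard_normal(n)
y = rng.standard_normal(n) + 2.0  # second sample, shifted by 2

distance = wasserstein_1d(x, y)
shift_only = wasserstein_1d(x, x + 2.0)  # a pure shift moves every point by 2
```

For p = 1, SciPy's scipy.stats.wasserstein_distance computes the same quantity and also handles samples of different sizes, so it can serve as a cross-check.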

1D Wasserstein distance in Python

久未见 submitted on 2020-12-13 05:24:03

1D Wasserstein distance in Python

十年热恋 submitted on 2020-12-13 05:23:13

Why does numpy.random.Generator.choice provide different results (seeded) with a given uniform distribution compared to the default uniform distribution?

前提是你 submitted on 2020-07-10 10:27:05
Question: Simple test code:

    pop = numpy.arange(20)
    rng = numpy.random.default_rng(1)
    rng.choice(pop, p=numpy.repeat(1/len(pop), len(pop)))  # yields 10
    rng = numpy.random.default_rng(1)
    rng.choice(pop)  # yields 9

The numpy documentation says:

    The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.

I don't know of any other way to create a uniform distribution than numpy.repeat(1/len(pop), len(pop)). Is numpy using something else? Why
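
A likely explanation, as far as I understand NumPy's implementation, is that the two calls describe the same distribution but take different code paths: without p the index comes from a bounded-integer draw, while with p a uniform float is drawn and looked up in the cumulative sum of p. The same seed therefore does not have to give the same value. The sketch below does not rely on that internal detail. It only checks empirically that both calls are uniform, with an arbitrary sample size of 200,000.

```python
import numpy as np

pop = np.arange(20)
p_uniform = np.full(len(pop), 1 / len(pop))
n_draws = 200_000  # arbitrary sample size for the empirical check

# Same seed, once with an explicit uniform p and once without p.
with_p = np.random.default_rng(1).choice(pop, size=n_draws, p=p_uniform)
without_p = np.random.default_rng(1).choice(pop, size=n_draws)

# Relative frequency of each of the 20 values under both calls.
freq_with_p = np.bincount(with_p, minlength=len(pop)) / n_draws
freq_without_p = np.bincount(without_p, minlength=len(pop)) / n_draws

# Whether the two seeded streams happen to be identical element by element.
same_stream = np.array_equal(with_p, without_p)
```

Both frequency vectors should sit close to 0.05 per value, so the difference seen in the question concerns how the random bits are consumed and not which distribution is sampled.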