Select cells randomly from NumPy array - without replacement

前端未结


I'm writing some modelling routines in NumPy that need to select cells randomly from a NumPy array and do some processing on them. All cells must be selected without replacement.


6 answers

挽巷

2021-02-19 16:53

Use random.sample to generate ints in 0 .. A.size - 1 with no duplicates, then split them into index pairs:

import random
import numpy as np

def randint2_nodup(nsample, A):
    """ uniform int pairs, no dups:
        r = randint2_nodup(nsample, A)
        A[r]
        for jk in zip(*r):
            ... A[jk]
    """
    assert A.ndim == 2
    # distinct flat indices in 0 .. A.size - 1 (range replaces Python 2's xrange)
    sample = np.array(random.sample(range(A.size), nsample))
    return sample // A.shape[1], sample % A.shape[1]  # (row, column) index arrays


if __name__ == "__main__":
    import sys

    nsample = 8
    ncol = 5
    exec("\n".join(sys.argv[1:]))  # e.g. run: this.py nsample=6 ncol=4
    A = np.arange(0, 2 * ncol).reshape((2, ncol))

    r = randint2_nodup(nsample, A)
    print("r:", r)
    print("A[r]:", A[r])
    for jk in zip(*r):
        print(jk, A[jk])


谎友^

2021-02-19 17:01

If you are using NumPy 1.7 or later, you can also use the built-in function numpy.random.choice with replace=False.
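
A minimal sketch of that approach, assuming a 2D array `A` and a sample size `nsample` (both names are placeholders chosen for illustration): `numpy.random.choice` with `replace=False` draws distinct flat indices, and `numpy.unravel_index` turns them into row/column pairs.

```python
import numpy as np

A = np.arange(10).reshape((2, 5))   # example array (assumed for illustration)
nsample = 8                         # number of cells to draw (assumed)

# Draw nsample distinct flat indices from 0 .. A.size - 1
flat = np.random.choice(A.size, size=nsample, replace=False)

# Convert the flat indices to (row, column) index arrays
rows, cols = np.unravel_index(flat, A.shape)

for j, k in zip(rows, cols):
    print((j, k), A[j, k])
```

Asking for more samples than `A.size` with `replace=False` raises a `ValueError`, which is a handy guard against accidental resampling.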


上瘾入骨i

2021-02-19 17:08
Let's say you have an array of data points of size 8x3:
```
data = np.arange(50,74).reshape(8,-1)
```
If you truly want to sample all the indices as 2D pairs, as you say, the most compact way I can think of is:
```
#generate a permutation of data's size, coerced to data's shape
idxs = divmod(np.random.permutation(data.size),data.shape[1])

#iterate over it
for x,y in zip(*idxs): 
    #do something to data[x,y] here
    pass
```
More generally, though, you often don't need to access a 2D array as a 2D array just to shuffle it, in which case you can be even more compact: make a 1D view onto the array and save yourself some index-wrangling.
```
flat_data = data.ravel()
flat_idxs = np.random.permutation(flat_data.size)
for i in flat_idxs:
    #do something to flat_data[i] here
    pass
```
Changes made through flat_data still reach the original 2D array, because ravel() returns a view here. To see this, try:
```
flat_data[12] = 1000000
print(data[4, 0])
# prints 1000000
```
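
The write-through behaviour above relies on `ravel()` returning a view, which it does only when no copy is needed. A quick way to check, sketched with `np.shares_memory`; the transposed array is just an example picked here of a layout that forces a copy:

```python
import numpy as np

data = np.arange(50, 74).reshape(8, -1)

# C-contiguous array: ravel() returns a view onto the same memory
flat_view = data.ravel()
print(np.shares_memory(data, flat_view))   # True

# Transposed array is not C-contiguous: ravel() has to copy
flat_copy = data.T.ravel()
print(np.shares_memory(data, flat_copy))   # False
```

If the check prints False, writes to the flattened array will not show up in the original, so index the 2D array directly in that case.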

甜味超标

2021-02-19 17:11
Extending the nice answer from @WoLpH

For a 2D array I think it will depend on what you want or need to know about the indices.

You could do something like this:
```
data = np.arange(25).reshape((5,5))

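# an all-True mask makes np.where return every (row, column) index
x, y = np.where(np.ones(data.shape, dtype=bool))
idx = list(zip(x, y))  # a list, so it can be shuffled in place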
np.random.shuffle(idx)
```
OR
```
data = np.arange(25).reshape((5,5))

grid = np.indices(data.shape)
idx = list(zip(grid[0].ravel(), grid[1].ravel()))  # a list, so it can be shuffled in place
np.random.shuffle(idx)
```
You can then iterate over the list idx to visit the 2D array indices in random order and read the value at each index; data itself remains unchanged.

Note: You could also generate the index pairs with itertools.product and then shuffle them, in case you are more comfortable with that set of tools.
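
A rough sketch of the itertools.product variant, under the assumption that the standard library's `random.shuffle` is acceptable for the shuffling step (the answer does not say which shuffle to use):

```python
import itertools
import random

import numpy as np

data = np.arange(25).reshape((5, 5))

# All (row, column) pairs, generated in row-major order
idx = list(itertools.product(range(data.shape[0]), range(data.shape[1])))

# product() yields the pairs in order, so shuffle them explicitly
random.shuffle(idx)

for j, k in idx:
    value = data[j, k]   # read the cell; data itself is not modified
```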

温柔的废话

2021-02-19 17:15
How about using numpy.random.shuffle or numpy.random.permutation if you still need the original array?

If you need to change the array in-place, then you can create an index array like this:
```
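import numpy

your_array = numpy.arange(100).reshape(10, 10)  # stand-in for <some numpy array>
index_array = numpy.arange(your_array.size)
numpy.random.shuffle(index_array)

# index the flattened array so this also works for multi-dimensional input
print(your_array.ravel()[index_array[:10]])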
```
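
For reference, a small sketch of the difference between the two functions mentioned at the top of this answer; the variable names are made up for the example:

```python
import numpy as np

original = np.arange(10)

# permutation() returns a shuffled copy; the input is left untouched
shuffled_copy = np.random.permutation(original)

# shuffle() reorders its argument in place and returns None
in_place = original.copy()
result = np.random.shuffle(in_place)
```

Note that for a multi-dimensional array both functions only reorder along the first axis, which is why shuffling a flat index array is the safer route for cell-level sampling.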

耶瑟儿～

2021-02-19 17:18
All of these answers seemed a little convoluted to me.

I'm assuming that you have a multi-dimensional array from which you want to generate an exhaustive list of indices. You'd like these indices shuffled so you can then access each of the array elements in random order.

The following code will do this in a simple and straightforward manner:
```
#!/usr/bin/env python3
import numpy as np

# Define a two-dimensional array
# Use any number of dimensions, and dimensions of any size
d = np.zeros(30).reshape((5, 6))

# Get a list of index tuples for an array of this shape
indices = list(np.ndindex(d.shape))

# Shuffle the indices in-place
np.random.shuffle(indices)

# Access array elements using the indices to do cool stuff
for i in indices:
    d[i] = 5

print(d)
```
Printing d verifies that all elements have been accessed.

Note that the array can have any number of dimensions and that the dimensions can be of any size.

The only downside to this approach is that if d is large, then indices may become pretty sizable. Therefore, it would be nice to have a generator. Sadly, I can't think offhand of how to build a shuffled iterator.
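
One possible sketch of such a generator, not a full answer to the memory concern: it still holds a permutation of flat integers (one integer per cell), but it avoids building the list of index tuples. The function name `shuffled_indices` is hypothetical.

```python
import numpy as np

def shuffled_indices(shape):
    """Yield every index tuple of an array with this shape, in random order."""
    size = int(np.prod(shape))
    for flat in np.random.permutation(size):
        # Convert one flat index into a tuple of per-axis indices
        yield tuple(int(i) for i in np.unravel_index(flat, shape))

d = np.zeros((5, 6))
for i in shuffled_indices(d.shape):
    d[i] = 5
```

The tuples are produced one at a time, so only the integer permutation stays in memory for the whole loop.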
