How to create a 2D array with numpy random.choice for every row?

我寻月下人不归 2020-11-30 15:30

I'm trying to create a 2D array (six columns and many rows) with numpy random.choice, where the values are between 1 and 50 and unique within each row, not across the whole array.

4 Answers
  •  南笙 (OP)
    2020-11-30 16:17

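    Here's a vectorized approach using the rand + argsort/argpartition trick: draw 50 uniform random numbers per row and take the column indices of the 6 smallest, which are distinct within each row; adding 1 maps them to 1..50 -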

    np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
    

    Sample run -

    In [41]: rows = 10
    
    In [42]: np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
    Out[42]: 
    array([[ 1,  9,  3, 26, 14, 44],
           [32, 20, 27, 13, 25, 45],
           [40, 12, 47, 16, 10, 29],
           [ 6, 36, 32, 16, 18,  4],
           [42, 46, 24,  9,  1, 31],
           [15, 25, 47, 42, 34, 24],
           [ 7, 16, 49, 31, 40, 20],
           [28, 17, 47, 36,  8, 44],
           [ 7, 42, 14,  4, 17, 35],
           [39, 19, 37,  7,  8, 36]])
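
The one-liner above can be wrapped in a small function and checked for the property the question asks for (no repeats within a row). This is a minimal sketch, not part of the original answer: the name `unique_rows_sample`, its parameters, and the use of `np.random.default_rng` are assumptions made for illustration. It uses `kth = k - 1`, which is enough to place the `k` smallest keys in the first `k` columns; the original's `kth = 6` does so as well.

```python
import numpy as np

def unique_rows_sample(rows, high=50, k=6, seed=None):
    """Return a (rows, k) array; each row holds k distinct integers in 1..high."""
    rng = np.random.default_rng(seed)
    # One uniform random key per candidate value, per row.
    keys = rng.random((rows, high))
    # Column indices of the k smallest keys in each row (in arbitrary order).
    # Indices within a row are distinct, so no value repeats in that row.
    idx = np.argpartition(keys, k - 1, axis=1)[:, :k]
    return idx + 1  # shift 0..high-1 to 1..high

out = unique_rows_sample(1000, seed=0)
# Sort each row; strictly increasing neighbours mean the row has no duplicates.
s = np.sort(out, axis=1)
no_repeats = bool((np.diff(s, axis=1) > 0).all())
print(out.shape, no_repeats)
```

Different rows may still share values, which matches the requirement of uniqueness per row rather than across the whole array.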
    

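    As a check on uniformity, count how often each value occurs across one million rows; each count should be close to 6,000,000 / 50 = 120,000 -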

    In [56]: rows = 1000000
    
    In [57]: out = np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
    
    In [58]: np.bincount(out.ravel())[1:]
    Out[58]: 
    array([120048, 120026, 119942, 119838, 119885, 119669, 119965, 119491,
           120280, 120108, 120293, 119399, 119917, 119974, 120195, 119796,
           119887, 119505, 120235, 119857, 119499, 120560, 119891, 119693,
           120081, 120369, 120011, 119714, 120218, 120581, 120111, 119867,
           119791, 120265, 120457, 120048, 119813, 119702, 120266, 120445,
           120016, 120190, 119576, 119737, 120153, 120215, 120144, 120196,
           120218, 119863])
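
The same check can be made quantitative by comparing each count with its expected value, `rows * 6 / 50`. The sketch below is an illustration rather than part of the original answer; it assumes a smaller row count (100,000) to keep memory modest and a fixed seed so the run is repeatable.

```python
import numpy as np

rows, high, k = 100_000, 50, 6
rng = np.random.default_rng(0)
out = rng.random((rows, high)).argpartition(k, axis=1)[:, :k] + 1

# Count occurrences of 1..high; bin 0 is never used, so drop it.
counts = np.bincount(out.ravel(), minlength=high + 1)[1:]
expected = rows * k / high  # every value should be equally likely
max_rel_dev = np.abs(counts - expected).max() / expected
print(expected, max_rel_dev)
```

A small maximum relative deviation is consistent with uniform sampling, though it is a sanity check and not a formal statistical test.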
    

    Timings on one million rows of data -

    In [43]: rows = 1000000
    
    In [44]: %timeit np.random.rand(rows, 50).argpartition(6,axis=1)[:,:6]+1
    1 loop, best of 3: 1.07 s per loop
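
For comparison, the direct approach the question hints at is to call `choice` once per row with `replace=False`. The following is a minimal sketch under assumptions: the function name `sample_rows_loop` and its parameters are hypothetical, and it uses the `Generator.choice` method rather than the legacy `np.random.choice`. Because it runs a Python-level loop, it is expected to be slower than the vectorized version for large row counts; no timing is claimed here.

```python
import numpy as np

def sample_rows_loop(rows, high=50, k=6, seed=None):
    """Baseline: one choice() call per row, without replacement."""
    rng = np.random.default_rng(seed)
    values = np.arange(1, high + 1)
    # replace=False guarantees distinct values within each sampled row.
    return np.stack([rng.choice(values, size=k, replace=False)
                     for _ in range(rows)])

out = sample_rows_loop(5, seed=0)
print(out)
```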
    
