How to randomize (or permute) a dataframe rowwise and columnwise?

Happy的楠姐 2020-11-28 22:50

I have a dataframe (df1) like this.

     f1   f2   f3   f4   f5
d1   1    0    1    1    1  
d2   1    0    0    1    0
d3   0    0    0    1    1
d4   0             


        
8 Answers
  •  予麋鹿 (original poster)
    2020-11-28 23:28

    Take a look at permatswap() in the vegan package. Here is an example maintaining both row and column totals, but you can relax that and fix only one of the row or column sums.

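    library(vegan)  # provides permatswap()
    # 4 x 5 presence/absence matrix, filled column by column
    mat <- matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1), ncol = 5)
    set.seed(4)     # make the random draws reproducible
    out <- permatswap(mat, times = 99, burnin = 20000, thin = 500, mtype = "prab")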
    

    This gives:

    R> out$perm[[1]]
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    0    1    1    1
    [2,]    0    1    0    1    0
    [3,]    0    0    0    1    1
    [4,]    1    0    0    0    1
    R> out$perm[[2]]
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    1    0    1    1
    [2,]    0    0    0    1    1
    [3,]    1    0    0    1    0
    [4,]    0    0    1    0    1
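
    As a quick sanity check on the claim that both row and column totals are maintained, the following base R sketch compares the original matrix with the first permuted matrix, typed in by hand from the printed out$perm[[1]] above. The helper name same_margins is made up for this illustration and is not part of vegan.

```r
# Original matrix from the answer (filled column by column).
mat <- matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1), ncol = 5)

# First permuted matrix, copied by hand from the printed out$perm[[1]].
perm1 <- matrix(c(1,0,1,1,1,
                  0,1,0,1,0,
                  0,0,0,1,1,
                  1,0,0,0,1), ncol = 5, byrow = TRUE)

# Hypothetical helper: TRUE only if both the row totals and the
# column totals of the two matrices agree.
same_margins <- function(a, b) {
  all(rowSums(a) == rowSums(b)) && all(colSums(a) == colSums(b))
}

rowSums(mat)               # 4 2 2 2
colSums(mat)               # 2 1 1 3 3
same_margins(mat, perm1)   # TRUE
```

    With a real run you would pass out$perm[[1]] (or any other element of out$perm) instead of the hand-typed matrix.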
    

    To explain the call:

    out <- permatswap(mat, times = 99, burnin = 20000, thin = 500, mtype = "prab")
    
    1. times is the number of randomised matrices you want, here 99.
    2. burnin is the number of swaps made before we start taking random samples. This lets the matrix we sample from become well mixed before the first randomised matrix is drawn.
    3. thin is the sampling interval: a matrix is drawn only once every thin swaps.
    4. mtype = "prab" says to treat the matrix as presence/absence, i.e. binary 0/1 data.
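
    To make the roles of times, burnin and thin concrete, here is a rough base R sketch of a swap-based sampler as described in the list above. It is an illustration only and not vegan's implementation: which algorithm permatswap() actually runs depends on its method argument, which the call above leaves at its default. The function names try_swap and sketch_swap_sampler are invented for this example.

```r
# Illustrative sketch only: NOT vegan's implementation.
# One attempted swap: pick two rows and two columns at random. If the
# 2 x 2 submatrix is a checkerboard (1 0 / 0 1 or 0 1 / 1 0), flip it.
# Flipping a checkerboard leaves every row and column total unchanged.
try_swap <- function(m) {
  i <- sample(nrow(m), 2)
  j <- sample(ncol(m), 2)
  sub <- m[i, j]
  is_checkerboard <- sub[1, 1] == sub[2, 2] &&
    sub[1, 2] == sub[2, 1] &&
    sub[1, 1] != sub[1, 2]
  if (is_checkerboard) m[i, j] <- 1 - sub
  m
}

# Hypothetical helper mirroring the roles of times, burnin and thin.
sketch_swap_sampler <- function(m, times, burnin, thin) {
  for (k in seq_len(burnin)) m <- try_swap(m)   # burn-in: results discarded
  perms <- vector("list", times)
  for (t in seq_len(times)) {
    for (k in seq_len(thin)) m <- try_swap(m)   # thin attempted swaps per draw
    perms[[t]] <- m                             # keep one matrix per draw
  }
  perms
}

mat <- matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1), ncol = 5)
set.seed(4)
perms <- sketch_swap_sampler(mat, times = 5, burnin = 200, thin = 50)
```

    Every matrix in perms has the same row and column totals as mat, because the only change ever made is a checkerboard flip. The small burnin and thin values here are arbitrary and chosen just to keep the sketch fast.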

    A couple of things to note: this doesn't guarantee that any particular column or row has been randomised, but if burnin is long enough there is a good chance that it has. You could also draw more random matrices than you need and discard the ones that don't match all your requirements.

    Your requirement to have different numbers of changes per row isn't covered here either. Again, you could sample more matrices than you want and then discard the ones that don't meet this requirement.
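
    The draw-and-discard idea can be sketched in base R as follows. The two candidate matrices are typed in by hand from the printed out$perm[[1]] and out$perm[[2]]; with a real run you would use out$perm directly. The helper changes_per_row and the rule "every row must differ from the original" are assumptions made for illustration, so substitute whatever condition you actually need.

```r
# Original matrix from the answer (filled column by column).
mat <- matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1), ncol = 5)

# Candidates, copied by hand from the printed out$perm[[1]] and out$perm[[2]].
perms <- list(
  matrix(c(1,0,1,1,1,
           0,1,0,1,0,
           0,0,0,1,1,
           1,0,0,0,1), ncol = 5, byrow = TRUE),
  matrix(c(1,1,0,1,1,
           0,0,0,1,1,
           1,0,0,1,0,
           0,0,1,0,1), ncol = 5, byrow = TRUE)
)

# Hypothetical helper: number of cells in each row that differ
# from the corresponding row of the original matrix.
changes_per_row <- function(perm, original) rowSums(perm != original)

lapply(perms, changes_per_row, original = mat)
# first matrix:  0 2 0 2
# second matrix: 2 2 2 2

# Example rule (an assumption): keep only matrices in which every row changed.
keep <- Filter(function(p) all(changes_per_row(p, mat) > 0), perms)
length(keep)   # 1
```

    Only the second matrix survives this particular rule, since rows 1 and 3 of the first matrix are identical to the original.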
