Shuffle a data frame while maintaining order with another data frame

Submitted by 我怕爱的太早我们不能终老 on 2021-02-08 05:55:23

Question


I have two data frames, train and label. The data frame train has 784 rows and 20K columns. The data frame label has 1 row and 20K columns. The i-th column of label corresponds to the i-th column of train. train looks something like this:

---->--- 20K Columns ---->
  0  0  0  0  ...  3
  1  0  .  .  ...  .   
  4  0
  9  7
  .  .
  .  .
  .  .
  1  4

So for each column i, where i runs from 1 to 20K, there is a corresponding label in the label data frame, which looks something like this:

---->----20K columns----->
0 -1 3 4 5 8 0 -5 -9 1 2 ....

The first column in train corresponds to the first column in label, the second column in train corresponds to the second column in label, and so on.

Now, I want to shuffle the columns of the train data frame. But if I shuffle train on its own, its columns will no longer line up with label. Is there a way to shuffle train while keeping it aligned with label?


Answer 1:


Shuffle an ordering vector, and use that to order both objects.

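# Draw a random permutation of the column indices 1..ncol(label)
shuffle <- sample(ncol(label))
# Apply the same permutation to both objects so column i still pairs with label i
label <- label[,shuffle]
train <- train[,shuffle]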
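
To see that the pairing survives the shuffle, here is a minimal sketch on toy data. The sizes (3 rows, 5 columns), the values, and the seed are assumptions made for illustration, not the asker's real data. Each toy column j holds the value j, and its label is 10 * j, so the pairing is easy to check afterwards. `drop = FALSE` is added only as a precaution so the result stays a data frame even if a single column were selected.

```r
# Toy stand-ins for the real data (hypothetical: 3 rows, 5 columns)
set.seed(42)  # only to make the shuffle reproducible
train <- as.data.frame(matrix(rep(1:5, each = 3), nrow = 3))  # column j is filled with j
label <- as.data.frame(matrix(10 * (1:5), nrow = 1))          # label for column j is 10 * j

# One permutation, applied to both objects
shuffle <- sample(ncol(label))
train <- train[, shuffle, drop = FALSE]
label <- label[, shuffle, drop = FALSE]

# Column names travel with the columns, so both objects should now
# list the same names in the same shuffled order
same_names <- identical(names(train), names(label))

# Each label should still be 10 times the value stored in its column
still_paired <- all(unlist(label[1, ]) == 10 * unlist(train[1, ]))
```

Both `same_names` and `still_paired` should come out `TRUE` for any permutation, because the same index vector reorders both data frames.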

An example with mtcars:

#create the label data frame
label <- data.frame(as.list(names(mtcars)), stringsAsFactors = FALSE)
label
#    X.mpg. X.cyl. X.disp. X.hp. X.drat. X.wt. X.qsec. X.vs. X.am. X.gear. X.carb.
# 1    mpg    cyl    disp    hp    drat    wt    qsec    vs    am    gear    carb

shuffle <- sample(ncol(label))
mtcars <- mtcars[,shuffle]
label <- label[,shuffle]
label
#    X.carb. X.wt. X.hp. X.cyl. X.mpg. X.gear. X.vs. X.am. X.drat. X.disp. X.qsec.
# 1    carb    wt    hp    cyl    mpg    gear    vs    am    drat    disp    qsec

head(mtcars)
#                   carb    wt  hp cyl  mpg gear vs am drat disp  qsec
# Mazda RX4            4 2.620 110   6 21.0    4  0  1 3.90  160 16.46
# Mazda RX4 Wag        4 2.875 110   6 21.0    4  0  1 3.90  160 17.02
# Datsun 710           1 2.320  93   4 22.8    4  1  1 3.85  108 18.61
# Hornet 4 Drive       1 3.215 110   6 21.4    3  1  0 3.08  258 19.44
# Hornet Sportabout    2 3.440 175   8 18.7    3  0  0 3.15  360 17.02
# Valiant              1 3.460 105   6 18.1    3  1  0 2.76  225 20.22

A more direct approach would be to rbind the two data frames, but I assumed you have them as separate objects for a reason.
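
A rough sketch of that rbind alternative, again on hypothetical toy data (the sizes, values, and the names `combined`, `train_shuffled`, and `label_shuffled` are assumptions for illustration). Note that rbind on data frames matches columns by name, so both objects need the same column names, and that stacking values of different types in one column coerces them to a common type. It would, for instance, turn the numeric mtcars columns into character if combined with the character labels from the example above.

```r
# Toy stand-ins for the real data (hypothetical: 3 rows, 5 columns)
set.seed(1)  # only to make the shuffle reproducible
train <- as.data.frame(matrix(rep(1:5, each = 3), nrow = 3))  # column j is filled with j
label <- as.data.frame(matrix(10 * (1:5), nrow = 1))          # label for column j is 10 * j

# Stack the label row underneath train; columns are matched by name (V1..V5)
combined <- rbind(train, label)

# Shuffle the columns once; each label moves together with its column
combined <- combined[, sample(ncol(combined)), drop = FALSE]

# Split back into the two objects: the last row holds the labels
n <- nrow(combined)
train_shuffled <- combined[-n, , drop = FALSE]
label_shuffled <- combined[n, , drop = FALSE]

# Each label should still be 10 times the value stored in its column
still_paired <- all(unlist(label_shuffled) == 10 * unlist(train_shuffled[1, ]))
```

This keeps everything in one object while shuffling, at the cost of the type coercion mentioned above, which is one reason to prefer the shared index vector when the labels are not the same type as the training values.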



Source: https://stackoverflow.com/questions/49584310/shuffle-a-data-frame-while-maintaining-order-with-another-data-frame
