How to get the remaining data frame in R after randomly subsetting the data

Question

I took a random sample from a data frame. But I don't know how to get the remaining data frame.

df <- data.frame(x=rep(1:3,each=2),y=6:1,z=letters[1:6])

#select 3 random rows
df[sample(nrow(df),3)]

What I want is to get the remaining data frame with the other 3 rows.

Answer 1:

sample draws a new random sample each time you run it, so if you want to reproduce its results you will need to either call set.seed beforehand or save the result in a variable.

To address your question: put a - before your index to get the rest of the data set. Also, don't forget the comma after the index if you want to select rows (unlike in your question); without the comma, df[indx] selects columns instead.

set.seed(1)
indx <- sample(nrow(df), 3)

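Your subset (the exact rows drawn can differ between R versions, because the default sampling algorithm changed in R 3.6.0)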

df[indx, ] 
#   x y z
# 2 1 5 b
# 6 3 1 f
# 3 2 4 c

Remaining data set

df[-indx, ]
#   x y z
# 1 1 6 a
# 4 2 3 d
# 5 3 2 e
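As a minimal sketch of the same idea, the index can be kept inside a small helper that returns both parts. The name split_rows is hypothetical, not a base R function, and setdiff is swapped in for the negative index here because a negative index silently returns zero rows when the index happens to be empty.

```r
# Example data from the question
df <- data.frame(x = rep(1:3, each = 2), y = 6:1, z = letters[1:6])

# split_rows() is a hypothetical helper for illustration, not part of base R
split_rows <- function(data, n) {
  indx <- sample(nrow(data), n)                # row numbers of the random subset
  keep <- setdiff(seq_len(nrow(data)), indx)   # every row number not in indx
  list(sampled   = data[indx, , drop = FALSE],
       remaining = data[keep, , drop = FALSE])
}

set.seed(1)                # fix the RNG state so the draw is repeatable
parts <- split_rows(df, 3)
nrow(parts$sampled)        # number of sampled rows
nrow(parts$remaining)      # number of rows left over

# Edge case: with an empty index, negative indexing drops everything
empty <- integer(0)
nrow(df[-empty, ])                             # 0 rows, not the full data
nrow(df[setdiff(seq_len(nrow(df)), empty), ])  # all 6 rows
```

For a fixed, non-empty sample such as the one in the question, df[-indx, ] and the setdiff form give the same rows; the difference only matters when the index can be empty.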

Answer 2:

Try:

> df
  x y z
1 1 6 a
2 1 5 b
3 2 4 c
4 2 3 d
5 3 2 e
6 3 1 f
> 
> df2 = df[sample(nrow(df),3),]
> df2
  x y z
5 3 2 e
3 2 4 c
1 1 6 a

> df[!rownames(df) %in% rownames(df2),]
  x y z
2 1 5 b
4 2 3 d
6 3 1 f
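The row-name approach above can also be written as a script. The sketch below assumes the row names are unique, and it sets non-numeric row names (a hypothetical choice, made only to show that matching is done on names rather than on row positions).

```r
# Example data from the question, with hypothetical non-numeric row names
df <- data.frame(x = rep(1:3, each = 2), y = 6:1, z = letters[1:6])
rownames(df) <- paste0("r", 1:6)

set.seed(42)                                    # arbitrary seed for repeatability
df2  <- df[sample(nrow(df), 3), ]               # random subset of 3 rows
rest <- df[!rownames(df) %in% rownames(df2), ]  # rows whose names are not in df2

# The two parts together cover every row exactly once
stopifnot(nrow(df2) + nrow(rest) == nrow(df))
stopifnot(!any(rownames(rest) %in% rownames(df2)))
```

Because it does not need the sampled index, this form is useful when only the subset itself was saved.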

Source: https://stackoverflow.com/questions/26881622/how-to-write-the-remaining-data-frame-in-r-after-randomly-subseting-the-data

Tags

subset

sampling
