How to split data into training/testing sets using sample function

猫巷女王i 2020-11-22 10:43

I've just started using R and I'm not sure how to incorporate my dataset with the following sample code:

sample(x, size, replace = FALSE, prob = NULL)
         


        
24 Answers
  •  独厮守ぢ
    2020-11-22 10:54

    There is a very simple way to select a number of rows using R's row and column indexing. It lets you cleanly split the data set at a given row, for example taking the first 80% of your data.

    In R, all rows and columns are indexed, so DataSetName[1,1] is the value in the first row and first column of "DataSetName". I can select rows using [x,] and columns using [,x].

    For example, if I have a data set conveniently named "data" with 100 rows, I can view the first 80 rows using:

    View(data[1:80,])

    In the same way I can select these rows and subset them using:

    train = data[1:80,]

    test = data[81:100,]
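
    The hard-coded 80 and 100 only work for a data set with exactly 100 rows. A minimal sketch of the same positional split with the cutoff computed from nrow(); the built-in iris data set and the 0.8 fraction are illustrative assumptions, not part of the example above:

```r
# Illustrative data: the built-in iris data set (150 rows) stands in for "data".
data <- iris

n <- nrow(data)             # total number of rows
cutoff <- floor(0.8 * n)    # last row of the training part (80% here)

train <- data[seq_len(cutoff), ]   # rows 1..cutoff
test  <- data[(cutoff + 1):n, ]    # rows cutoff+1..n
```

    With 150 rows this puts the first 120 rows in train and the remaining 30 in test.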

    Now my data is split into two parts, and no row can end up in both. Quick and easy.
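
    One caveat: a split by position is only as random as the row order. If the rows are sorted (by date or by class, say), the sample function from the question can draw the training rows at random instead. A rough sketch, assuming the built-in iris data set as a stand-in for "data", an 80/20 split, and an arbitrary seed; the name train_idx is made up for illustration:

```r
# Illustrative data: the built-in iris data set (150 rows) stands in for "data".
data <- iris

set.seed(42)  # arbitrary seed, only so the split is reproducible

n <- nrow(data)
# Draw 80% of the row numbers at random, without replacement.
train_idx <- sample(seq_len(n), size = floor(0.8 * n), replace = FALSE)

train <- data[train_idx, ]    # the sampled rows
test  <- data[-train_idx, ]   # every row that was not sampled
```

    Because replace = FALSE, each row number is drawn at most once, and the negative index puts every remaining row in test, so the two parts do not overlap.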
