I've just started using R and I'm not sure how to incorporate my dataset with the following sample code:
sample(x, size, replace = FALSE, prob = NULL)
There is a very simple way to select a number of rows using R's row and column indexing. This lets you split the data set cleanly at a given row count, say the first 80% of your data.
In R all rows and columns are indexed, so DataSetName[1,1] is the value in the first row and first column of "DataSetName". I can select rows using [x,] and columns using [,x].
For example, if I have a data set conveniently named "data" with 100 rows, I can view the first 80 rows using:
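To make the indexing concrete, here is a minimal sketch. The data frame "df" and its columns are made up for illustration and are not part of the original question.

```r
# Hypothetical toy data frame, for illustration only
df <- data.frame(id = 1:5, score = c(10, 20, 30, 40, 50))

df[1, 1]    # single value: row 1, column 1
df[2, ]     # the whole second row (a one-row data frame)
df[, 2]     # the whole second column (returned as a numeric vector)
df[1:3, ]   # rows 1 through 3, all columns
```

Leaving one side of the comma empty means "everything" along that dimension.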
View(data[1:80,])
In the same way I can select these rows and subset them using:
train = data[1:80,]
test = data[81:100,]
Now I have my data split into two parts, and no row can end up in both. Quick and easy.
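If the number of rows isn't known in advance, the cut point can be computed with nrow() instead of being hard-coded. The split above also takes rows in their stored order, so if the data is sorted you may want the sample() function from the question to pick rows at random. The following is only a sketch: it uses R's built-in mtcars data frame as a stand-in for "data", and the names n_train, idx, train_random and test_random are my own.

```r
# Stand-in data: the built-in mtcars data frame (32 rows) plays the role of "data"
data <- mtcars

n <- nrow(data)
n_train <- floor(0.8 * n)      # number of rows for the training part

# Sequential split: the first 80% of rows, then the remainder
train <- data[1:n_train, ]
test  <- data[(n_train + 1):n, ]

# Random split: sample() draws row numbers without replacement
set.seed(42)                   # arbitrary seed, only so the result is repeatable
idx <- sample(n, size = n_train, replace = FALSE)
train_random <- data[idx, ]
test_random  <- data[-idx, ]   # a negative index drops the sampled rows
```

Because replace = FALSE, each row number is drawn at most once, so the two random parts still never share a row.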