I\'ve just started using R and I\'m not sure how to incorporate my dataset with the following sample code:
sample(x, size, replace = FALSE, prob = NULL)
After looking through all the different methods posted here, I didn't see anyone utilize TRUE/FALSE to select and unselect data. So I thought I would share a method utilizing that technique.
n = nrow(dataset)
split = sample(c(TRUE, FALSE), n, replace=TRUE, prob=c(0.75, 0.25))
training = dataset[split, ]
testing = dataset[!split, ]
There are multiple ways of selecting data from R, most commonly people use positive/negative indices to select/unselect respectively. However, the same functionalities can be achieved by using TRUE/FALSE to select/unselect.
Consider the following example.
# let's explore ways to select every other element
data = c(1, 2, 3, 4, 5)
# using positive indices to select wanted elements
data[c(1, 3, 5)]
[1] 1 3 5
# using negative indices to remove unwanted elements
data[c(-2, -4)]
[1] 1 3 5
# using booleans to select wanted elements
data[c(TRUE, FALSE, TRUE, FALSE, TRUE)]
[1] 1 3 5
# R recycles the TRUE/FALSE vector if it is not the correct dimension
data[c(TRUE, FALSE)]
[1] 1 3 5