How to adapt data split sizes with createDataPartition()

Posted by 给你一囗甜甜゛ on 2019-12-11 04:49:08

Question


I have a question about splitting data into train, test and validation sets with createDataPartition(). I found a solution that works perfectly for a 60/20/20 split. However, I don't see how to adapt it to other proportions while still ensuring that the three sets do not overlap. For example, I would like to split 80/10/10, or any other ratio.

    library("caret")
    # Draw a random, stratified sample containing a fraction p of the data
    idx.train <- createDataPartition(y = iris$Species, p = 0.8, list = FALSE)
    # Training set: 80% of the data (p = 0.8)
    train <- iris[idx.train, ]
    # Test set: the remaining 20% (drop all observations with train indices)
    test <- iris[-idx.train, ]
    # Draw a random, stratified sample containing a fraction p of the training set
    idx.validation <- createDataPartition(y = train$Species, p = 0.25, list = FALSE)
    # Validation set: 0.8 * 0.25 = 0.2 of the full data
    validation <- train[idx.validation, ]
    # Final training set: 0.8 * 0.75 = 0.6 of the full data
    train60 <- train[-idx.validation, ]
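
One possible way to adapt this to other ratios, sketched here for an 80/10/10 split and not taken from the original post: draw the training rows first, then split only the remaining rows, rescaling p to the size of that remainder. The names p.train, p.validation, p.test, p.rest and rest are hypothetical, and the seed is only an assumption added for reproducibility. Because each step indexes with idx and -idx on the same data frame, the three sets cannot share rows.

```r
library("caret")

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Hypothetical target fractions; they must sum to 1
p.train      <- 0.8
p.validation <- 0.1
p.test       <- 0.1

# Step 1: stratified draw of the training rows from the full data
idx.train <- createDataPartition(y = iris$Species, p = p.train, list = FALSE)
train <- iris[idx.train, ]
rest  <- iris[-idx.train, ]   # everything that is not in train

# Step 2: split the remainder into validation and test.
# p is relative to the remainder, not the full data: 0.1 / (0.1 + 0.1) = 0.5
p.rest <- p.validation / (p.validation + p.test)
idx.validation <- createDataPartition(y = rest$Species, p = p.rest, list = FALSE)
validation <- rest[idx.validation, ]
test       <- rest[-idx.validation, ]

# Inspect the resulting set sizes
c(train = nrow(train), validation = nrow(validation), test = nrow(test))
```

Changing the three fractions at the top should give other splits (for example 0.7, 0.15, 0.15) without touching the rest of the code. Since the sampling is stratified per class, the realized proportions can deviate slightly from the targets on small data sets.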

Source: https://stackoverflow.com/questions/41880453/how-to-adapt-datasplit-sizes-with-createdatapartition
