How to split data into training/testing sets using sample function

猫巷女王i 2020-11-22 10:43

I've just started using R and I'm not sure how to apply the following sample code to my dataset:

sample(x, size, replace = FALSE, prob = NULL)
         


        
24 answers
  •  悲&欢浪女
    2020-11-22 10:49

    Below is a function that creates a list of equal-sized sub-samples (consecutive blocks of rows). It is not exactly what you asked for, but it might prove useful to others. In my case, I used it to build multiple classification trees on smaller samples to test for overfitting:

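    # Split a data frame (or matrix) into `number` consecutive chunks of equal size.
    # Rows left over when nrow(df) is not a multiple of `number` are dropped.
    df_split <- function(df, number) {
      sizedf <- nrow(df)                  # total number of rows
      stopifnot(number >= 1, number <= sizedf)
      bound  <- floor(sizedf / number)    # rows per chunk
      chunks <- vector("list", number)    # preallocate; avoids shadowing base::list
      for (i in seq_len(number)) {
        chunks[[i]] <- df[((i - 1) * bound + 1):(i * bound), ]
      }
      chunks
    }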
    

    Example:

    x <- matrix(c(1:10), ncol=1)
    x
    #       [,1]
    #  [1,]    1
    #  [2,]    2
    #  [3,]    3
    #  [4,]    4
    #  [5,]    5
    #  [6,]    6
    #  [7,]    7
    #  [8,]    8
    #  [9,]    9
    # [10,]   10
    
    x.split <- df_split(x,5)
    x.split
    # [[1]]
    # [1] 1 2
    
    # [[2]]
    # [1] 3 4
    
    # [[3]]
    # [1] 5 6
    
    # [[4]]
    # [1] 7 8
    
    # [[5]]
    # [1] 9 10
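
    For the training/testing split the question asks about, here is a minimal sketch that uses sample() directly. It assumes an 80/20 split and uses the built-in iris data frame as a stand-in for your own dataset; the names train_idx, train and test are only illustrative.

    ```r
    # Stand-in data: iris ships with base R (150 rows). Replace with your own data frame.
    df <- iris

    set.seed(42)                                   # makes the random split reproducible
    n         <- nrow(df)
    train_idx <- sample(n, size = floor(0.8 * n))  # row numbers drawn without replacement

    train <- df[train_idx, , drop = FALSE]         # the sampled 80% of rows
    test  <- df[-train_idx, , drop = FALSE]        # every row not in the training set

    nrow(train)  # 120
    nrow(test)   # 30
    ```

    Because sample() draws without replacement by default, no row ends up in both sets. Changing 0.8 changes the proportion, and drop = FALSE keeps the result a data frame even if your data has a single column.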
    
