Directly creating dummy variable set in a sparse matrix in R

清酒与你 2020-11-29 10:30

Suppose you have a data frame with a large number of columns (1000 factors, each with 15 levels). You'd like to create a dummy variable data set, but since it would be too sp

2 answers
  •  庸人自扰
    2020-11-29 11:08

    This can be done slightly more compactly with Matrix::sparse.model.matrix, although the requirement to have a column for every level of every variable makes things a little more difficult.

    Generate input:

    set.seed(123)
    n <- 6
    # stringsAsFactors = TRUE keeps x and y as factors on R >= 4.0
    df <- data.frame(x = sample(c("A", "B", "C"), n, TRUE),
                     y = sample(c("D", "E"),      n, TRUE),
                     stringsAsFactors = TRUE)
    

    If you don't need a column for every level of every variable (only the first factor keeps all of its levels; the others drop their first level), you can just do:

    library(Matrix)
    sparse.model.matrix(~ . - 1, data = df)
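
    To make the limitation of the single-formula call concrete, here is a minimal sketch on a hypothetical toy data frame (the name `toy` and its values are made up for illustration and are not from the question). It compares the number of columns produced with the total number of factor levels.

```r
library(Matrix)

# Hypothetical toy data: two factors with 3 and 2 levels
toy <- data.frame(
  x = factor(c("A", "B", "C", "A")),
  y = factor(c("D", "E", "E", "D"))
)

# Single intercept-free formula over all columns
m <- sparse.model.matrix(~ . - 1, data = toy)

# Only the first factor keeps all of its levels; later factors are
# contrast-coded, so y contributes one column instead of two
print(colnames(m))
print(ncol(m))                     # 4 columns
print(sum(sapply(toy, nlevels)))   # 5 levels in total
```

    The missing column is the first level of `y`, which is why a full dummy set needs the per-variable approach.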
    

    If you need all columns:

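    # One intercept-free formula per column (e.g. ~ x - 1) keeps every level
    fList <- lapply(names(df), reformulate, intercept = FALSE)
    mList <- lapply(fList, sparse.model.matrix, data = df)
    # cBind() is deprecated in newer Matrix versions; cbind() works on sparse matrices since R 3.2.0
    do.call(cbind, mList)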
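
    As an alternative, the same full dummy set can be assembled from row and column indices with Matrix::sparseMatrix, which skips the model-matrix machinery entirely. This is a different technique from the one in this answer, shown only as a sketch. It assumes every column is a factor with no NA values, and the names `toy`, `nlev`, `off`, and `dummies` are hypothetical.

```r
library(Matrix)

# Hypothetical toy data with explicit factor levels (assumes no NAs)
toy <- data.frame(
  x = factor(c("A", "B", "C", "A"), levels = c("A", "B", "C")),
  y = factor(c("D", "E", "E", "D"), levels = c("D", "E"))
)

n    <- nrow(toy)
nlev <- vapply(toy, nlevels, integer(1))              # levels per factor
off  <- unname(cumsum(c(0L, nlev[-length(nlev)])))    # column offset of each factor's block

# Each row gets one nonzero per factor, at column (offset + level code)
i <- rep(seq_len(n), times = ncol(toy))
j <- unlist(Map(function(f, o) as.integer(f) + o, toy, off), use.names = FALSE)

# Column names follow the model-matrix convention: variable name + level
cn <- unlist(Map(function(nm, f) paste0(nm, levels(f)), names(toy), toy),
             use.names = FALSE)

dummies <- sparseMatrix(i = i, j = j, x = rep(1, length(i)),
                        dims = c(n, sum(nlev)),
                        dimnames = list(NULL, cn))
print(dummies)
```

    Because only the index vectors are materialised, memory use grows with the number of rows times the number of factors rather than with the number of dummy columns.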
    
