dcast for huge dataframe [R]

Submitted by 断了今生、忘了曾经 on 2019-12-08 10:50:21

Question


Assume a data frame like this:

    pnr <- c(1, 1, 1, 2, 2, 3, 4, 5, 5)
    diag <- c("a", "a", NA, "b", "a", NA, "c", "a", "f")
    year <- rep(2007, 9)
    ht <- data.frame(pnr, diag, year)

Now I need to reshape it like this:

    require(reshape2)
    md <- melt(ht, id = c("pnr", "year"))
    output <- dcast(md, pnr ~ value)

The output is now in the format I want. But when I run this on a large data frame (13 million rows), it crashes RStudio. Is there some smart way to split the data frame, run dcast on each piece, and stitch the results back together?

EDIT: The solutions posted below will not work in this case, since I am not able to install them. Surely there is some way to work around this?


Answer 1:


The easy solution in this case turned out to be switching back to the old reshape package, which means using cast instead of dcast. Arun's comments are very useful, provided one can actually update.
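For anyone who cannot install or update packages at all, the split-and-recombine idea from the question can also be sketched in base R. This is not the approach described in this answer, just a minimal illustration on the toy data: it counts diag values per pnr with table(), one chunk of pnr values at a time, and binds the pieces together. The names chunk_size, diag_chr, groups and pieces are made up for this sketch, and the literal label "NA" for missing values is an assumption chosen to mimic the column dcast produces (it would collide with a real diag code spelled "NA").

```r
# Toy data from the question.
pnr  <- c(1, 1, 1, 2, 2, 3, 4, 5, 5)
diag <- c("a", "a", NA, "b", "a", NA, "c", "a", "f")
year <- rep(2007, 9)
ht   <- data.frame(pnr, diag, year)

# Give missing diag values an explicit label so they get their own column.
# Assumption: no real diag code is literally "NA".
diag_chr <- as.character(ht$diag)
diag_chr[is.na(diag_chr)] <- "NA"

# Fix the full set of levels up front so every chunk yields the same columns.
all_levels <- sort(unique(diag_chr))

# Hypothetical tuning knob: how many distinct pnr values go into one chunk.
chunk_size <- 2
ids    <- sort(unique(ht$pnr))
groups <- split(ids, ceiling(seq_along(ids) / chunk_size))

# Count diag values per pnr within each chunk.
pieces <- lapply(groups, function(g) {
  keep <- ht$pnr %in% g
  tab  <- table(ht$pnr[keep], factor(diag_chr[keep], levels = all_levels))
  data.frame(pnr = as.numeric(rownames(tab)),
             as.data.frame.matrix(tab),
             check.names = FALSE)
})

# Tie the chunks back together into one wide data frame.
output <- do.call(rbind, pieces)
rownames(output) <- NULL
print(output)
```

Because a given pnr only ever lands in one chunk, the pieces can simply be stacked with rbind. On real data chunk_size would be much larger; how large is a matter of available memory and would need experimenting.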



Source: https://stackoverflow.com/questions/28986836/dcast-for-huge-dataframe-r
