R SparkR - equivalent to melt function

两盒软妹~` submitted on 2019-12-22 00:13:16

Question


Is there a function similar to melt in SparkR library?

For example, how can I transform data with 1 row and 50 columns into 50 rows and 3 columns?


Answer 1:


SparkR has no built-in function that provides similar functionality. You can build your own with explode:

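library(SparkR)    # needed outside the sparkR shell; an active Spark session is assumed
library(magrittr)  # provides %>%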

df <- createDataFrame(data.frame(
  A = c('a', 'b', 'c'),
  B = c(1, 3, 5),
  C = c(2, 4, 6)
))

melt <- function(df, id.vars, measure.vars, 
                 variable.name = "key", value.name = "value") {

   measure.vars.exploded <- purrr::map(
       measure.vars, function(c) list(lit(c), column(c))) %>% 
     purrr::flatten() %>% 
     (function(x) do.call(create_map, x)) %>% 
     explode()
   id.vars <- id.vars %>% purrr::map(column)

   do.call(select, c(df, id.vars, measure.vars.exploded)) %>%
     withColumnRenamed("key", variable.name) %>%
     withColumnRenamed("value", value.name)
}

melt(df, c("A"), c("B", "C")) %>% head()
  A key value
1 a   B     1
2 a   C     2
3 b   B     3
4 b   C     4
5 c   B     5
6 c   C     6
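To see what the helper assembles, the same reshaping can be written out by hand for the two measure columns. This is only a sketch: it assumes a local Spark installation that sparkR.session() can start, and that B and C share a type (both are doubles here), since they end up in a single value column.

```r
library(SparkR)

sparkR.session()  # assumption: a local Spark installation is available

df <- createDataFrame(data.frame(
  A = c("a", "b", "c"),
  B = c(1, 3, 5),
  C = c(2, 4, 6)
))

# Build a map {"B" -> B, "C" -> C} for every row, then explode it so that
# each map entry becomes its own row with columns named `key` and `value`.
kv <- explode(create_map(lit("B"), column("B"), lit("C"), column("C")))

# Keep the id column next to the exploded key/value pair.
long <- select(df, column("A"), kv)
printSchema(long)
```

The melt function above generalizes this by generating the lit/column pairs from measure.vars and renaming key and value afterwards.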

Alternatively, use a SQL expression with the stack function (a Hive table-generating UDF):

stack <- function(df, id.vars, measure.vars, 
                  variable.name = "key", value.name = "value") { 
  measure.vars.exploded <- glue::glue('"{measure.vars}", `{measure.vars}`') %>%  
    glue::glue_collapse(" , ") %>%
    (function(x) glue::glue(
      "stack({length(measure.vars)}, {x}) as ({variable.name}, {value.name})"
    )) %>%
    as.character()
  do.call(selectExpr, c(df, id.vars, measure.vars.exploded))
}

stack(df, c("A"), c("B", "C")) %>% head()
  A key value
1 a   B     1
2 a   C     2
3 b   B     3
4 b   C     4
5 c   B     5
6 c   C     6
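For the wide case in the question (one row, many columns), the measure columns can be derived from the schema instead of being typed out. The sketch below builds the stack expression with base R only. The DataFrame wide and its column names id, x1, x2, x3 are hypothetical stand-ins for the asker's data, and a local Spark installation is assumed.

```r
library(SparkR)

sparkR.session()  # assumption: a local Spark installation is available

# Hypothetical wide input: one row, one id column and three measure columns
wide <- createDataFrame(data.frame(id = "r1", x1 = 1.5, x2 = 2.5, x3 = 3.5))

id_vars <- "id"
measure_vars <- setdiff(columns(wide), id_vars)  # every non-id column

# Pair each column name (as a string literal) with the column itself,
# e.g. 'x1', `x1`, 'x2', `x2`, ...
pairs <- paste0("'", measure_vars, "', `", measure_vars, "`", collapse = ", ")
expr <- sprintf("stack(%d, %s) as (key, value)", length(measure_vars), pairs)

long <- selectExpr(wide, id_vars, expr)
head(long)
```

With 50 measure columns the same code would produce 50 rows and 3 columns (id, key, value), provided the measure columns have a common type.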

Related questions:

  • Gather in sparklyr
  • How to melt Spark DataFrame?


Source: https://stackoverflow.com/questions/52782554/r-sparkr-equivalent-to-melt-function
