SparklyR separate one Spark DataFrame column into two columns

悲&欢浪女 2020-12-21 11:44

I have a dataframe containing a column named COL which is structured in this way:

VALUE1###VALUE2

The followin

2 Answers
  •  别那么骄傲
    2020-12-21 11:50

    You can use ft_regex_tokenizer followed by sdf_separate_column.

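    ft_regex_tokenizer splits a string column into an array column, using a regex as the delimiter. sdf_separate_column then spreads the elements of that array into separate columns. Note that ft_regex_tokenizer lowercases the tokens by default (to_lower_case = TRUE).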

    mydf %>% 
        # split "mycolumn" on ";" into an array column
        ft_regex_tokenizer(input_col="mycolumn", output_col="mycolumnSplit", pattern=";") %>% 
        # spread the array elements into two named columns
        sdf_separate_column("mycolumnSplit", into=c("column1", "column2"))
    

    UPDATE: in recent versions of sparklyr, the parameters input.col and output.col have been renamed to input_col and output_col, respectively.
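
    Applied to the COL column from the question, the same approach might look like the sketch below. It assumes a local Spark installation and a recent sparklyr version (input_col / output_col spelling). The sample rows, the table name "col_demo", and the output column names are made up for illustration. to_lower_case = FALSE is set so the original casing of the values is kept.

    ```r
    library(sparklyr)
    library(dplyr)

    # Assumption: a local Spark installation is available
    sc <- spark_connect(master = "local")

    # Hypothetical sample data in the VALUE1###VALUE2 layout from the question
    local_df <- data.frame(COL = c("A1###B1", "A2###B2"), stringsAsFactors = FALSE)
    df <- copy_to(sc, local_df, "col_demo", overwrite = TRUE)

    result <- df %>%
        # split COL on the literal "###" separator, keeping the original casing
        ft_regex_tokenizer(input_col = "COL", output_col = "COL_split",
                           pattern = "###", to_lower_case = FALSE) %>%
        # spread the two array elements into their own columns
        sdf_separate_column("COL_split", into = c("VALUE1", "VALUE2")) %>%
        # drop the intermediate array column
        select(-COL_split)

    result

    spark_disconnect(sc)
    ```

    The pattern argument is a regular expression, so a separator that contains regex metacharacters (for example "|" or ".") would need to be escaped; "###" has none.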
