R: How can I extract an element from a column of data in spark connection (sparklyr) in pipe

心已入冬 posted on 2019-12-02 03:04:18

Although this isn't the most elegant string of code, it should get the job done. Since no sample dataset is provided other than a screenshot, I just created a sample with the important elements you were interested in.

library(dplyr)  # provides tibble(), mutate(), select() and %>%

csj <- tibble(helpful = rep(c("[0,0]", "[0,1]", "[0,2]", "[1,3]"), 100),
              overall = rep(c(5, 4, 3, 2), 100))
# parse the two numbers out of helpful and create the help column
csj %>%
  mutate(col1 = as.numeric(stringi::stri_extract_first_regex(helpful, pattern = "[0-9]+")), # first number ("+" keeps multi-digit values whole)
         col2 = as.numeric(stringi::stri_extract_last_regex(helpful, pattern = "[0-9]+")),  # second number
         col3 = ifelse(col2 == 0, 1, col2), # replace a 0 denominator with 1
         help = col1 / col3) %>%            # divide col1 by col3
  select(helpful, help)                     # select the columns you wish to keep

This should work as long as you adapt the functions to your dataset. Also note that helpful is a character column in your dataset, which is why the extracted values have to be converted to numeric.
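As a quick sanity check of the parsing logic without any packages, here is a minimal base R sketch. The helper name parse_helpful is hypothetical (it is not part of any library), and the sketch assumes the input is a character vector without missing values; strings that contain no digits come back as NA.

```r
# Hypothetical helper: turn strings like "[1,3]" into the ratio first / last
parse_helpful <- function(x) {
  # collect every run of digits found in each string
  nums <- regmatches(x, gregexpr("[0-9]+", x))
  first <- vapply(nums, function(n) if (length(n) >= 1) as.numeric(n[1]) else NA_real_, numeric(1))
  last  <- vapply(nums, function(n) if (length(n) >= 1) as.numeric(n[length(n)]) else NA_real_, numeric(1))
  denom <- ifelse(last == 0, 1, last)  # replace a 0 denominator with 1
  first / denom
}

parse_helpful(c("[0,0]", "[0,1]", "[1,3]", "[12,40]"))
```

The last input shows why the pattern matches one or more digits: matching a single digit would split 12 and 40 apart.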

EDIT: I looked into sparklyr and realized why the code above isn't working there, so I created an example to test it myself. Although I didn't replicate your data completely, I came up with enough to hopefully provide a working solution.

library(sparklyr)
library(dplyr)
library(ggplot2)
library(magrittr) 
sc <- spark_connect(master="local")
#create dataframe
cjs <- tibble(helpful = rep(c("[0,  0]","[0, 1]","[0, 2]","[1, 3]","[,1]",NA,"a"),100),
              overall = rep(c(6,5,4,3,2,1,0),100))

#transfer the local tibble cjs to Spark
csj <- copy_to(sc, cjs, "cjs")

#this should do the trick
csj %>% 
  mutate(newcol2 = regexp_replace(helpful, "[^0-9,]", " "), 
         newcol3 = as.numeric(substring_index(newcol2, ",", 1)),
         newcol4 = as.numeric(substring_index(newcol2,",",-1)),
         newcol5 = ifelse(newcol4 == 0, 1, newcol4),
         help = newcol3/newcol5) %>% 
  select(starts_with("new"),help) #select the columns you need with help calculated appropriately
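
If you want to see what the intermediate columns look like without a Spark connection, the following is a rough base R approximation of the same steps. It is only a sketch: gsub() and sub() stand in for Spark SQL's regexp_replace and substring_index, and the helper name spark_like_help is made up for illustration. Spark's own casting rules may differ on edge cases such as empty strings or NA, which this sketch does not cover.

```r
# Hypothetical helper: mimics the Spark pipeline locally with base R
spark_like_help <- function(helpful) {
  cleaned <- gsub("[^0-9,]", " ", helpful)        # blank out everything except digits and commas
  first   <- as.numeric(sub(",.*", "", cleaned))  # text before the first comma
  last    <- as.numeric(sub(".*,", "", cleaned))  # text after the last comma
  denom   <- ifelse(last == 0, 1, last)           # replace a 0 denominator with 1
  data.frame(helpful, first, last, help = first / denom)
}

spark_like_help(c("[0,  0]", "[0, 1]", "[1, 3]"))
```

as.numeric() ignores the leading and trailing spaces left behind by the replacement, so the padded pieces still convert cleanly.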