mutate

use dplyr mutate() in programming

冷暖自知, submitted on 2019-12-03 07:57:01
Question: I am trying to assign a column name to a variable using mutate.

    df <- data.frame(x = sample(1:100, 50), y = rnorm(50))
    new <- function(name) {
      df %>% mutate(name = ifelse(x < 50, "small", "big"))
    }

When I run new(name = "newVar") it doesn't work. I know mutate_() could help, but I'm struggling to use it together with ifelse. Any help would be appreciated.

Answer: Using dplyr 0.7.1 and its advances in NSE, you have to UQ the argument to mutate and then use := when assigning. There is lots of info on programming with dplyr and NSE here: https://cran.r-project.org/web/packages/dplyr/vignettes/programming
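The answer's advice can be sketched roughly as follows, assuming dplyr 0.7 or later, where `!!` is the shorthand for `UQ()`. The helper name `add_size_column`, its `data` argument, and the fixed seed are illustrative additions and do not come from the original post.

```r
library(dplyr)

set.seed(1)  # illustrative seed so the example is reproducible
df <- data.frame(x = sample(1:100, 50), y = rnorm(50))

# Hypothetical helper: `name` is a string holding the new column's name.
# `!!name` unquotes the string and `:=` allows it on the left-hand side.
add_size_column <- function(data, name) {
  data %>%
    mutate(!!name := ifelse(x < 50, "small", "big"))
}

result <- add_size_column(df, "newVar")
head(result)
```

Passing the data frame as an argument, rather than reading `df` from the enclosing environment as the question does, is only a design preference that makes the helper easier to reuse.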

Sorting the values of column in ascending order in R

别来无恙, submitted on 2019-12-02 08:20:44
The script below is a data frame of four columns. I want to take one pair of values (a1, a2) at a time. Column "a3" is such that, for any given pair (a1, a2), its values appear in ascending order as you scan down the data. If a pair occurs more than once in the table, I want the "a4" values to be arranged in ascending order as well, just like the corresponding "a3" values for that particular (a1, a2) pair. For example, the first pair ("A", "D") appears three times and its a3 values are in ascending order. Similarly I wish to arrange the a4
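One possible reading of this request is to sort a4 within each (a1, a2) group while leaving the row order alone. The sketch below assumes that reading; the data values are made up for illustration, since the original script is not shown in the excerpt.

```r
library(dplyr)

# Made-up data in the shape the question describes (values are illustrative).
df <- data.frame(
  a1 = c("A", "A", "A", "B", "B"),
  a2 = c("D", "D", "D", "E", "E"),
  a3 = c(1, 2, 3, 10, 20),
  a4 = c(30, 10, 20, 5, 1),
  stringsAsFactors = FALSE
)

# Within each (a1, a2) pair, replace a4 by its values in ascending order.
sorted <- df %>%
  group_by(a1, a2) %>%
  mutate(a4 = sort(a4)) %>%
  ungroup()

sorted
```

This keeps a1, a2, and a3 in their original rows and only reorders a4 inside each group, which matches the description only if a3 is already ascending within each pair.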

How to add the results of applying a function to an existing data frame?

允我心安, submitted on 2019-12-02 05:53:22
Question: I am trying to calculate the confidence intervals of some rates. I am using tidyverse and epitools to calculate CI from Byar's method. I am almost certainly doing something wrong.

    library(tidyverse)
    library(epitools)
    # here's my made up data
    DISEASE = c("Marco Polio", "Marco Polio", "Marco Polio", "Marco Polio", "Marco Polio",
                "Mumps", "Mumps", "Mumps", "Mumps", "Mumps",
                "Chicky Pox", "Chicky Pox", "Chicky Pox", "Chicky Pox", "Chicky Pox")
    YEAR = c(2011, 2012, 2013, 2014, 2015, 2011, 2012, 2013, 2014, 2015, 2011, 2012, 2013, 2014, 2015)
    VALUE = c(82, 89, 79, 51, 51, 79, 91, 69, 89, 78, 71, 69, 95, 61, 87)
    AREA =c(
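The general pattern the title asks about (attaching a function's results to an existing data frame) can be sketched as below. This does not use epitools: `byar_ci` is a hypothetical helper that hand-codes the textbook Byar approximation for a Poisson count, and the three rows are the first Mumps values from the question's made-up data. Converting counts to rates is left out because the question's denominator is cut off in the excerpt.

```r
library(dplyr)

# Hypothetical helper (not from epitools): Byar's approximate confidence
# limits for a Poisson count x, returned as a two-column data frame.
byar_ci <- function(x, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  lower <- x * (1 - 1 / (9 * x) - z / (3 * sqrt(x)))^3
  upper <- (x + 1) * (1 - 1 / (9 * (x + 1)) + z / (3 * sqrt(x + 1)))^3
  data.frame(lower = lower, upper = upper)
}

# A small slice of the question's made-up data.
rates <- data.frame(
  DISEASE = c("Mumps", "Mumps", "Mumps"),
  YEAR = c(2011, 2012, 2013),
  VALUE = c(79, 91, 69)
)

# The helper is vectorised, so its result has one row per input row
# and can be bound column-wise onto the original data.
rates_ci <- bind_cols(rates, byar_ci(rates$VALUE))
rates_ci
```

Because the helper returns a data frame with the same number of rows as its input, `bind_cols()` is enough; a function that returns one result per call would instead need a row-wise or mapped approach.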

R: How can I extract an element from a column of data in spark connection (sparklyr) in pipe

心已入冬, submitted on 2019-12-02 03:04:18
Question: I have a dataset as below. Because of its large amount of data, I uploaded it through the sparklyr package, so I can use only pipe statements.

    pos <- str_sub(csj$helpful, 2)
    neg1 <- str_sub(csj$helpful, 4)
    csj <- csj %>% mutate(neg = replace(helpful, stringr::str_sub(csj$helpful, 4) == 1, 0))
    csj <- csj %>% mutate(help = pos / neg)
    csj
    is.null(csj$helpful)

I want to make a column named 'help', defined as the first number in the helpful column divided by the second number in the helpful column. If the second number is 0, I need to change it to 1 before dividing. The data frame is named csj. But it doesn't work. I'll be
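The intended calculation can be sketched on a local data frame as follows. This is only an illustration under two assumptions that the excerpt does not confirm: that `helpful` is a string such as "[2, 3]", and that the work is done locally rather than on a Spark table. On a sparklyr table the extraction step would need functions that can be translated to Spark SQL (Spark SQL's `regexp_extract` is one candidate), which is not shown here. The name `csj_local` is hypothetical.

```r
library(dplyr)

# Hypothetical local stand-in for the Spark table; the "[a, b]" string
# format of `helpful` is an assumption.
csj_local <- data.frame(
  helpful = c("[2, 3]", "[0, 0]", "[4, 8]"),
  stringsAsFactors = FALSE
)

pattern <- "^\\[([0-9]+), *([0-9]+)\\]$"

csj_help <- csj_local %>%
  mutate(
    pos  = as.numeric(sub(pattern, "\\1", helpful)),  # first number
    neg  = as.numeric(sub(pattern, "\\2", helpful)),  # second number
    neg  = ifelse(neg == 0, 1, neg),                  # avoid dividing by 0
    help = pos / neg
  )

csj_help
```

Everything happens inside one `mutate()` call, so no column is pulled out with `$`, which is the part of the original attempt that does not fit a pipe-only workflow.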

use dplyr mutate to create new columns based on a vector of column names

99封情书, submitted on 2019-12-01 21:36:07
Question: I would like to take the log of some columns and create new columns that are all named log[original column name]. The code below works, but how can I pass the vector called columnstolog into mutate? Thank you.

    library(dplyr)
    data(mtcars)
    columnstolog <- c('mpg', 'cyl', 'disp', 'hp')
    mtcars %>% mutate(logmpg = log(mpg))
    mtcars %>% mutate(logcyl = log(cyl))

Answer 1: Use mutate_at, if you can bear with _log being appended to the original column names:

    mtcars %>% mutate_at(columnstolog, funs(log = log(.)))
    # mpg cyl disp hp drat wt qsec vs am gear carb mpg_log cyl_log disp_log hp_log
    #1 21.0 6 160.0 110
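For readers on a newer dplyr, the same idea can be sketched with `across()`, assuming dplyr 1.0.0 or later. This is an alternative to the `mutate_at()`/`funs()` answer above, not the answer's own code; its `.names` argument gives the `log` prefix the question asked for.

```r
library(dplyr)

columnstolog <- c("mpg", "cyl", "disp", "hp")

# all_of() selects the columns named in the character vector;
# .names builds each new column name from the original one.
logged <- mtcars %>%
  mutate(across(all_of(columnstolog), log, .names = "log{.col}"))

names(logged)
```

The original columns are kept, and the new ones are named logmpg, logcyl, logdisp, and loghp.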

Using case_when within mutate_at

我们两清, submitted on 2019-11-30 15:14:54
Question: I would like to use case_when within mutate_at, as in the following example:

    mtcars %>% mutate_at(.vars = vars(vs, am), .funs = funs(case_when(
      . %in% c(1,0,9) ~ TRUE
      . %in% c(2,20,200) ~ FALSE
      TRUE ~ as.character(.)
    )))

An alternative version using . = in the funs() call also does not work.

    mtcars %>% mutate_at(.vars = vars(vs, am), .funs = funs(. = case_when(
      . %in% c(1, 0, 9) ~ TRUE
      . %in% c(2, 20, 200) ~ FALSE
      TRUE ~ as.character(.)
    )))

Desired results:

    mtcars %>% mutate_at(.vars = vars(vs, am), .funs = funs(ifelse(. %in% c(1, 0, 9), TRUE, FALSE)))

FALSE could be replaced with second ifelse()
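As posted, the case_when() calls have no commas between their formulas, and they mix logical results (TRUE, FALSE) with a character fallback, which cannot share one column type. A sketch that addresses both points is shown below, assuming dplyr 1.0.0 or later for `across()`. Using NA as the fallback is a substitution made here for illustration, not something the question specified.

```r
library(dplyr)

recoded <- mtcars %>%
  mutate(across(
    c(vs, am),
    ~ case_when(
      .x %in% c(1, 0, 9)    ~ TRUE,   # formulas are separated by commas
      .x %in% c(2, 20, 200) ~ FALSE,
      TRUE                  ~ NA      # logical fallback keeps one type
    )
  ))

head(recoded[, c("vs", "am")])
```

In mtcars both vs and am only contain 0 and 1, so every value falls into the first branch; the other branches are there to mirror the structure of the question.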