dplyr | 易学教程

Remove duplicate values across a few columns but keep rows

Question: I have a data frame that looks like this:

    dat <- data.frame(id=1:6,
                      z_1=c(100,290,38,129,0,290),
                      z_2=c(20,0,0,0,0,290),
                      z_3=c(0,0,38,0,0,98),
                      z_4=c(0,0,38,127,38,78),
                      z_5=c(23,0,25,0,0,98),
                      z_6=c(100,0,25,127,0,9))
    dat
      id z_1 z_2 z_3 z_4 z_5 z_6
    1  1 100  20   0   0  23 100
    2  2 290   0   0   0   0   0
    3  3  38   0  38  38  25  25
    4  4 129   0   0 127   0 127
    5  5   0   0   0  38   0   0
    6  6 290 290  98  78  98   9

I want to remove duplicate values of z_x across each row, replacing any duplicates with either a 0 or NA, but leaving the rows &
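One possible approach to the question above, sketched in base R rather than dplyr: apply `duplicated()` to each row of the `z_` columns and overwrite the repeats. The helper name `blank_row_dups` and its `fill` argument are hypothetical, introduced only for illustration; the excerpt is cut off, so this assumes the goal is to keep every row and only blank the repeated values.

```r
# Example data copied from the question.
dat <- data.frame(id  = 1:6,
                  z_1 = c(100, 290, 38, 129, 0, 290),
                  z_2 = c(20, 0, 0, 0, 0, 290),
                  z_3 = c(0, 0, 38, 0, 0, 98),
                  z_4 = c(0, 0, 38, 127, 38, 78),
                  z_5 = c(23, 0, 25, 0, 0, 98),
                  z_6 = c(100, 0, 25, 127, 0, 9))

# Hypothetical helper: keep the first occurrence of each value in a vector
# and overwrite later repeats with `fill` (0 by default, NA also works).
blank_row_dups <- function(x, fill = 0) replace(x, duplicated(x), fill)

# Positions of the z_ columns; the id column is left untouched.
z_cols <- grep("^z_", names(dat))

out <- dat
# apply() returns one column per input row, so transpose back to rows.
out[z_cols] <- t(apply(dat[z_cols], 1, blank_row_dups))
out
```

With the default `fill = 0`, row 6 changes from 290, 290, 98, 78, 98, 9 to 290, 0, 98, 78, 0, 9, and the number of rows stays at six. Passing `fill = NA` through `apply()` marks the repeats as missing instead.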

R dplyr. Filter a dataframe that contains a column of numeric vectors

Question: I have a data frame in which one column contains numeric vectors. I want to filter rows based on a condition involving that column. This is a simplified example.

    df <- data.frame(id = LETTERS[1:3], name=c("Alice", "Bob", "Carol"))
    mylist=list(c(1,2,3), c(4,5), c(1,3,4))
    df$numvecs <- mylist
    df
    #   id  name numvecs
    # 1  A Alice 1, 2, 3
    # 2  B   Bob    4, 5
    # 3  C Carol 1, 3, 4

I can use something like mapply, e.g.

    mapply(function(x,y) x=="B" & 4 %in% y, df$id, df$numvecs)

which correctly returns TRUE for
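The same condition can probably be expressed inside `filter()` by reducing the list column to one logical value per row. The sketch below assumes dplyr is installed and uses base `vapply()` for that reduction; the helper name `has_four` is hypothetical.

```r
library(dplyr)

# Example data copied from the question.
df <- data.frame(id = LETTERS[1:3], name = c("Alice", "Bob", "Carol"))
df$numvecs <- list(c(1, 2, 3), c(4, 5), c(1, 3, 4))

# Hypothetical helper: TRUE when a single numeric vector contains 4.
has_four <- function(v) 4 %in% v

# vapply() yields exactly one logical per row, which is what filter() expects.
res <- df %>%
  filter(id == "B", vapply(numvecs, has_four, logical(1)))
res
```

Only the row for Bob satisfies both conditions. `vapply()` with `logical(1)` is used instead of `sapply()` so that an unexpected element shape raises an error rather than silently changing the result type.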

Replace NA when last and next non-NA values are equal

Question: I have a sample table with some but not all NA values that need to be replaced.

    > dat
       id message index
    1   1    <NA>     1
    2   1     foo     2
    3   1     foo     3
    4   1    <NA>     4
    5   1     foo     5
    6   1    <NA>     6
    7   2    <NA>     1
    8   2     baz     2
    9   2    <NA>     3
    10  2     baz     4
    11  2     baz     5
    12  2     baz     6
    13  3     bar     1
    14  3    <NA>     2
    15  3    <NA>     3
    16  3     bar     4
    17  3    <NA>     5
    18  3     bar     6
    19  3    <NA>     7
    20  3     qux     8

My objective is to replace the NA values that are surrounded by the same "message" using the first appearance of the message (the least index value) and the last appearance

Combine select and mutate

Question: Quite often, I find myself manually combining the select() and mutate() functions within dplyr. This is usually because I'm tidying up a data frame, want to create new columns based on the old columns, and only want to keep the new columns. For example, if I had data about heights and widths but only wanted to use them to calculate and keep the area, then I would use:

    library(dplyr)
    df <- data.frame(height = 1:3, width = 10:12)
    df %>%
      mutate(area = height * width) %>%
      select(area)

When there are a lot
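For the pattern described above, dplyr already provides `transmute()`, which creates new columns and drops the ones it was not asked to create. A minimal sketch, assuming dplyr 1.0.0 or later so that the `.keep` argument of `mutate()` is also available; the result names `res_transmute` and `res_keep` are only illustrative.

```r
library(dplyr)

# Example data copied from the question.
df <- data.frame(height = 1:3, width = 10:12)

# transmute() computes area and keeps only that column.
res_transmute <- df %>% transmute(area = height * width)

# mutate() with .keep = "none" behaves the same way for ungrouped data.
res_keep <- df %>% mutate(area = height * width, .keep = "none")

res_transmute
```

Both calls return a single `area` column with the values 10, 22, 36, matching the `mutate()` followed by `select()` version from the question.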

Why does dplyr error in this nested if_else, when logical condition means output should not be evaluated?

Question: I have a nested if_else statement inside mutate. In my example data frame:

    tmp_df2 <- data.frame(a = c(1,1,2), b = c(T,F,T), c = c(1,2,3))
      a     b c
    1 1  TRUE 1
    2 1 FALSE 2
    3 2  TRUE 3

I wish to group by a and then perform operations based on whether a group has one or two rows. I would have thought this nested if_else would suffice:

    tmp_df2 %>%
      group_by(a) %>%
      mutate(tmp_check = n() == 1) %>%
      mutate(d = if_else(tmp_check, # check for number of entries in group
                         0,
                         if_else(b, sum(c)/c[b == T], sum
