tidyr

Complete column with group_by and complete

耗尽温柔 submitted on 2019-12-07 04:57:00
Question: I have a small problem using the dplyr group_by function. After running this:

    datasetALL %>% group_by(YEAR, Region) %>% summarise(count_number = n())

I get this result:

       YEAR Region count_number
      <int>  <int>        <int>
    1  1946      1            2
    2  1946      2            3
    3  1946      3            1
    4  1946      5            1
    5  1947      3            1
    6  1947      4            1

I would like something like this:

        YEAR Region count_number
       <int>  <int>        <int>
     1  1946      1            2
     2  1946      2            3
     3  1946      3            1
     4  1946      5            1
     5  1946      4            0   # order is not important
     6  1947      1            0
     7  1947      2            0
     8  1947      3            1
     9  1947      4            1
    10  1947      5            0

I try to use
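One possible approach to the result described above is a minimal sketch with tidyr's complete(). The sample rows in datasetALL are made up to reproduce the counts shown in the question, and ungrouping before complete() is a choice made here so that every YEAR/Region combination is generated across the whole table.

```r
library(dplyr)
library(tidyr)

# Hypothetical data, constructed to reproduce the counts in the question
datasetALL <- data.frame(
  YEAR   = rep(c(1946L, 1947L), times = c(7, 2)),
  Region = c(1L, 1L, 2L, 2L, 2L, 3L, 5L, 3L, 4L)
)

counts <- datasetALL %>%
  group_by(YEAR, Region) %>%
  summarise(count_number = n()) %>%
  ungroup() %>%
  # Add every missing YEAR x Region combination with a count of 0
  complete(YEAR, Region, fill = list(count_number = 0L))

print(counts)
```

Note that complete() only expands over Region values that occur somewhere in the data; a region that never appears in any year would have to be supplied explicitly.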

gather with tidyr: position must be between 0 and n error

你说的曾经没有我的故事 submitted on 2019-12-07 02:50:37
Question: I have some data like below:

    x.row10 <- setNames(data.frame(letters[1:3],1:3,2:4,3:5,4:6,5:7,6:8,7:9), c("names",2004:2009,2012))
    #   names 2004 2005 2006 2007 2008 2009 2012
    # 1     a    1    2    3    4    5    6    7
    # 2     b    2    3    4    5    6    7    8
    # 3     c    3    4    5    6    7    8    9

Now I can make them long with gather() from the tidyr package by writing:

    x.row10 %>% gather(Year, Val, -names)

But when I use

    x.row10 %>% gather(Year, Val, c(2004:2009,2012))

which is my intuitive choice, I get the error message

    Error: Position must be between 0 and
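The error arises because bare numbers in a tidyr column selection are read as column positions, not names. A minimal sketch of one workaround, assuming the year columns are contiguous as in the question's data, is to backtick-quote the names:

```r
library(tidyr)

x.row10 <- setNames(
  data.frame(letters[1:3], 1:3, 2:4, 3:5, 4:6, 5:7, 6:8, 7:9),
  c("names", 2004:2009, 2012)
)

# Backticks make tidyr treat 2004 and 2012 as column names, not positions;
# the range selects every column from `2004` through `2012`.
long <- gather(x.row10, Year, Val, `2004`:`2012`)

print(head(long))
```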

Fill missing values in data.frame using dplyr complete within groups

我与影子孤独终老i submitted on 2019-12-07 01:58:11
Question: I'm trying to fill missing values in my data frame, but I do not want all possible combinations of variables. I only want to fill based on a grouping of three variables: coursecode, year, and week. I've looked into complete() in the tidyr library but I can't get it to work, even after looking at "Using tidyr::complete with group_by" and https://blog.rstudio.org/2015/09/13/tidyr-0-3-0/ I have observers that collect data on given weeks of the year at different courses. For example, data might be

Opposite of unnest_tokens

☆樱花仙子☆ submitted on 2019-12-07 01:26:42
Question: This is most likely a stupid question, but I've googled and googled and can't find a solution. I think it's because I don't know the right way to word my question to search. I have a data frame that I have converted to tidy text format in R to get rid of stop words. I would now like to 'untidy' that data frame back to its original format. What's the opposite / inverse command of unnest_tokens? Edit: here is what the data I'm working with look like. I'm trying to replicate analyses from Silge
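One way to reverse a one-word-per-row layout is to paste the tokens back together per document. The sketch below uses only dplyr; the data frame tidy_words and its columns line and word are hypothetical stand-ins, since the question's actual data is not shown. It restores the word order but not the original punctuation or capitalization that tokenizing removed.

```r
library(dplyr)

# Hypothetical tidy-text data: one token per row, keyed by a line id
tidy_words <- data.frame(
  line = c(1, 1, 1, 2, 2),
  word = c("tidy", "text", "rocks", "hello", "world"),
  stringsAsFactors = FALSE
)

# Collapse the tokens of each line back into a single string
untidied <- tidy_words %>%
  group_by(line) %>%
  summarise(text = paste(word, collapse = " "))

print(untidied)
```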

how to create categories conditionally using other variables values and sequence

╄→尐↘猪︶ㄣ submitted on 2019-12-06 22:24:59
I would appreciate any help creating a function that assigns categories to one variable based on the order in which combinations of values of a set of other variables appear. Specifically, I want a function that:
- creates category E1 of the variable named variable the first time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category E2 of the variable named variable the second time that each combination of values of the variables A, B, and ID appears in the dataset.
- creates category E3 of the variable named variable the third time that each combination of values of the variables A ,
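The numbering described above can be sketched with a grouped row counter. The sample data here is hypothetical; only the column names A, B, ID, and variable come from the question.

```r
library(dplyr)

# Hypothetical data using the column names from the question
df <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  A  = c("x", "x", "x", "y", "y"),
  B  = c("p", "p", "q", "r", "r"),
  stringsAsFactors = FALSE
)

labelled <- df %>%
  group_by(A, B, ID) %>%
  # row_number() counts occurrences within each A/B/ID combination,
  # in the order the rows appear in the data
  mutate(variable = paste0("E", row_number())) %>%
  ungroup()

print(labelled)
```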

How to pair rows in a data frame with many columns using dplyr in R?

ε祈祈猫儿з submitted on 2019-12-06 15:40:20
I have a data frame containing multiple observations from the control and the experimental cohorts, with replicates for each subject. Here is an example of my data frame:

    subject  cohort   replicate  val1  val2
    A        control  1          10    0.1
    A        control  2          15    0.3
    A        experim  1          40    0.7
    A        experim  2          45    0.9
    B        control  1          5     0.3
    B        experim  1          30    0.0
    C        control  1          50    0.5
    C        experim  1          NA    1.0

I'd like to pair each control observation with its corresponding experimental one for each value, to calculate the ratio between the pairs. The desired output would look something like this:

    subject  replicate  ratio_val1  ratio_val2
    A        1          4           7
    A        2          3           3
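A minimal sketch of one way to build the pairs is to split the data by cohort and join the two halves on subject and replicate. The ratio direction (experimental divided by control) is inferred from the desired output shown in the question, and the suffix names are choices made for this example.

```r
library(dplyr)

df <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  cohort    = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  val1      = c(10, 15, 40, 45, 5, 30, 50, NA),
  val2      = c(0.1, 0.3, 0.7, 0.9, 0.3, 0.0, 0.5, 1.0),
  stringsAsFactors = FALSE
)

control <- df %>% filter(cohort == "control") %>% select(-cohort)
experim <- df %>% filter(cohort == "experim") %>% select(-cohort)

# Pair each control row with the experimental row for the same
# subject and replicate, then divide experimental by control
ratios <- inner_join(control, experim,
                     by = c("subject", "replicate"),
                     suffix = c("_control", "_experim")) %>%
  transmute(subject, replicate,
            ratio_val1 = val1_experim / val1_control,
            ratio_val2 = val2_experim / val2_control)

print(ratios)
```

An inner join drops any observation without a partner in the other cohort; missing values such as subject C's val1 propagate as NA.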

dplyr - sum of multiple columns using regular expressions

二次信任 submitted on 2019-12-06 14:55:39
For the dataset mtcars2

    mtcars2 = mtcars
    mtcars2 = mtcars2 %>% mutate(cyl9=cyl, disp9=disp, gear2=gear)

I want to get a new column which is the sum of multiple columns, using regular expressions to capture the pattern. This is a solution, but it is hard-coded:

    select(mtcars2, cyl9) + select(mtcars2, disp9) + select(mtcars2, gear2)

I tried something like this, but it gives me a single number instead of a vector:

    mtcars2 %>% select(matches("[0-9]")) %>% sum

Please, dplyr solutions only, since I need to apply these functions to a SQL table later on. Thanks! Update.. I need the solution to
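For a local data frame, a row-wise total over the regex-matched columns can be sketched as below. This relies on base R's rowSums(), which is an assumption that suits in-memory data only; whether it translates to a SQL backend is not verified here, so it may not satisfy the SQL requirement in the question.

```r
library(dplyr)

mtcars2 <- mtcars %>% mutate(cyl9 = cyl, disp9 = disp, gear2 = gear)

# select() picks the columns whose names contain a digit;
# rowSums() then adds them row by row, giving one value per row
result <- mtcars2 %>%
  mutate(total = rowSums(select(., matches("[0-9]"))))

print(head(result$total))
```

The original attempt returned a single number because sum() adds every cell of the selected columns together, whereas rowSums() keeps one total per row.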

Using gather from tidyr changes my regression results

夙愿已清 submitted on 2019-12-06 10:01:23
Question: When I run the code below, everything works as expected:

    # install.packages("dynlm")
    # install.packages("tidyr")
    require(dynlm)
    require(tidyr)
    Time <- 1950:1993
    Y <- c(5820, 5843, 5917, 6054, 6099, 6365, 6440, 6465, 6449, 6658, 6698,
           6740, 6931, 7089, 7384, 7703, 8005, 8163, 8506, 8737, 8842, 9022,
           9425, 9752, 9602, 9711, 10121, 10425, 10744, 10876, 10746, 10770,
           10782, 11179, 11617, 12015, 12336, 12568, 12903, 13029, 13093,
           12899, 13110, 13391)
    X <- c(6284, 6390, 6476, 6640, 6628, 6879, 7080,

extracting values from column using tidyr

时光毁灭记忆、已成空白 submitted on 2019-12-06 08:36:44
Question: I have a data.frame annot defined as:

    annot <- structure(list(Name = c("dd_1", "dd_2", "dd_3","dd_4", "dd_5", "dd_6","dd_7"), GOs = c("C:extracellular space; C:cell body; P:cell migration process; P:NF/ß pathway", "C:Signal transduction; C:nucleus; F:positive regulation; P:single organism; P:positive(+) regulation", "C:cardiomyceltes; C:intracellular pace; F:putative; F:magnesium ion binding; F:calcium ion binding; P:visual perception; P:blood coagulation", "F:poly(A) RNA binding; P:DNA

Unable to use tidyselect `everything()` in combination with `group_by()` and `fill()`

陌路散爱 submitted on 2019-12-06 08:03:42
    library(tidyverse)
    df <- tibble(x1 = c("A", "A", "A", "B", "B", "B"),
                 x2 = c(NA, 8, NA, NA, NA, 5),
                 x3 = c(3, NA, 5, 9, 1, 9))
    #> # A tibble: 6 x 3
    #>   x1       x2    x3
    #>   <chr> <dbl> <dbl>
    #> 1 A        NA     3
    #> 2 A         8    NA
    #> 3 A        NA     5
    #> 4 B        NA     9
    #> 5 B        NA     1
    #> 6 B         5     9

I have groups 'A' and 'B' shown in column x1. I need the 'NA' values in columns x2 and x3 to be populated only from values within the same group, in the updown direction. That's simple enough; here's the code:

    df %>% group_by(x1) %>% fill(c(x2, x3), .direction = "updown")
    #> # A tibble: 6 x 3
    #>   x1       x2    x3
    #>   <chr> <dbl> <dbl>
    #> 1 A         8     3
    #> 2 A         8     5
    #>