tidyr

dplyr pivot table

主宰稳场 submitted on 2019-12-06 07:25:38
I want to obtain a pivot table sorted by descending value.

    library(dplyr)
    library(tidyr)
    h <- mtcars %>% group_by(cyl, gear) %>% tally() %>% spread(gear, n, fill = 0)
    h <- h %>% add_rownames("index")
    i <- mtcars %>% group_by(cyl, gear) %>% tally() %>% spread(cyl, n, fill = 0)

To obtain the sum of the values:

    j <- i %>% select(-1) %>% summarise_each(funs(sum))
    k <- t(j)
    k <- as.data.frame(k)
    k <- tbl_df(k)
    k <- k %>% add_rownames("index")
    l <- left_join(h, k, by = "index")
    l <- l %>% select(-1) %>% arrange(desc(V1))

Is there another way to do the same in dplyr? We group by 'cyl', 'gear', get the frequency count ( tally() ),
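
One possible shorter route is sketched below. It assumes tidyr >= 1.0 (for pivot_wider) and dplyr >= 1.0 (for across), neither of which was available when the question was posted, and the `total` column name is introduced here purely for illustration.

```r
library(dplyr)
library(tidyr)

pivot <- mtcars %>%
  count(cyl, gear) %>%                       # frequency of each cyl/gear pair
  pivot_wider(names_from = gear,             # one column per gear value
              values_from = n,
              values_fill = 0) %>%
  mutate(total = rowSums(across(-cyl))) %>%  # row total over the gear columns
  arrange(desc(total))                       # largest total first

print(pivot)
```

Whether the row total is the right sort key is an assumption read off the question's use of `arrange(desc(V1))`.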

How to spread tbl_dbi and tbl_sql data without downloading to local memory

北城余情 submitted on 2019-12-06 06:23:02
Question: I am working with large datasets, and tidyr's spread usually gives me error messages suggesting that memory could not be allocated for the operation. Therefore, I have been exploring dbplyr. However, as it says here, and as also shown below, dbplyr::spread() does not work. My question is whether there is another way to accomplish what tidyr::spread does while working with tbl_dbi and tbl_sql data without downloading to local memory. Using sample data from here, below I present what I get and
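
One way this is sometimes approached is conditional aggregation, which dbplyr can translate to SQL so the reshaping runs inside the database. The sketch below is only illustrative: the table `long_tbl` and its columns are hypothetical, the key values must be known in advance, and `memdb_frame()` requires the RSQLite package. Newer dbplyr releases also provide a `pivot_wider()` method for lazy tables, which may be simpler where available.

```r
library(dplyr)
library(dbplyr)

# Hypothetical long-format table in an in-memory SQLite database
long_tbl <- memdb_frame(
  id    = c(1, 1, 2, 2),
  key   = c("x", "y", "x", "y"),
  value = c(10, 20, 30, 40)
)

# One conditional sum per key value; nothing is pulled into R yet
wide_tbl <- long_tbl %>%
  group_by(id) %>%
  summarise(
    x = sum(ifelse(key == "x", value, NA), na.rm = TRUE),
    y = sum(ifelse(key == "y", value, NA), na.rm = TRUE)
  )

show_query(wide_tbl)  # inspect the generated SQL

# Only the (much smaller) wide result is downloaded
local_copy <- wide_tbl %>% arrange(id) %>% collect()
```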

tidyr spread does not aggregate data

痴心易碎 submitted on 2019-12-06 05:57:45
Question: I have the following data:

    > data <- data.frame(unique=1:9, grouping=rep(c('a', 'b', 'c'), each=3), value=sample(1:30, 9))
    > data
      unique grouping value
    1      1        a    15
    2      2        a    21
    3      3        a    26
    4      4        b     8
    5      5        b     6
    6      6        b     4
    7      7        c    17
    8      8        c     1
    9      9        c     3

I would like to create a table that looks like this:

       a  b  c
    1 15  8 17
    2 21  6  1
    3 26  6  3

I am using tidyr::spread and not getting the correct result:

    > data %>% spread(grouping, value)
      unique  a  b  c
    1      1 15 NA NA
    2      2 21 NA NA
    3      3 26 NA NA
    4      4 NA  8 NA
    5      5 NA  6 NA
    6      6 NA
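
The NA-filled result is likely because the `unique` column identifies every row, so no two rows can share an output row. A minimal sketch of one workaround follows: drop that column and number the rows within each group instead. The `row` helper column is introduced here for illustration, the values are hard-coded from the printed data rather than sampled, and `pivot_wider()` assumes tidyr >= 1.0.

```r
library(dplyr)
library(tidyr)

data <- data.frame(
  unique   = 1:9,
  grouping = rep(c("a", "b", "c"), each = 3),
  value    = c(15, 21, 26, 8, 6, 4, 17, 1, 3)
)

wide <- data %>%
  select(grouping, value) %>%       # leave out the per-row identifier
  group_by(grouping) %>%
  mutate(row = row_number()) %>%    # position within each group: 1, 2, 3
  ungroup() %>%
  pivot_wider(names_from = grouping, values_from = value)

print(wide)
```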

Separate variable in field by character

*爱你&永不变心* submitted on 2019-12-06 03:15:24
I recently asked the question "Separate contents of field" and got a very quick and very simple answer. Something I can do simply in Excel is look in a cell, find the first instance of a character, and then return all the characters to the left of that. For example, from the Author field

    Drijgers RL, Verhey FR, Leentjens AF, Kahler S, Aalten P.

I can extract Drijgers RL and Aalten P into separate columns in Excel. This lets me count the number of times someone is a first author and also the last author. How can I replicate this in R? I can count the total number of times an author has a publication from the
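
A minimal base R sketch of the Excel-style extraction, assuming authors are separated by commas and the field may end with a period. The variable names are illustrative.

```r
authors <- c("Drijgers RL, Verhey FR, Leentjens AF, Kahler S, Aalten P.")

# Everything before the first comma
first_author <- trimws(sub(",.*$", "", authors))

# Everything after the last comma (".*" is greedy), minus a trailing period
last_author <- trimws(sub("\\.$", "", sub("^.*,", "", authors)))

first_author
last_author
```

Counting appearances could then be done with `table(first_author)` and `table(last_author)` over the full column.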

How can tidyr spread function take variable as a select column

隐身守侯 submitted on 2019-12-06 02:14:33
Question: tidyr's spread function only takes column names without quotes. Is there a way I can pass in a variable that contains the column name? For example:

    # example using gather()
    library("tidyr")
    dummy.data <- data.frame("a" = letters[1:25], "B" = LETTERS[1:5], "x" = c(1:25))
    dummy.data
    var = "x"
    dummy.data %>% gather(key, value, var)

This gives an error:

    Error: All select() inputs must resolve to integer column positions.
    The following do not:
    *  var

which is solved using the match function, which gives the
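
In current tidyr, the spreading step is usually written with `pivot_wider()`, whose `names_from` and `values_from` arguments accept tidyselect helpers such as `all_of()`. The sketch below assumes tidyr >= 1.0 and dplyr >= 1.0 (for the `all_of()` export), and the data frame and variable names are made up for illustration.

```r
library(dplyr)
library(tidyr)

long <- data.frame(
  id    = c(1, 1, 2, 2),
  name  = c("p", "q", "p", "q"),
  score = c(1, 2, 3, 4)
)

# Column names held in character variables
key_col   <- "name"
value_col <- "score"

wide <- pivot_wider(
  long,
  names_from  = all_of(key_col),
  values_from = all_of(value_col)
)

print(wide)
```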

Concatenating all rows within a group using dplyr

倾然丶 夕夏残阳落幕 submitted on 2019-12-06 01:32:03
Suppose I have a dataframe like this:

    hand_id card_id card_name card_class
    A       1       p         alpha
    A       2       q         beta
    A       3       r         theta
    B       2       q         beta
    B       3       r         theta
    B       4       s         gamma
    C       1       p         alpha
    C       2       q         beta

I would like to concatenate the card_id, card_name, and card_class into one single row per hand level A, B, C. So the result would look something like this:

    hand_id combo_1 combo_2 combo_3
    A       1-2-3   p-q-r   alpha-beta-theta
    B       2-3-4   q-r-s   beta-theta-gamma
    ....

I attempted to do this using group_by and mutate, but I can't seem to get it to work:

    data <- read_csv('data.csv')
    byHand <- group_by(data, hand_id) %>% mutate(combo_1 = paste
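
A minimal sketch of one common approach: use `summarise()` rather than `mutate()` so each hand collapses to one row, and `paste()` with `collapse` to join the values. The data frame is rebuilt inline from the table above instead of being read from data.csv.

```r
library(dplyr)

hands <- data.frame(
  hand_id    = c("A", "A", "A", "B", "B", "B", "C", "C"),
  card_id    = c(1, 2, 3, 2, 3, 4, 1, 2),
  card_name  = c("p", "q", "r", "q", "r", "s", "p", "q"),
  card_class = c("alpha", "beta", "theta", "beta", "theta", "gamma",
                 "alpha", "beta")
)

combos <- hands %>%
  group_by(hand_id) %>%
  summarise(
    combo_1 = paste(card_id, collapse = "-"),
    combo_2 = paste(card_name, collapse = "-"),
    combo_3 = paste(card_class, collapse = "-")
  )

print(combos)
```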

How to ungroup list columns in data.table?

你说的曾经没有我的故事 submitted on 2019-12-06 01:29:45
tidyr provides the unnest function, which helps expand list columns. This is similar to the much (20x) faster ungroup function in kdb. I am looking for a similar (but much faster) function that, assuming a data.table that contains several list columns, each with the same number of elements on each row, would expand the data.table. This is an extension of this post.

    library(data.table)
    library(tidyr)
    t = Sys.time()
    DT = data.table(a=c(1,2,3), b=c('q','w','e'), c=list(rep(t,2),rep(t+1,3),rep(t,0)), d=list(rep(1,2),rep(20,3),rep(1,0)))
    print(DT)
       a b c d
    1: 1 q 2016-01-09 09:55:14,2016-01-09 09:55:14
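
One data.table idiom that is often suggested for this is to `unlist()` each list column while grouping by the non-list columns. The sketch below makes no speed claim and simplifies the question's data: the list columns hold plain numbers, because `unlist()` drops attributes such as the POSIXct class, so timestamps would need to be converted back afterwards.

```r
library(data.table)

DT <- data.table(
  a = c(1, 2, 3),
  b = c("q", "w", "e"),
  c = list(c(10, 11), c(20, 21, 22), numeric(0)),
  d = list(rep(1, 2), rep(20, 3), numeric(0))
)

# Each group is a single row here, so unlist() flattens that row's list cell.
# Rows whose list cells are empty contribute no rows to the result.
expanded <- DT[, .(c = unlist(c), d = unlist(d)), by = .(a, b)]

print(expanded)
```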

Separate string after last underscore

荒凉一梦 submitted on 2019-12-06 01:27:39
This is indeed a duplicate of the question r-split-string-using-tidyrseparate, but I cannot use the MWE for my purpose because I do not know how to adjust the regular expression. I basically want the same thing, but with the variable split after the last underscore. Reason: I have data where some columns show up several times for the same factor/type. I figured I can melt the data, separate the value variable before the type string, and spread it out again to a wide format with fewer columns. My problem is that my variable names contain differing numbers of underscores and I would like to learn how to
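
A hedged sketch of one regular expression that targets only the last underscore: a lookahead asserting that no further underscore follows before the end of the string. The data frame and the output column names `name` and `type` are invented for illustration.

```r
library(tidyr)

df <- data.frame(
  variable = c("a_b_type1", "c_type2"),
  value    = c(1, 2),
  stringsAsFactors = FALSE
)

# "_(?=[^_]*$)" matches an underscore followed only by non-underscore characters
out <- separate(df, variable, into = c("name", "type"), sep = "_(?=[^_]*$)")

print(out)
```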

conditional string splitting in R (using tidyr)

浪尽此生 submitted on 2019-12-05 23:43:49
I have a data frame like this:

    X <- data.frame(value = c(1,2,3,4), variable = c("cost", "cost", "reed_cost", "reed_cost"))

I'd like to split the variable column into two: one column to indicate if the variable is a 'cost' and another column to indicate whether or not the variable is "reed". I cannot seem to figure out the right regex for the split (e.g. using tidyr). If my data were something nicer, say:

    Y <- data.frame(value = c(1,2,3,4), variable = c("adjusted_cost", "adjusted_cost", "reed_cost", "reed_cost"))

then this is trivial with tidyr:

    separate(Y, variable, c("Type", "Model"), "_")

and
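
One option that avoids a custom regex is the `fill` argument of `separate()`: with `fill = "left"`, values that have too few pieces are padded with NA on the left. A minimal sketch follows. Which output column should be called "Model" and which "Type" is an assumption here, since the excerpt is cut off before the desired result.

```r
library(tidyr)

X <- data.frame(
  value    = c(1, 2, 3, 4),
  variable = c("cost", "cost", "reed_cost", "reed_cost"),
  stringsAsFactors = FALSE
)

# "cost" has one piece, so the left-hand column becomes NA for those rows
out <- separate(X, variable, into = c("Model", "Type"), sep = "_", fill = "left")

print(out)
```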

tidyr - unique way to get combinations (using tidyverse only)

…衆ロ難τιáo~ submitted on 2019-12-05 21:45:41
I wanted to get all unique pairwise combinations of a unique string column of a dataframe using the tidyverse (ideally). Here is a dummy example:

    library(tidyverse)
    a <- letters[1:3] %>% tibble::as_tibble()
    a
    #> # A tibble: 3 x 1
    #>   value
    #>   <chr>
    #> 1 a
    #> 2 b
    #> 3 c
    tidyr::crossing(a, a) %>% magrittr::set_colnames(c("words1", "words2"))
    #> # A tibble: 9 x 2
    #>   words1 words2
    #>   <chr>  <chr>
    #> 1 a      a
    #> 2 a      b
    #> 3 a      c
    #> 4 b      a
    #> 5 b      b
    #> 6 b      c
    #> 7 c      a
    #> 8 c      b
    #> 9 c      c

Is there a way to remove 'duplicate' combinations here? That is, have the output be the following in this example:

    # A tibble:
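
A minimal sketch of one tidyverse-only option: build the full cross product, then keep only the rows where the first word sorts before the second. This assumes self-pairs such as (a, a) should be dropped as well, which the truncated excerpt does not confirm. Using `<=` instead of `<` would keep them.

```r
library(dplyr)
library(tidyr)

words <- letters[1:3]

pairs <- crossing(words1 = words, words2 = words) %>%
  filter(words1 < words2)   # keeps (a, b) but not (b, a) or (a, a)

print(pairs)
```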