tidyr

Why doesn't gather() use the key variable name?

狂风中的少年 submitted on 2019-12-02 08:12:19
It's shameful, but I still can't wrap my mind fully around tidyr, specifically gather(). I feel like I'm missing something fundamental. If I run this tiny snippet of code

    library(tidyr)
    x <- data.frame(var1=letters[1:3], var2=LETTERS[7:9], var3=21:23)
    gather(x, foo, value)

I get

    > x
      var1 var2 var3
    1    a    G   21
    2    b    H   22
    3    c    I   23
    > gather(x, foo, value)
      variable value
    1     var1     a
    2     var1     b
    3     var1     c
    4     var2     G
    5     var2     H
    6     var2     I
    7     var3    21
    8     var3    22
    9     var3    23

Where does foo get used? Is this completely unnecessary? Am I tripping up because I'm thinking reshape style where you define the ID variables and

Tidyverse gather with rowdata from other data frame

守給你的承諾、 submitted on 2019-12-02 08:07:57
Question: I have been searching for quite some time for an elegant solution to this problem, to no avail, so I decided to give it a go here. I am using tidyverse and the gather function to convert a matrix containing intensity values from different samples into long format, in preparation for plotting with ggplot. There are two types of annotation: 'row-based' annotation of the data, corresponding to genes, and 'column-based' annotation, corresponding to sample information. The column based information

Using Group_by create aggregated counts conditional on value

不羁岁月 submitted on 2019-12-02 07:37:40
Question: I have a data table that looks like this:

       serialno state type type2
    1       100    FL    A     C
    2       100    CA    A     D
    3       101    CA    B     D
    4       102    GA    A     C
    5       103    WA    A     C
    6       103    PA    B     C
    7       104    CA    B     D
    8       104    CA    B     C
    9       105    NY    A     D
    10      105    NJ    B     C

I need to create a new data table that is aggregated by serialno but calculates the count of each type of existing variables. So the end result would look like this.

    FL CA GA A B C D 100 1 1 2 1 1 101 1 1 1 1 102 1 1 103 1 1 1 1 2 104 2 2 1 1 105 1 1 1 1 1 1

I'm sure there is a solution using some

R: Converting wide format to long format with multiple 3 time period variables [duplicate]

心不动则不痛 submitted on 2019-12-02 07:24:19
Question: This question already has answers here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed last year.

Apologies if this is a simple question, but I haven't been able to find a simple solution after searching. I'm fairly new to R, and am having trouble converting wide format to long format using either the melt (reshape2) or gather (tidyr) functions. The dataset that I'm working with contains 22 different time variables that are

how do I gather 2 sets of columns in tidyr [duplicate]

怎甘沉沦 submitted on 2019-12-02 06:43:37
This question already has an answer here: Combine Multiple Columns Into Tidy Data [duplicate] (3 answers); Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers).

I have the following structure:

    key | category_x | 2009 | category_y | 2010

Test example data, as requested:

    set.seed(24)
    df <- data.frame(
      key = 1:10,
      category_x = paste0("stock_", 0:9),
      '2008' = rnorm(10, 0, 10),
      category_y = paste0("stock_", 0:9),
      '2009' = rnorm(10, 0, 10),
      category_z = paste0("stock_", 0:9),
      '2010' = rnorm(10, 0, 10),
      check.names=FALSE
    )

How do I change that into:

    key |

Stop gather function from dropping factor labels

心已入冬 submitted on 2019-12-02 05:47:16
Question: I'm trying to use the gather function in tidyr, but it is stripping out the labels from factored data. My data looks something like this:

    > require(tidyr)
    > messy = data.frame(x=rep(seq(0,2),2),y=runif(6),z=runif(6),source=c('good','bad'))
    > messy
      x          y         z source
    1 0 0.37627685 0.9108316   good
    2 1 0.77593147 0.9944256    bad
    3 2 0.01105364 0.1183923   good
    4 0 0.37755463 0.6761343    bad
    5 1 0.86333114 0.7312482   good
    6 2 0.69085345 0.8288506    bad
    > tidy = gather(messy,coordinate,value,y:z)
    > tidy
      x source

Tidy up and reshape messy dataset (reshape/gather/unite function)?

最后都变了- submitted on 2019-12-02 05:42:33
Following my earlier question, "R: reshape/gather function to create dataset ready for multilevel analysis", I discovered it is a bit more complicated. My dataset is actually 'messier' than I hoped. So here's the full story: I have a big dataset, 240 cases. Each row is a case (a breast cancer patient). Somewhere at the end of the dataset (say from column 417 onwards) I have data from the patients' partners, who also filled in a questionnaire. At the beginning, there are demographic variables for both patients and partners, followed by test outcomes for patients only, then followed by partner data. I

Reshape messy longitudinal survey data containing multiple different variables, wide to long

血红的双手。 submitted on 2019-12-02 05:05:28
I hope that I'm not reinventing the wheel, and do not think that the following can be answered using reshape. I have messy longitudinal survey data that I want to convert from wide to long format. By messy I mean:

- I have a mixture of variable types (numeric, factor, logical)
- Not all variables have been collected at every timepoint.

For example:

    data <- read.table(header=T, text='
     id inlove.1 inlove.2 income.2 income.3 mood.1 mood.3 random
      1     TRUE    FALSE 87717.76 82281.25  happy  happy filler
      2     TRUE     TRUE 70795.53 54995.19  so-so  happy filler
      3    FALSE    FALSE 48012.77 47650.47    sad  so-so filler
    ')

I

Joining the result of two statistical tables in one table in R

别说谁变了你拦得住时间么 submitted on 2019-12-02 04:54:22
Continuing from this question, "comparison Mann-Whitney test between groups", I decided to create a new topic. Rui Barradas's solution helped me calculate Mann-Whitney for groups 1-2 and 1-3.

    lst <- split(mydat, mydat$group)
    lapply(lst[-1], function(DF) wilcox.test(DF$var, lst[[1]]$var, exact = FALSE))

Now I want to get the descriptive statistics. I use the psych library:

    describeBy(mydat$var,mydat$group)

So I get the following output:

    group: 1
       vars n mean   sd median trimmed  mad min max range skew kurtosis   se
    X1    1 4 23.5 0.58   23.5    23.5 0.74  23  24     1    0    -2.44 0.29
    -----------------------------------------

R: Converting wide format to long format with multiple 3 time period variables [duplicate]

那年仲夏 submitted on 2019-12-02 04:14:54
This question already has an answer here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers).

Apologies if this is a simple question, but I haven't been able to find a simple solution after searching. I'm fairly new to R, and am having trouble converting wide format to long format using either the melt (reshape2) or gather (tidyr) functions. The dataset that I'm working with contains 22 different time variables, each measured at 3 time periods. The problem occurs when I try to convert all of these from wide to long format at once. I have had