tidyr

How to pair rows in a data frame with many columns using dplyr in R?

北战南征 posted on 2020-01-14 03:08:08
Question: I have a data frame containing multiple observations from the control and experimental cohorts, with replicates for each subject. Here is an example of my data frame:

    subject cohort  replicate val1 val2
    A       control 1         10   0.1
    A       control 2         15   0.3
    A       experim 1         40   0.7
    A       experim 2         45   0.9
    B       control 1          5   0.3
    B       experim 1         30   0.0
    C       control 1         50   0.5
    C       experim 1         NA   1.0

I'd like to pair each control observation with its corresponding experimental one for each value to calculate the ratio between the pairs. The
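One possible way to approach the pairing described above is sketched here. It assumes that a control row and an experimental row belong together when they share the same subject and replicate number; the question text is cut off before it confirms this, so treat the pairing rule as an assumption. The sketch uses `pivot_longer()` and `pivot_wider()` from tidyr 1.0.0 or later, and the names `ratios`, `measure` and `value` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
df <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  cohort    = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  val1      = c(10, 15, 40, 45, 5, 30, 50, NA),
  val2      = c(0.1, 0.3, 0.7, 0.9, 0.3, 0.0, 0.5, 1.0),
  stringsAsFactors = FALSE
)

ratios <- df %>%
  # One row per subject / cohort / replicate / measured value
  pivot_longer(c(val1, val2), names_to = "measure", values_to = "value") %>%
  # Put control and experim side by side (assumed pairing: same subject and replicate)
  pivot_wider(names_from = cohort, values_from = value) %>%
  # Ratio of experimental to control; NA inputs give NA ratios
  mutate(ratio = experim / control)
```

If every control replicate should instead be compared with every experimental replicate, a join on `subject` alone would be the more natural starting point.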

How to gather then mutate a new column then spread to wide format again

纵饮孤独 posted on 2020-01-13 17:56:54
Question: Using tidyr/dplyr, I have some factor columns which I'd like to Z-score, and then mutate an average Z-score, whilst retaining the original data for reference. I'd like to avoid using a for loop in tidyr/dplyr, so I'm gathering my data and performing my calculation (Z-score) on a single column. However, I'm struggling with restoring the wide format. Here is an MWE:

    library(dplyr)
    library(tidyr)

    # Original Data
    dfData <- data.frame(
      Name = c("Steve","Jwan","Ashley"),
      A = c(10,20,12),
      B = c(0.2
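A minimal sketch of the gather, Z-score, spread round trip follows. The question's data is cut off, so the `B` column below is invented for illustration, and the names `dfZ`, `dfWide`, `Factor`, `Value`, `A_Z`, `B_Z` and `Z_avg` are hypothetical. The idea is to compute the Z-scores in long format, spread them back under new column names, and join them onto the untouched original data.

```r
library(dplyr)
library(tidyr)

# Name and A follow the question; B is invented because the original is truncated
dfData <- data.frame(
  Name = c("Steve", "Jwan", "Ashley"),
  A = c(10, 20, 12),
  B = c(0.2, 0.3, 0.7),
  stringsAsFactors = FALSE
)

dfZ <- dfData %>%
  gather(key = "Factor", value = "Value", A, B) %>%   # wide -> long
  group_by(Factor) %>%
  mutate(Z = (Value - mean(Value)) / sd(Value)) %>%   # Z-score within each factor
  ungroup() %>%
  select(-Value) %>%                                  # keep only the key columns and Z
  mutate(Factor = paste0(Factor, "_Z")) %>%           # A -> A_Z, B -> B_Z
  spread(Factor, Z) %>%                               # long -> wide
  mutate(Z_avg = (A_Z + B_Z) / 2)                     # average Z-score per person

# Original columns plus Z-scores, in the original row order
dfWide <- left_join(dfData, dfZ, by = "Name")
```

Dropping `Value` before `spread()` matters: if it stays in the data, each row keeps a distinct value and the result does not collapse to one row per name.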

How to gather then mutate a new column then spread to wide format again

↘锁芯ラ posted on 2020-01-13 17:55:32
Question: Using tidyr/dplyr, I have some factor columns which I'd like to Z-score, and then mutate an average Z-score, whilst retaining the original data for reference. I'd like to avoid using a for loop in tidyr/dplyr, so I'm gathering my data and performing my calculation (Z-score) on a single column. However, I'm struggling with restoring the wide format. Here is an MWE:

    library(dplyr)
    library(tidyr)

    # Original Data
    dfData <- data.frame(
      Name = c("Steve","Jwan","Ashley"),
      A = c(10,20,12),
      B = c(0.2

Convert Rows into Columns by matching string in R

时间秒杀一切 posted on 2020-01-11 09:28:38
Question: I have a number of rows in a list like this:

     [1,] "Home"
     [2,] "A"
     [3,] "B"
     [4,] "C"
     [5,] "Home"
     [6,] "D"
     [7,] "E"
     [8,] "Home"
     [9,] "F"
    [10,] "G"
    [11,] "H"
    [12,] "I"

These rows arrive dynamically: after "Home" there can be two, three, four, five or more subcategories, so binding does not work. I have more than 5000 rows, and "Home" is common at the start of every set of subcategories. I want it to look like this:

         [,1]   [,2] [,3] [,4] [,5]
    [1,] "Home" "A"  "B"  "C"
    [2,] "Home" "D"  "E"
    [3,] "Home" "F"
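The regrouping described above can be sketched in base R. This assumes the values are available as a plain character vector `x` (for a one-column matrix `m`, `m[, 1]` would give that vector); the names `groups`, `width` and `result` are illustrative. Each "Home" starts a new group, and shorter groups are padded with `NA` so that all rows have the same length.

```r
# The twelve values from the question, as a character vector
x <- c("Home", "A", "B", "C", "Home", "D", "E", "Home", "F", "G", "H", "I")

# cumsum() increases by one at every "Home", giving a group id per element
groups <- split(x, cumsum(x == "Home"))

# Pad each group with NA up to the longest group, then stack the groups as rows
width <- max(lengths(groups))
result <- do.call(
  rbind,
  lapply(groups, function(g) c(g, rep(NA_character_, width - length(g))))
)
```

Because the group id comes from `cumsum()`, the number of subcategories after each "Home" can vary freely.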

Spread vs dcast

醉酒当歌 posted on 2020-01-09 19:35:52
Question: I have a table like this:

    > head(dt2)
      Weight Height   Fitted interval limit    value
    1   65.6  174.0 71.91200     pred   lwr 53.73165
    2   80.7  193.5 91.63237     pred   lwr 73.33198
    3   72.6  186.5 84.55326     pred   lwr 66.31751
    4   78.8  187.2 85.26117     pred   lwr 67.02004
    5   74.8  181.5 79.49675     pred   lwr 61.29244
    6   86.4  184.0 82.02501     pred   lwr 63.80652

I want it to look like this:

    > head(reshape2::dcast(dt2, Weight + Height + Fitted + interval ~ limit, fun.aggregate = mean))
      Weight Height Fitted interval lwr upr
    1   42.0  153.4
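As a rough tidyr counterpart to the `dcast()` call above, the sketch below aggregates with `mean()` first and then spreads `limit` into columns. `spread()` has no aggregation argument, so the averaging has to happen beforehand. The data here is hypothetical: the `lwr` rows reuse values from the question, while the `upr` values are invented because the original output is cut off.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data; the 'upr' values are invented for illustration
dt2 <- data.frame(
  Weight   = c(65.6, 65.6, 80.7, 80.7),
  Height   = c(174.0, 174.0, 193.5, 193.5),
  Fitted   = c(71.91200, 71.91200, 91.63237, 91.63237),
  interval = "pred",
  limit    = c("lwr", "upr", "lwr", "upr"),
  value    = c(53.73165, 90.0, 73.33198, 110.0),
  stringsAsFactors = FALSE
)

wide <- dt2 %>%
  # Same role as fun.aggregate = mean: one value per key combination
  group_by(Weight, Height, Fitted, interval, limit) %>%
  summarise(value = mean(value)) %>%
  ungroup() %>%
  # One column per level of 'limit'
  spread(limit, value)
```

When every key combination is already unique, the `group_by()` and `summarise()` steps change nothing and `spread(limit, value)` alone is enough.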

Reshape Data Long to Wide - understanding reshape parameters

久未见 posted on 2020-01-09 05:32:25
Question: I have a long-format data frame, dogs, that I'm trying to reformat to wide using the reshape() function. It currently looks like this:

    dogid month year trainingtype home school timeincomp
    12345     1 2014            1    1      1        340
    12345     2 2014            1    1      1        360
    31323    12 2015            2    7      3        440
    31323     1 2014            1    7      3        500
    31323     2 2014            1    7      3        520

The dogid column holds the ids, one for each dog. The month column varies from 1 to 12 for the 12 months, and year from 2014 to 2015. Trainingtype varies from 1 to 2. Each dog has a timeincomp
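To illustrate what the main `reshape()` parameters do, here is a hedged sketch on the five rows shown above. The intended wide layout is not visible in the truncated question, so this assumes one row per dog with `home` and `school` treated as constant per dog, and one set of columns per year and month. The helper column `period` and the result name `wide` are made up for the example.

```r
# The five rows from the question
dogs <- data.frame(
  dogid        = c(12345, 12345, 31323, 31323, 31323),
  month        = c(1, 2, 12, 1, 2),
  year         = c(2014, 2014, 2015, 2014, 2014),
  trainingtype = c(1, 1, 2, 1, 1),
  home         = c(1, 1, 7, 7, 7),
  school       = c(1, 1, 3, 3, 3),
  timeincomp   = c(340, 360, 440, 500, 520)
)

# reshape() takes a single time variable, so combine year and month into one key
dogs$period <- sprintf("%d_%02d", dogs$year, dogs$month)

wide <- reshape(
  dogs[, c("dogid", "home", "school", "period", "trainingtype", "timeincomp")],
  idvar     = c("dogid", "home", "school"),   # identify one output row
  timevar   = "period",                       # its values become column suffixes
  v.names   = c("trainingtype", "timeincomp"),# columns that vary over time
  direction = "wide"
)
```

The resulting columns are named like `timeincomp.2014_01`; periods that a dog has no row for are filled with `NA`.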

Separating a column in R using Regex & separate (tidyr)

僤鯓⒐⒋嵵緔 posted on 2020-01-06 08:20:50
Question: This is what I am looking to be able to do: https://regex101.com/r/KchccA/1

I want to match any characters between = and ) while also handling a null captured group, as I want all fields to be populated for every row. Example of a row: in this example Address4, County, and Contact name are null. You can also see that some fields have incorrect values. There is also some initial and ending text that I need to remove.

    x <- "Please enter an UT location before booking the order..

How can you gather() multiple columns at the same time in dplyr (R)?

怎甘沉沦 posted on 2020-01-06 06:48:55
Question: I am trying to gather untidy data from wide to long format. I have 748 variables that need to be condensed to approximately 30. In this post, I asked: how to tidy my wide data? The answer: use gather(). However, I am still struggling to gather multiple columns and was hoping you could pinpoint where I'm going wrong. Reproducible example:

    tb1 <- tribble(~x1,~x2,~x3,~y1,~y2,~y3,
                   1,NA,NA,NA,1,NA,
                   NA,1,NA,NA,NA,1,
                   NA,NA,1,NA,NA,1)
    # A tibble: 3 x 6
    #      x1    x2    x3 y1
    #   <dbl> <dbl> <dbl> <lgl>
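The desired output is cut off in the question, so the following is only a sketch under one assumption: that the numbered columns `x1` to `x3` and `y1` to `y3` should collapse into a single `x` column and a single `y` column, with the number kept as an index. It swaps `gather()` for `pivot_longer()` (tidyr 1.0.0 or later), whose `.value` sentinel handles several column families in one call. The names `id`, `index` and `long` are illustrative.

```r
library(dplyr)
library(tidyr)

# Reproducible example from the question
tb1 <- tribble(~x1, ~x2, ~x3, ~y1, ~y2, ~y3,
               1,  NA,  NA,  NA,   1,  NA,
               NA,   1,  NA,  NA,  NA,   1,
               NA,  NA,   1,  NA,  NA,   1)

long <- tb1 %>%
  mutate(id = row_number()) %>%          # remember which original row each value came from
  pivot_longer(
    -id,
    names_to      = c(".value", "index"),# letter -> output column, digit -> index
    names_pattern = "([a-z])(\\d)"
  )
```

With `gather()` alone, the same result would need one gather per column family followed by a join or a `separate()` and `spread()` step.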

R - Wrong error message - Error: Duplicate identifiers for rows [duplicate]

江枫思渺然 posted on 2020-01-06 06:16:12
Question: This question already has an answer here: How to spread columns with duplicate identifiers? (1 answer) Closed 2 years ago.

I have a problem with a data frame that I need to reshape. I have this command:

    library(tidyverse)
    df1 = df1 %>%
      gather(Day, value, Day01:Day31) %>%
      spread(Station, value)

And I get this error:

    Error: Duplicate identifiers for rows (130933, 131029), (389113, 389209), (647293, 647389), (905473, 905569), (1163653, 1163749), (1421833, 1421929), (1680013, 1680109), (1938193,
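This kind of error typically appears when several rows share the same combination of key columns after `gather()`, so `spread()` cannot decide which value belongs in a cell. A common workaround, sketched below on small hypothetical data (the real `df1` is not shown in the question), is to add a within-group counter so that each row becomes uniquely identified. The column name `obs` and the data values are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: station "S1" appears twice, so (Day, Station) is not unique
df1 <- data.frame(
  Station = c("S1", "S1", "S2"),
  Day01   = c(1, 2, 3),
  Day02   = c(4, 5, 6),
  stringsAsFactors = FALSE
)

long <- df1 %>% gather(Day, value, Day01:Day02)

wide <- long %>%
  group_by(Station, Day) %>%
  mutate(obs = row_number()) %>%   # 1, 2, ... within each Station/Day pair
  ungroup() %>%
  spread(Station, value)           # now (Day, obs) identifies each output row
```

Stations with fewer repeated observations than others get `NA` in the extra rows. If the duplicates are accidental, de-duplicating or summarising before `spread()` may be the better fix.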

Reformat and Collapse Data Frame Based on Corresponding Column Identifier Code R

别说谁变了你拦得住时间么 posted on 2020-01-06 06:09:07
Question: I'm trying to reshape a two-column data frame by collapsing the values that match in column 2 (in this case ticker symbols) into their own unique rows, while turning the contents of column 1, the data fields that correspond to those tickers, into their own columns. See below for my example with a small sample, since the full data frame has 500 tickers and 4 fields:

    # Closed End Fund Selector
    url<-"https://www.cefconnect.com/api/v3/DailyPricing?props=Ticker,Name