tidyr | 易学教程

How to pair rows in a data frame with many columns using dplyr in R?

Question: I have a data frame containing multiple observations from the control and the experimental cohorts, with replicates for each subject. Here is an example of my data frame:

    subject cohort  replicate val1 val2
    A       control 1         10   0.1
    A       control 2         15   0.3
    A       experim 1         40   0.7
    A       experim 2         45   0.9
    B       control 1         5    0.3
    B       experim 1         30   0.0
    C       control 1         50   0.5
    C       experim 1         NA   1.0

I'd like to pair each control observation with its corresponding experimental one for each value to calculate the ratio between the pairs. The
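The excerpt is cut off before it says how the pairs should be formed, so the following is only a minimal sketch under two assumptions: observations are paired by subject and replicate, and the ratio wanted is experimental over control. It uses tidyr's pivot_longer() and pivot_wider() to put the two cohorts side by side; the column names measure, value and ratio are introduced here for illustration.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
df <- tribble(
  ~subject, ~cohort,   ~replicate, ~val1, ~val2,
  "A",      "control", 1,          10,    0.1,
  "A",      "control", 2,          15,    0.3,
  "A",      "experim", 1,          40,    0.7,
  "A",      "experim", 2,          45,    0.9,
  "B",      "control", 1,          5,     0.3,
  "B",      "experim", 1,          30,    0.0,
  "C",      "control", 1,          50,    0.5,
  "C",      "experim", 1,          NA,    1.0
)

ratios <- df %>%
  # One row per subject / cohort / replicate / measured value
  pivot_longer(c(val1, val2), names_to = "measure", values_to = "value") %>%
  # Put control and experim side by side (assumed pairing: subject + replicate)
  pivot_wider(names_from = cohort, values_from = value) %>%
  # Assumed direction of the ratio: experimental over control
  mutate(ratio = experim / control)

print(ratios)
```

A missing value in either cohort propagates to the ratio as NA, which is usually what is wanted for incomplete pairs.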

How to gather then mutate a new column then spread to wide format again

Question: Using tidyr/dplyr, I have some factor columns which I'd like to Z-score, and then mutate an average Z-score, whilst retaining the original data for reference. I'd like to avoid using a for loop in tidyr/dplyr, so I'm gathering my data and performing my calculation (Z-score) on a single column. However, I'm struggling to restore the wide format. Here is an MWE:

    library(dplyr)
    library(tidyr)

    # Original Data
    dfData <- data.frame(
      Name = c("Steve","Jwan","Ashley"),
      A = c(10,20,12),
      B = c(0.2
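Because the MWE above is truncated, the sketch below does not complete it. It uses a small hypothetical data frame (scores, with made-up values) to show one possible way of going long, Z-scoring per variable, and returning to wide while keeping the raw values. It relies on pivot_longer() and pivot_wider(), the current replacements for gather() and spread(); the names raw, z and z_mean are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data; the values are illustrative only
scores <- data.frame(
  Name = c("P1", "P2", "P3"),
  A = c(10, 20, 12),
  B = c(1, 3, 2)
)

z_wide <- scores %>%
  # Long format: one row per Name / variable
  pivot_longer(-Name, names_to = "variable", values_to = "raw") %>%
  # Z-score within each variable
  group_by(variable) %>%
  mutate(z = (raw - mean(raw)) / sd(raw)) %>%
  ungroup() %>%
  # Back to wide, keeping both raw values and Z-scores (raw_A, raw_B, z_A, z_B)
  pivot_wider(names_from = variable, values_from = c(raw, z)) %>%
  # Average Z-score per row
  mutate(z_mean = rowMeans(cbind(z_A, z_B)))

print(z_wide)
```

Passing two columns to values_from is what keeps the original data next to the scores; with more variables, the rowMeans() step would need to cover every z_ column.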

Convert Rows into Columns by matching string in R

Question: I have a number of rows in a list like this:

    [1,] "Home"
    [2,] "A"
    [3,] "B"
    [4,] "C"
    [5,] "Home"
    [6,] "D"
    [7,] "E"
    [8,] "Home"
    [9,] "F"
    [10,] "G"
    [11,] "H"
    [12,] "I"

These rows arrive dynamically: after "Home" there can be two, three, four, five or more subcategories, so binding is not working. I have more than 5000 rows, and "Home" always marks the start of each group of subcategories. I want it to look like this:

         [,1]   [,2] [,3] [,4] [,5]
    [1,] "Home" "A"  "B"  "C"
    [2,] "Home" "D"  "E"
    [3,] "Home" "F"
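One way to approach this in base R is to start a new group at every "Home", split on that group index, and pad the shorter groups before binding. The sketch below assumes the input is available as a plain character vector x and that padding with NA (rather than empty strings) is acceptable; both are assumptions, not details from the question.

```r
# Input from the question, taken here as a character vector (assumption)
x <- c("Home", "A", "B", "C", "Home", "D", "E", "Home", "F", "G", "H", "I")

# Each "Home" starts a new group: 1 1 1 1 2 2 2 3 3 3 3 3
group_id <- cumsum(x == "Home")
groups <- split(x, group_id)

# Pad every group to the length of the longest one with NA
width <- max(lengths(groups))
padded <- lapply(groups, function(g) {
  length(g) <- width
  g
})

# One row per group
mat <- do.call(rbind, padded)
print(mat)
```

Because the group index is computed from the data, the same code handles any number of subcategories per "Home".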

Spread vs dcast

Question: I have a table like this:

    > head(dt2)
      Weight Height   Fitted interval limit    value
    1   65.6  174.0 71.91200     pred   lwr 53.73165
    2   80.7  193.5 91.63237     pred   lwr 73.33198
    3   72.6  186.5 84.55326     pred   lwr 66.31751
    4   78.8  187.2 85.26117     pred   lwr 67.02004
    5   74.8  181.5 79.49675     pred   lwr 61.29244
    6   86.4  184.0 82.02501     pred   lwr 63.80652

I want it to look like this:

    > head(reshape2::dcast(dt2, Weight + Height + Fitted + interval ~ limit, fun.aggregate = mean))
      Weight Height Fitted interval lwr upr
    1 42.0 153.4

Reshape Data Long to Wide - understanding reshape parameters

Question: I have a long-format data frame, dogs, that I'm trying to reformat to wide using the reshape() function. It currently looks like this:

    dogid month year trainingtype home school timeincomp
    12345 1     2014 1            1    1      340
    12345 2     2014 1            1    1      360
    31323 12    2015 2            7    3      440
    31323 1     2014 1            7    3      500
    31323 2     2014 1            7    3      520

The dogid column holds the ids, one for each dog. The month column varies from 1 to 12 for the 12 months, and year from 2014 to 2015. Trainingtype varies from 1 to 2. Each dog has a timeincomp

Separating a column in R using Regex & separate (tidyr)

Question: This is what I am looking to be able to do: https://regex101.com/r/KchccA/1 I want to match any characters between = and ) while also handling a null captured group, as I want all fields to be populated per row. Example of a row: In this example, Address4, County, and Contact name are null. You can also see how some have wrong or incorrect values. There's also some initial and ending text that I need to remove.

    x <- "Please enter an UT location before booking the order..

How can you gather() multiple columns at the same time in dplyr (R)?

Question: I am trying to gather untidy data from wide to long format. I have 748 variables that need to be condensed to approximately 30. In this post, I asked how to tidy my wide data. The answer: use gather(). However, I am still struggling to gather multiple columns and was hoping you could pinpoint where I'm going wrong. Reproducible example:

    tb1 <- tribble(~x1,~x2,~x3,~y1,~y2,~y3,
                   1,NA,NA,NA,1,NA,
                   NA,1,NA,NA,NA,1,
                   NA,NA,1,NA,NA,1)
    # A tibble: 3 x 6
    #      x1    x2    x3 y1    y2    y3
    #   <dbl> <dbl> <dbl> <lgl>
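The desired output is not visible in the excerpt, so the following is a hedged sketch of one plausible target: for each original row, record which x column and which y column is set. It uses pivot_longer() with names_pattern to split names such as x1 into a group (x) and an index (1), which is the usual way to gather several families of columns in one step. The helper columns row, group and index are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Reproducible example from the question
tb1 <- tribble(~x1, ~x2, ~x3, ~y1, ~y2, ~y3,
               1,   NA,  NA,  NA,  1,   NA,
               NA,  1,   NA,  NA,  NA,  1,
               NA,  NA,  1,   NA,  NA,  1)

long <- tb1 %>%
  mutate(row = row_number()) %>%
  # Split "x1" into group = "x" and index = "1"; drop the empty cells
  pivot_longer(-row,
               names_to = c("group", "index"),
               names_pattern = "([a-z]+)([0-9]+)",
               values_drop_na = TRUE)

# One row per original observation, one column per variable family
wide <- long %>%
  select(-value) %>%
  pivot_wider(names_from = group, values_from = index)

print(wide)
```

With hundreds of columns, the same pattern applies as long as the column names share a consistent prefix-plus-number scheme.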

R - Wrong error message - Error: Duplicate identifiers for rows [duplicate]

Question: This question already has an answer here: How to spread columns with duplicate identifiers? (1 answer) Closed 2 years ago.

I have a problem with a data frame that I need to reshape. I have this command:

    library(tidyverse)
    df1 = df1 %>% gather(Day, value, Day01:Day31) %>% spread(Station, value)

And I get this error:

    Error: Duplicate identifiers for rows (130933, 131029), (389113, 389209), (647293, 647389), (905473, 905569), (1163653, 1163749), (1421833, 1421929), (1680013, 1680109), (1938193,
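spread() raises this error when two or more rows share the same combination of identifier columns and key, so it cannot decide which value goes into a cell. The sketch below uses a small hypothetical long table (the Month column and all values are made up, since df1 is not shown) to illustrate two steps that may help: counting the combinations to locate the duplicates, and adding a per-group row number so that every row becomes uniquely identified before spreading.

```r
library(dplyr)
library(tidyr)

# Hypothetical long data with a duplicated Station/Day combination
long <- tibble(
  Month   = c("Jan", "Jan", "Jan"),
  Station = c("S1", "S1", "S2"),
  Day     = c("Day01", "Day01", "Day01"),
  value   = c(1, 2, 3)
)

# Step 1: find the combinations that occur more than once
dups <- long %>%
  count(Month, Station, Day) %>%
  filter(n > 1)
print(dups)

# Step 2: add an observation index within each combination, then spread
wide <- long %>%
  group_by(Month, Station, Day) %>%
  mutate(obs = row_number()) %>%
  ungroup() %>%
  spread(Station, value)
print(wide)
```

If the duplicates are genuine repeated measurements, summarising them (for example with mean()) before spreading is an alternative to keeping an obs column.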

Reformat and Collapse Data Frame Based on Corresponding Column Identifier Code R

Question: I'm trying to reshape a two-column data frame by collapsing the values that share a ticker symbol in column 2 into one row per ticker, while turning the contents of column 1, the data fields that correspond to those tickers, into columns of their own. See below for my example with a small sample, since the full data frame has 500 tickers and 4 fields:

    # Closed End Fund Selector
    url<-"https://www.cefconnect.com/api/v3/DailyPricing?props=Ticker,Name