tidyr

R: use tidyr to clean-up data table with structural missing and redundant data

余生颓废 submitted on 2019-12-11 20:25:34
Question: Still trying to get to grips with the tidyr package. If one has a data set with redundant rows like this:

    require(dplyr)
    require(tidyr)
    data <- data.frame(
      v1 = c("ID1", NA, "ID2", NA),
      v2 = c("x", NA, "xx", NA),
      v3 = c(NA, "z", NA, "zz"),
      v4 = c(22, 22, 6, 6),
      v5 = c(5, 5, 9, 9)) %>% tbl_df()

    > data
    Source: local data frame [4 x 5]
       v1 v2 v3 v4 v5
    1 ID1  x NA 22  5
    2  NA NA  z 22  5
    3 ID2 xx NA  6  9
    4  NA NA zz  6  9

Since the id variables v1 - v3 are split into redundant rows with many NAs (and therefore
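One possible way to collapse such rows is sketched below. The question is cut off before it states the desired output, so the goal assumed here is one row per record with v1 to v3 all filled in. The sketch also assumes each record occupies consecutive rows in the order shown. The name `collapsed` is introduced only for illustration.

```r
library(dplyr)
library(tidyr)

data <- data.frame(
  v1 = c("ID1", NA, "ID2", NA),
  v2 = c("x", NA, "xx", NA),
  v3 = c(NA, "z", NA, "zz"),
  v4 = c(22, 22, 6, 6),
  v5 = c(5, 5, 9, 9),
  stringsAsFactors = FALSE
)

collapsed <- data %>%
  fill(v1, v2) %>%       # carry v1 and v2 down into the rows where they are NA
  filter(!is.na(v3))     # keep only the rows that now hold all three id values

print(collapsed)
```

If the rows of a record are not adjacent, grouping by v4 and v5 and summarising the first non-missing value per column would be an alternative.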

Error in match.arg(p.adjust.method) : 'arg' must be NULL or a character vector

别等时光非礼了梦想. submitted on 2019-12-11 17:35:12
Question: Here is my data:

    mydat=structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
      group = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
      var = c(23L, 24L, 24L, 23L, 23L, 24L, 24L, 23L, 23L, 24L, 24L, 23L, 23L, 24L, 24L, 23L, 23L, 24L, 24L, 23L, 23L, 24L, 24L, 23L)),
      .Names = c("id", "group", "var"), class = "data.frame", row.names = c(NA, -24L))

I want to join two tables. id is

Convert long to wide with variable number of columns

元气小坏坏 submitted on 2019-12-11 17:29:50
Question: Given the following data in long form, I would like to create a wide dataset with one row for each srdr_id, and a separate column for each arm_name, as below.

Desired output:

    srdr_id c1  c2 c3
    174212  TAU MI MI
    172612  TAU MI

I've tried tidyr::spread() without success.

    dat <- structure(list(srdr_id = c("174212", "174212", "174212", "172612", "172612"),
      arm_name = c("TAU", "MI", "MI", "TAU", "MI")),
      class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L))

Following the first suggestion, I
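A minimal sketch of one way to get the desired shape: number the arms within each srdr_id, then widen on that counter. The helper column `col` and the result name `wide` are illustrative, and pivot_wider() is used in place of spread(), which assumes a tidyr version that provides it.

```r
library(dplyr)
library(tidyr)

dat <- tibble(
  srdr_id  = c("174212", "174212", "174212", "172612", "172612"),
  arm_name = c("TAU", "MI", "MI", "TAU", "MI")
)

wide <- dat %>%
  group_by(srdr_id) %>%
  mutate(col = paste0("c", row_number())) %>%   # c1, c2, ... within each id
  ungroup() %>%
  pivot_wider(names_from = col, values_from = arm_name)

print(wide)
```

Ids with fewer arms than the maximum get NA in the trailing columns, which matches the blank c3 cell in the desired output.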

group_by to select first two rows, then spread()

偶尔善良 submitted on 2019-12-11 16:57:15
Question: I'm trying to reformat this so I can generate a data frame of all instances of On Hold Begins and the event immediately after each one. On Hold Begins is the start of an event, and I'd like to capture its Timestamp and Deviation as well as the Timestamp and Deviation of the next event immediately after it (i.e. Below Thresold, Stage Enabled). If possible, I only want to include slices that have On Hold Begins as the first event (so the ideal solution would not include rows 1 & 2 above), do not

could not find function “spread” [closed]

情到浓时终转凉″ submitted on 2019-12-11 16:35:13
Question: Closed 2 years ago. This question is off-topic and is not currently accepting answers.

At the moment I am trying to figure out how to build a movie recommender system from MovieLens (https://grouplens.org/datasets/movielens/100k/). I read some instructions from a tutorial.

    library(dplyr)
    library(recommenderlab)
    library(magrittr)
    data <- read.table("u.data", header = FALSE, stringsAsFactors = TRUE)
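The excerpt attaches dplyr, recommenderlab and magrittr, but spread() is exported by tidyr. One plausible cause of the error is therefore that tidyr was never attached. This is an assumption, since the failing call itself is not shown. A minimal sketch with a made-up `ratings` data frame (not the MovieLens file):

```r
library(tidyr)  # spread() lives here, not in dplyr or magrittr

# Hypothetical stand-in for the user/item/rating columns of a ratings file.
ratings <- data.frame(
  user   = c(1, 1, 2),
  item   = c(10, 20, 10),
  rating = c(5, 3, 4)
)

# One row per user, one column per item; missing ratings become NA.
wide <- spread(ratings, key = item, value = rating)

print(wide)
```

Calling tidyr::spread() with the namespace prefix would also work without attaching the package.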

pivot_wider based on condition of a 0 or 1

痴心易碎 submitted on 2019-12-11 16:28:09
Question: I am trying to use pivot_wider on my data. The data looks like:

           dates yes_no
    1 2017-01-01      0
    2 2017-01-02      1
    3 2017-01-03      0
    4 2017-01-04      1
    5 2017-01-05      1

The expected output is:

           dates yes_no 2017-01-02_1 2017-01-04_1 2017-01-05_1
    1 2017-01-01      0            0            0            0
    2 2017-01-02      1            1            0            0
    3 2017-01-03      0            0            0            0
    4 2017-01-04      1            0            1            0
    5 2017-01-05      1            0            0            1

That is, the data is spread wherever the yes_no column contains a 1. This doesn't work for me:

    d %>% mutate(value_for_one_hot = 1) %>%
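A sketch of one way to reach the expected output: widen only the rows where yes_no is 1, then join the new indicator columns back onto the full data and fill the gaps with 0. The intermediate name `ones` and the helper columns `name` and `value` are illustrative. This is not a completion of the truncated attempt above.

```r
library(dplyr)
library(tidyr)

d <- tibble(
  dates  = as.Date("2017-01-01") + 0:4,
  yes_no = c(0, 1, 0, 1, 1)
)

# Indicator columns built only from the rows flagged with 1.
ones <- d %>%
  filter(yes_no == 1) %>%
  transmute(dates, name = paste0(dates, "_1"), value = 1) %>%
  pivot_wider(names_from = name, values_from = value, values_fill = 0)

# Attach them to every date; unflagged dates get 0 in each indicator.
wide <- d %>%
  left_join(ones, by = "dates") %>%
  mutate(across(ends_with("_1"), ~ replace_na(.x, 0)))

print(wide)
```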

Rearrange dataframe by subsetting and column bind [duplicate]

白昼怎懂夜的黑 submitted on 2019-12-11 16:13:55
Question: This question already has an answer here: Merging rows with the same ID variable [duplicate] (1 answer). Closed 3 years ago.

I have the following data frame:

    st <- data.frame(
      se = rep(1:2, 5),
      X = rnorm(10, 0, 1),
      Y = rnorm(10, 0, 2))
    st$xy <- paste(st$X,",",st$Y)
    st <- st[c("se","xy")]

but I want it to be the following:

    1 2 3 4 5
    -1.53697673029089 , 2.10652020463275 -1.02183940974772 , 0.623009466458354 1.33614674072657 , 1.5694345481646 0.270466789820086 , -0.75670874554064 -0
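The desired output is cut off, so the following is only a sketch under the assumption that each value of se should become one row, with its five xy strings spread across columns 1 to 5. The counter column `idx`, the result name `wide` and the seed are additions for illustration.

```r
library(dplyr)
library(tidyr)

set.seed(42)  # the original snippet is unseeded; added here for reproducibility
st <- data.frame(
  se = rep(1:2, 5),
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2)
)
st$xy <- paste(st$X, ",", st$Y)
st <- st[c("se", "xy")]

wide <- st %>%
  group_by(se) %>%
  mutate(idx = row_number()) %>%   # position of each xy value within its se group
  ungroup() %>%
  pivot_wider(names_from = idx, values_from = xy)

print(wide)
```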

Using tidyr's gather_

﹥>﹥吖頭↗ submitted on 2019-12-11 15:35:54
Question: Probably an easy one: I'd like to use tidyr's gather_ on this data.frame:

    set.seed(1)
    df <- data.frame(a=rnorm(10),b=rnorm(10),d=rnorm(10),id=paste0("id",1:10))

First, using gather:

    df %>% tidyr::gather(key=name,value=val,-id)

gives me the desired outcome. However, trying to match that with gather_ like this:

    df %>% tidyr::gather_(key_col="name",value_col="val",gather_cols="id")

doesn't give me what the gather usage does. Any idea?

Answer 1: I think you want: df %>% tidyr::gather_(key_col="name
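The mismatch likely comes from the meaning of gather_cols: it names the columns to gather, whereas `-id` in gather() excludes id. Passing "id" therefore stacks the id column itself. Since the answer above is truncated, the sketch below is not a reconstruction of it. It shows a string-based equivalent using pivot_longer(), assuming a tidyr version that provides it; `measure_cols` is an illustrative name.

```r
library(tidyr)

set.seed(1)
df <- data.frame(a = rnorm(10), b = rnorm(10), d = rnorm(10),
                 id = paste0("id", 1:10))

# Columns to stack: everything except the identifier.
measure_cols <- setdiff(names(df), "id")

# Reference result from the question's working call.
long_nse <- gather(df, key = name, value = val, -id)

# String-based equivalent with the current interface.
long_se <- pivot_longer(df, cols = all_of(measure_cols),
                        names_to = "name", values_to = "val")

# With the deprecated interface the same idea would presumably be:
# gather_(df, key_col = "name", value_col = "val", gather_cols = measure_cols)
```

The two results hold the same values, but the row order differs: gather() stacks column by column, while pivot_longer() goes row by row.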

R officer - Nest Dataframe Into Grouped List And Export Tables to Word With Group Headers

跟風遠走 submitted on 2019-12-11 15:14:34
Question: I read a similar question that helped me get to the point I'm at now, but I am struggling with the next step. I have a data frame similar to the one below:

    Product = c("Apple", "Apple", "Banana", "Banana", "Banana", "Carrot", "Carrot")
    Category = c(1, 2, 1, 2, 3, 1, 2)
    Slope = c(2.988, 2.311, 2.181, 6.387, 2.615, 7.936, 3.267)
    df = data.frame(Product, Category, Slope)

My objective is to have a Word report with a table for each product. To do this, I create a list with the data and flextables, as below

Reshape Data Frame Based on Corresponding Column's Identifier R

自作多情 submitted on 2019-12-11 08:43:48
Question: I'm trying to reshape a two-column data frame by collapsing the rows that share the same value in column 2 (in this case, ticker symbols) into a single row per ticker, while turning the contents of column 1 (the data fields that correspond to those tickers) into their own columns. See, for example, a small sample (the full data frame has 500 tickers and 4 fields):

    test22
                                      Ticker
    Current SharePrice $6.57          MFM
    Current NAV $7.11                 MFM
    Current Premium/Discount -7.59%   MFM
    52WkAvg SharePrice $6.55