tidyr | 易学教程

Preserve order of columns when going from wide to long format

Question: I'm trying to preserve the order of columns when I gather them from wide to long format. The problem is that the order is lost after I gather and summarize. The number of columns is huge, so I don't want to type the order manually. Here's an example:

    library(tidyr)
    library(dplyr)
    N <- 4
    df <- data.frame(sample = c(1,1,2,2), y1.1 = rnorm(N), y2.1 = rnorm(N), y10.1 = rnorm(N))
    > df
      sample      y1.1      y2.1      y10.1
    1      1  1.040938 0.8851727 -0.3617224
    2      1  1.175879 1.0009824 -1.1352406
    3      2 -1.501832 0
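One possible approach, sketched here under assumptions rather than taken from the original thread: `gather()` accepts `factor_key = TRUE`, which stores the key column as a factor whose levels follow the original column order. A later `group_by()` and `summarise()` then sorts by those levels instead of alphabetically, so `y10.1` stays after `y2.1`. The `set.seed()` call and the names `long`, `res`, and `mean_value` are illustrative.

```r
library(tidyr)
library(dplyr)

set.seed(1)  # assumption: only added so the sketch is reproducible
N <- 4
df <- data.frame(sample = c(1, 1, 2, 2),
                 y1.1 = rnorm(N), y2.1 = rnorm(N), y10.1 = rnorm(N))

# factor_key = TRUE keeps the wide column order as the factor levels of `key`
long <- gather(df, key, value, -sample, factor_key = TRUE)

# Grouping on a factor sorts by its levels, so the column order survives
res <- long %>%
  group_by(sample, key) %>%
  summarise(mean_value = mean(value), .groups = "drop")
```

With a plain character key, the summary would instead be sorted alphabetically (`y1.1`, `y10.1`, `y2.1`).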

How to specify multiple columns with gather() function to tidy data

Question: I want to tidy my data with the gather function, but how do I specify multiple columns at once? Say this is my data:

       Country Country.Code Year X0tot4 X5tot9 X10tot14 X15tot19 X20tot24
    1 Viet Nam          704 1955   4606   2924     2389     2340     2502
    2 Viet Nam          704 1960   5842   4410     2860     2356     2318
    3 Viet Nam          704 1965   6571   5646     4328     2823     2335
    4 Viet Nam          704 1970   7065   6391     5548     4271     2797
    5 Viet Nam          704 1975   7658   6862     6237     5437     4208
    6 Viet Nam          704 1980   7991   7473     6754     6113     5266
    7 Viet Nam          704 1985   8630   7855     7375     6657     6027
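As a hedged illustration of one way to do this (the excerpt is cut off, so the desired output is an assumption): `gather()` accepts a column range such as `X0tot4:X20tot24`, which selects every column between the two names. The output column names `age_group` and `population` are hypothetical, and only the first two rows of the data are used.

```r
library(tidyr)

# First two rows of the data shown in the question
df <- data.frame(
  Country = "Viet Nam", Country.Code = 704, Year = c(1955, 1960),
  X0tot4 = c(4606, 5842), X5tot9 = c(2924, 4410), X10tot14 = c(2389, 2860),
  X15tot19 = c(2340, 2356), X20tot24 = c(2502, 2318)
)

# A name range gathers all five age-group columns in one call;
# Country, Country.Code and Year are kept as identifier columns
long <- gather(df, key = "age_group", value = "population", X0tot4:X20tot24)
```

A selection helper such as `starts_with("X")` in place of the range should select the same columns here.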

Convert categorical column to multiple binary columns [duplicate]

Question: This question already has answers here: Generate a dummy-variable (16 answers). Closed 2 years ago. I would like to convert this column into binary columns, one per breed (1 if the dog is that breed, 0 if it is not).

Answer 1: One way could be to use unique with a for loop:

    Breed = c( "Sheetland Sheepdog Mix", "Pit Bull Mix", "Lhasa Aposo/Miniature", "Cairn Terrier/Chihuahua Mix", "American Pitbull", "Cairn Terrier", "Pit Bull Mix" )
    df = data.frame(Breed)
    for (i in unique(df$Breed)){
      df[,paste0(i)]
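The answer's code is truncated, so the following is only a sketch of the general idea it describes (loop over the unique values and add one 0/1 column per value), not a reconstruction of the original. The loop variable `b` and the use of `[[<-` are choices made for this illustration.

```r
Breed <- c("Sheetland Sheepdog Mix", "Pit Bull Mix", "Lhasa Aposo/Miniature",
           "Cairn Terrier/Chihuahua Mix", "American Pitbull", "Cairn Terrier",
           "Pit Bull Mix")
df <- data.frame(Breed)

# One indicator column per distinct breed: 1 where the row matches, 0 otherwise
for (b in unique(df$Breed)) {
  df[[b]] <- as.integer(df$Breed == b)
}
```

Note that the column is named `Breed` with a capital B; `df$breed` would return `NULL` because `$` is case sensitive.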

R: How to split a string into values and map the resultant broken pieces as columns to the dataset? [duplicate]

Question: This question already has answers here: Split a column of concatenated comma-delimited data and recode output as factors (2 answers). Closed 2 years ago. As shown in the picture above, I have a column, genres, listing the genres the corresponding movie belongs to. There are 19 unique genres in total. I'd like to know if I can manipulate this data by appending 19 columns to the data set, one for each genre identifier, and label the corresponding cells as 0 or 1 indicating

Copy column data when function unaggregates a single row into multiple in R

Question: I need help taking an annual total (for each of many initiatives) and breaking it down by month using a simple division formula. I need to do this for each distinct combination of a few columns, while copying down the other columns as each annual row is split into monthly rows. The loop will apply the formula to two columns and loop through each distinct group in a vector. I have tried to explain with an example below, as it's somewhat complex. What I have: | Init | Name | Date | Total Savings

filtering observations from time series conditionally by group

Question: I have a df (“df”) containing multiple time series (value ~ time) whose observations are grouped by 3 factors: temp, rep, and species. These data need to be trimmed at the lower and upper ends of the time series, but the threshold values depend on the group (e.g. remove observations below 2 and above 10 where temp = 10, rep = 2, and species = “A”). I have an accompanying df (df_thresholds) that contains the grouping values and the mins and maxes I want to use for each group. Not all groups need
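A minimal sketch of one way to approach this, assuming the thresholds apply to the `time` column and that groups missing from `df_thresholds` should be left untrimmed (the excerpt is cut off before this is stated). The data values and the column names `t_min` and `t_max` are hypothetical.

```r
library(dplyr)

# Hypothetical data; only group (10, 2, "A") has thresholds
df <- data.frame(
  temp    = c(10, 10, 10, 20, 20),
  rep     = c(2, 2, 2, 1, 1),
  species = c("A", "A", "A", "B", "B"),
  time    = c(1, 5, 12, 1, 15),
  value   = c(0.1, 0.5, 0.9, 0.2, 0.8),
  stringsAsFactors = FALSE
)
df_thresholds <- data.frame(temp = 10, rep = 2, species = "A",
                            t_min = 2, t_max = 10, stringsAsFactors = FALSE)

trimmed <- df %>%
  # Attach each group's thresholds; groups without any get NA
  left_join(df_thresholds, by = c("temp", "rep", "species")) %>%
  # Keep untrimmed groups whole, otherwise keep rows inside [t_min, t_max]
  filter(is.na(t_min) | (time >= t_min & time <= t_max)) %>%
  select(-t_min, -t_max)
```

The `is.na(t_min)` test is what lets groups without thresholds pass through unchanged; an `inner_join()` would drop them instead.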

R: How to pivot and count data.frame (ex: list of medical conditions and the number of patients with each)

Question: I'm trying to get better with dplyr and tidyr, but I'm not used to "thinking in R". An example may be best. The table I've generated from my data in SQL looks like this:

╔═══════════╦════════════╦═════╦════════╦══════════════╦══════════╦══════════════╗
║ patientid ║ had_stroke ║ age ║ gender ║ hypertension ║ diabetes ║ estrogen HRT ║
╠═══════════╬════════════╬═════╬════════╬══════════════╬══════════╬══════════════╣
║ 934988    ║ 1          ║ 65  ║ M      ║ 1            ║ 1        ║ 0            ║
║ 94044     ║ 0          ║ 69  ║ F      ║ 1            ║ 0        ║ 0            ║
║ 689348
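Going by the title (a list of conditions with the number of patients who have each), one hedged sketch is to gather the condition columns into long format and sum the 0/1 flags per condition. Only the two complete rows from the table are used, the column `estrogen HRT` is renamed `estrogen_HRT` for convenience, and the names `condition`, `present`, and `n_patients` are illustrative.

```r
library(dplyr)
library(tidyr)

# The two complete rows visible in the question's table
patients <- data.frame(
  patientid = c(934988, 94044), had_stroke = c(1, 0), age = c(65, 69),
  gender = c("M", "F"), hypertension = c(1, 1), diabetes = c(1, 0),
  estrogen_HRT = c(0, 0)
)

counts <- patients %>%
  # One row per patient and condition
  gather(condition, present, hypertension:estrogen_HRT) %>%
  group_by(condition) %>%
  # Summing the 0/1 flags counts the patients with each condition
  summarise(n_patients = sum(present))
```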

Extract emotions calculation for every row of a dataframe

Question: I have a dataframe with rows of text. For each row of text, I would like to extract a binary vector of specific emotions: 0 if the emotion is not present, 1 if it is. There are 5 emotions in total, but I would like a 1 only for the emotion that seems to be the strongest. Example of what I have tried:

    library(tidytext)
    text = data.frame(id = c(11,12,13), text=c("bad movie","good movie","I think it would benefit religious people to see things like this, not just to learn about our

Tidying in R: how to collapse my binary columns into characters, based on vectors?

Question: I am tidying my data in R and want to turn multiple columns into one, using a function that iterates over the items of a vector. Could you help me fix a semantic error and make my code more efficient? My data is based on a survey with 32 questions. Each question has multiple answers. Each answer is a column, with the options 1 and NA. For one question, a section of the dataset can be reproduced as follows:

    XV2_1 <- c(1,NA,NA,NA)
    XV2_2 <- c(NA,1,NA,NA)
    XV2_3 <- c

Fill count/sum based on previous row count over time series

Question: I have counted events (in Group 1) over a time period for each group (in Group 2). I am looking to spread the Group 1 events into separate columns, using Group 2 and the timestamp as rows. Each cell will contain the count of events over a time period (from the present date back to the previous 4 days). See the example below: for each Group 2 value (I and II), I counted how many times Events A and L in Group 1 happened within 4 days.

    dates = as.Date(c("2011-10-09", "2011-10-15", "2011-10-16", "2011-10-18", "2011-10