tidyr

“Warning: too many (few) values” for using tidyr packages in R [duplicate]

冷暖自知 posted on 2019-12-24 09:18:30
Question: This question already has answers here: How do I deal with special characters like \^$.?*|+()[{ in my regex? (2 answers) Closed 2 years ago.

I have the following data set D7:

      name sex_age eye_color height
    1    J    M.34     Other     61
    2    A    F.55      Blue     59
    3    T    M.76     Brown     51
    4    D    F.19     Other     57

I want to separate the column sex_age into a sex column and an age column, so I type

    separate(D7, sex_age, c('sex', 'age'), sep = '.')

But it generates

      name sex age eye_color height
    1    J             Other     61
    2    A              Blue     59
    3    T             Brown     51
    4    D             Other
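For the separate() call in this question, the sep argument is interpreted as a regular expression, so an unescaped '.' matches any character rather than a literal dot. A minimal sketch of the escaped version follows; the data frame is rebuilt from the printed table, and convert = TRUE is an optional addition that is not in the original call.

```r
library(tidyr)

# Data rebuilt from the table printed in the question
D7 <- data.frame(
  name = c("J", "A", "T", "D"),
  sex_age = c("M.34", "F.55", "M.76", "F.19"),
  eye_color = c("Other", "Blue", "Brown", "Other"),
  height = c(61, 59, 51, 57)
)

# Escape the dot so it is matched literally; "[.]" would work as well.
# convert = TRUE (an assumption, not in the question) turns age into an integer.
D7_split <- separate(D7, sex_age, c("sex", "age"), sep = "\\.", convert = TRUE)
print(D7_split)
```

Writing sep = "[.]" is an equivalent alternative for readers who prefer to avoid the double backslash.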

Break summed row into individual rows in R

匆匆过客 posted on 2019-12-24 08:56:33
Question: I have a set of data where annual totals for specific values are stored in one row (observation). I'd like to tidy the data in R so that each total row is broken out by month using a simple equation (total/12), storing the annual total divided by 12 in each of 12 rows as a monthly total. I'm trying to do this in R, but I am very much a beginner and not quite sure where to start. An example is below:

    Date | Total
    2015 | 12,000

Some R function to convert to:

    Date       | Total
    01-01-2015 | 1,000
    02-01
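One possible approach to the annual-to-monthly split described here, sketched under assumptions: the column names Date and Total come from the example, the totals are taken to be numeric already (12000 rather than the string "12,000"), and the result stores real Date values rather than the 01-01-2015 text format shown in the question.

```r
library(dplyr)
library(tidyr)

# One row per year, as in the example (numeric total assumed)
annual <- data.frame(Date = 2015, Total = 12000)

monthly <- annual %>%
  # Pair every annual row with the months 1..12
  crossing(month = 1:12) %>%
  mutate(
    # First day of each month in the given year
    Date = as.Date(sprintf("%d-%02d-01", as.integer(Date), month)),
    # Spread the annual total evenly across the 12 months
    Total = Total / 12
  ) %>%
  select(Date, Total)

print(monthly)
```

If a particular text layout is needed for display, format(monthly$Date, "%m-%d-%Y") can be applied afterwards.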

Why does complete() create duplicate rows in my data?

陌路散爱 posted on 2019-12-24 08:38:11
Question: When I use the complete() function to fill in rows in my data that have no cases, I find it creates many duplicate rows as well. These can be removed with the unique() function, but I want to understand how I can avoid generating all these extra rows in the first place.

    library(dplyr)
    library(tidyr)

    # An incomplete table
    mtcars %>%
      group_by(vs, cyl) %>%
      count()

    # complete() creates a table with many duplicate rows
    temp <- mtcars %>%
      group_by(vs, cyl) %>%
      count() %>%
      complete(vs = c(0, 1),
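A likely explanation for the duplicates, offered as a sketch rather than a confirmed diagnosis: count() on a grouped data frame returns a result that is still grouped by vs and cyl, and complete() then operates within each group. The exact behavior depends on the tidyr version. Counting without group_by() (or calling ungroup() first) avoids the issue; the fill value below is an added assumption.

```r
library(dplyr)
library(tidyr)

counts <- mtcars %>%
  # count(vs, cyl) returns an ungrouped table, unlike group_by() %>% count()
  count(vs, cyl) %>%
  # Add every vs/cyl combination seen in the data; missing counts become 0
  complete(vs, cyl, fill = list(n = 0L))

print(counts)
```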

Arithmetic operation based on value from another column

偶尔善良 posted on 2019-12-24 07:16:39
Question: I have a data frame with a value column for multiple years. The years might not follow a sequence and might have a missing 5th year. Here is an example data frame:

    df = data.frame(code = c("AFG", "AGO", "ALB", "AND", "ARB", "ARE", "ARG", "ARM", "ASM", "ATG", "AUS", "AUT", "AUT", "AUT", "AUT", "ABW", "AFG", "AGO", "ALB", "AND", "ARB", "ARE", "ARG", "ARM", "ARM"),
                    PPT = c(123, 42, 23, 5, 42, 4, 23, 25, 42, 23, NA, 5563, 56, 54, 645, 6, 4, 53, 656, 65, 5563, 646, 6, 66, 54),
                    Year = c(1990, 1991, 1992

replacing the nth character in a string only if it is a particular character in R

我只是一个虾纸丫 posted on 2019-12-24 06:31:23
Question: I am importing a series of surveys as .csv files and combining them into one data set. The problem is that for one of the seven files, some of the variables are imported slightly differently. The data set is huge, and I would like to write a function to run over the data set that is giving me trouble. In some of the variables there is an underscore where there should be a dot. Not all variables are of the same format, but the ones that are incorrect are, in that the underscore is always the 6th
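Replacing a character at a fixed position only when it is an underscore can be sketched with a base R regular expression. The strings below are hypothetical, and the sketch assumes the underscore to fix is the 6th character of each string.

```r
# Hypothetical names: only the first has an underscore in position 6
vars <- c("q1_2a_total", "abc_defgh", "q1_2a.total")

# ^(.{5})_ matches exactly five leading characters followed by an underscore;
# the replacement keeps those five characters and writes a dot instead.
# Strings whose 6th character is not an underscore are left unchanged.
fixed <- sub("^(.{5})_", "\\1.", vars)
print(fixed)
```

If the strings in question are column names, the same call could be applied as names(df) <- sub("^(.{5})_", "\\1.", names(df)), where df stands in for the imported data frame.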

pivot_wider issue “Values in `values_from` are not uniquely identified; output will contain list-cols”

扶醉桌前 posted on 2019-12-24 03:43:28
Question: My data looks like this:

    # A tibble: 6 x 4
      name          val time          x1
      <chr>       <dbl> <date>     <dbl>
    1 C Farolillo     7 2016-04-20  51.5
    2 C Farolillo     3 2016-04-21  56.3
    3 C Farolillo     7 2016-04-22  56.3
    4 C Farolillo    13 2016-04-23  57.9
    5 C Farolillo     7 2016-04-24  58.7
    6 C Farolillo     9 2016-04-25  59.0

I am trying to use the pivot_wider function to expand out the data based on the name column. I use the following code:

    yy <- d %>% pivot_wider(., names_from = name, values_from = val)

Which gives me the following
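pivot_wider() emits this warning when two or more rows share the same combination of identifying columns and names_from value, so a single cell would have to hold several values. Whether that is what happens in the asker's full data cannot be told from the excerpt. The sketch below uses made-up data with a deliberate duplicate and resolves it by aggregating with values_fn; choosing mean as the aggregate is an assumption.

```r
library(tidyr)

# Hypothetical data: name "A" appears twice for the same time
d <- data.frame(
  name = c("A", "A", "B"),
  time = as.Date(c("2016-04-20", "2016-04-20", "2016-04-20")),
  val = c(7, 3, 13)
)

# Without values_fn the A column would be a list-column holding c(7, 3).
# Here duplicates are collapsed to their mean instead.
wide <- pivot_wider(d, names_from = name, values_from = val,
                    values_fn = list(val = mean))
print(wide)
```

When the duplicates are meaningful and should be kept as separate rows, adding a per-group row identifier before pivoting is another common option.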

rearrange data: convert from water year to calendar year

别来无恙 posted on 2019-12-24 01:14:22
Question: I have a table with data from a stream gauge arranged like this:

      Water.Year   May   Jun   Jul   Aug    Sep    Oct    Nov   Dec   Jan   Feb   Mar   Apr
    1  1953-1954 55.55 43.62 30.46 26.17  26.76  41.74  19.92 41.25 28.77 20.96 12.47 10.51
    2  1954-1955 23.49 81.35 46.71 29.33  67.83 133.30  37.62 30.16 21.07 19.38 13.87 10.63
    3  1955-1956  9.87 51.59 55.36 63.03 154.08  98.15 104.06 32.85 22.89 17.30 15.68 10.88

    > data <- structure(list(Water.Year = structure(1:6, .Label = c("1953-1954", "1954-1955", "1955-1956", "1956-1957",

dplyr and tidyr: convert long to wide format and arrange columns

人走茶凉 posted on 2019-12-24 00:57:38
Question: I'm creating a shiny app in which the user will upload a .csv file that contains several variables. Using dplyr, I will select the first four variables, shown below, and convert them from long format.

DATA

    df <- read.table(text = c("
    Customer Rate Factor Power
    W1 6 TK1 5
    W2 3 TK1 0
    W3 1 TK1 0
    W4 2 TK1 0
    W5 4 TK1 0
    W6 8 TK1 0
    W7 5 TK1 0
    W8 7 TK1 3
    W1 6 TK2 0
    W2 3 TK2 1
    W3 1 TK2 0
    W4 2 TK2 5
    W5 4 TK2 0
    W6 8 TK2 0
    W7 5 TK2 0
    W8 7 TK2 3
    W1 6 TK3 0
    W2 3 TK3 5
    W3 1 TK3 1
    W4 2 TK3 0
    W5 4 TK3 0
    W6 8

Tidyr's gather() with NAs

拥有回忆 posted on 2019-12-23 19:07:32
Question: I am using tidyr and lubridate to convert a wide table to a long table. The following works just fine.

    > (df <- data.frame(hh_id = 1:2, bday_01 = ymd(20150309), bday_02 = ymd(19850911), bday_03 = ymd(19801231)))
      hh_id    bday_01    bday_02    bday_03
    1     1 2015-03-09 1985-09-11 1980-12-31
    2     2 2015-03-09 1985-09-11 1980-12-31
    > gather(df, person_num, bday, starts_with("bday_0"))
      hh_id person_num       bday
    1     1    bday_01 2015-03-09
    2     2    bday_01 2015-03-09
    3     1    bday_02 1985-09-11
    4     2    bday_02 1985-09-11
    5     1    bday_03

tidyr:Pivot_wider replace values with data type

冷暖自知 posted on 2019-12-23 16:36:45
Question: I have a data frame in which both the rows and the columns contain variables, so I am trying to use pivot_wider to tidy the data. My data looks like the following:

    head(df)
    # A tibble: 6 x 4
      State    Year Var                                                           X
      <chr>   <dbl> <chr>                                                     <dbl>
    1 ALABAMA  2001 APPALACHIAN REGIONAL COMMISSION (ARC)                   3048031
    2 ALABAMA  2001 CORPORATION FOR NATIONAL AND COMMUNITY SERVICE (CNCS)   1765835
    3 ALABAMA  2001 DEPARTMENT OF AGRICULTURE (USDA)                      282530429
    4 ALABAMA  2001 DEPARTMENT OF COMMERCE (DOC)                           17838084
    5