tidyr

Excluding multiple columns based on unquote-splicing (!!!)

ぃ、小莉子 submitted on 2020-01-02 05:14:08
Question: I am trying to exclude multiple columns in a call to tidyr::gather(). The columns are passed to my function programmatically via a character vector argument (the output of shiny::selectInput) instead of via ... How would I do that with tidy eval functionality? Since I pass multiple column names via a single function argument, I thought I needed to use !!! (unquote-splicing) instead of !!, as laid out in Programming with dplyr. But that doesn't seem to play nicely with tidyselect::vars_select(
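One way to exclude columns named in a character vector is to negate a tidyselect helper rather than unquote-splice. The following is a minimal sketch, not the asker's code: the data frame `df` and the vector `exclude` are made up for illustration, and it assumes a tidyr version that provides pivot_longer() together with tidyselect's all_of().

```r
library(tidyr)

# Hypothetical data; in the question the names would come from shiny::selectInput().
df <- data.frame(
  id  = 1:2,
  grp = c("a", "b"),
  x   = c(1, 2),
  y   = c(3, 4)
)
exclude <- c("id", "grp")

# Reshape every column except those named in `exclude`.
long <- pivot_longer(
  df,
  cols      = -tidyselect::all_of(exclude),
  names_to  = "key",
  values_to = "value"
)
```

The same negated selection, -tidyselect::all_of(exclude), can be passed to gather() in place of the bare column names.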

Combining Multiple Columns with Tidyr's Unite by Referencing Similar Column Names

夙愿已清 submitted on 2019-12-30 10:46:20
Question:
library(tidyr)
library(dplyr)
library(tidyverse)
Below is the code for a simple dataframe. I have some messy data that was exported with column factor categories spread out in different columns.
Client <- c("Client1", "Client2", "Client3", "Client4", "Client5")
Sex_M <- c("Male", "NA", "Male", "NA", "Male")
Sex_F <- c(" ", "Female", " ", "Female", " ")
Satisfaction_Satisfied <- c("Satisfied", " ", " ", "Satisfied", "Satisfied")
Satisfaction_VerySatisfied <- c(" ", "VerySatisfied", "VerySatisfied", " ", " ")
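The question text is cut off before the desired output, so the following is only a sketch under an assumption: that the goal is one column per category (for example a single Sex column). It reuses the question's vectors for the Sex columns; the helper `blank_to_na` is hypothetical and not part of any package. It relies on unite() accepting na.rm = TRUE and on dplyr's across(), so it assumes reasonably recent tidyr and dplyr versions.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  Client = c("Client1", "Client2", "Client3", "Client4", "Client5"),
  Sex_M  = c("Male", "NA", "Male", "NA", "Male"),
  Sex_F  = c(" ", "Female", " ", "Female", " "),
  stringsAsFactors = FALSE
)

# Hypothetical helper: treat blanks and the literal string "NA" as missing.
blank_to_na <- function(x) {
  x <- trimws(x)
  x[x %in% c("", "NA")] <- NA_character_
  x
}

tidy <- df %>%
  mutate(across(starts_with("Sex_"), blank_to_na)) %>%
  # Merge all columns sharing the "Sex_" prefix, skipping missing values.
  unite("Sex", starts_with("Sex_"), na.rm = TRUE)
```

The same two steps could be repeated with starts_with("Satisfaction_") for the second group of columns.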

How do I select all unique combinations of two columns in an R data frame?

无人久伴 submitted on 2019-12-30 07:41:13
Question: I have a correlation matrix that I put in a dataframe like so:
row | var1 | var2 | cor
1   | A    | B    | 0.6
2   | B    | A    | 0.6
3   | A    | C    | 0.4
4   | C    | A    | 0.4
Each result is duplicated across two rows, one for each ordering of "var1" and "var2". I only need one, preferably with the lower variable first (e.g. rows 1 and 3). I've been playing with dplyr for two hours and reading old threads, but not finding what I need.
# get correlation of every concept versus every concept
data.cor <- data.jobs
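A minimal sketch of one approach, assuming every pair appears in both orders (as it does when a full correlation matrix is flattened): keep only the rows where var1 sorts before var2. The data frame `cors` below just re-creates the four rows shown in the question.

```r
library(dplyr)

cors <- data.frame(
  var1 = c("A", "B", "A", "C"),
  var2 = c("B", "A", "C", "A"),
  cor  = c(0.6, 0.6, 0.4, 0.4),
  stringsAsFactors = FALSE
)

# Keep the ordering with the "lower" variable first; this also drops any
# self-pairs (var1 == var2) if the diagonal is present.
unique_pairs <- cors %>%
  filter(var1 < var2)
```

If some pairs might appear in only one order, this filter would lose them; in that case, building sorted keys with pmin()/pmax() and calling distinct() on those keys is the more general route.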

group_by() into fill() not working as expected

喜你入骨 submitted on 2019-12-30 03:11:48
Question: I'm trying to do a Last Observation Carried Forward operation on some poorly formatted data using dplyr and tidyr. It isn't working as I'd expect.
library(dplyr)
library(tidyr)
df <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                 email = c('bob@email.com', NA, 'joe@email.com', NA, NA, NA))
df2 <- df %>% group_by(id) %>% fill(email)
This results in:
Source: local data frame [6 x 2]
Groups: id [3]
     id         email
  (dbl)        (fctr)
1     1 bob@email.com
2     1 bob@email.com
3     2 joe@email.com
4     2 joe@email.com
5     3 joe@email.com
6     3
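For comparison, here is a sketch of the grouped fill the question is aiming for. It assumes a recent tidyr release, where fill() respects the grouping set by group_by(); the output pasted in the question suggests that the version used there did not. The only change to the question's data is stringsAsFactors = FALSE, so that email is a character column.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id    = c(1, 1, 2, 2, 3, 3),
  email = c("bob@email.com", NA, "joe@email.com", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Carry the last observed email forward, but only within each id.
filled <- df %>%
  group_by(id) %>%
  fill(email, .direction = "down") %>%
  ungroup()
```

With group-aware filling, the rows for id 3 stay NA because that group has no observed value to carry forward.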

Using dplyr window functions to calculate percentiles

这一生的挚爱 submitted on 2019-12-29 10:14:35
Question: I have a working solution but am looking for a cleaner, more readable one that perhaps takes advantage of some of the newer dplyr window functions. Using the mtcars dataset, if I want to look at the 25th, 50th and 75th percentiles and the mean and count of miles per gallon ("mpg") by the number of cylinders ("cyl"), I use the following code:
library(dplyr)
library(tidyr)
# load data
data("mtcars")
# Percentiles used in calculation
p <- c(.25, .5, .75)
# old dplyr solution
mtcars %>% group_by
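The question's own code is cut off, so the following is an independent sketch of the summary it describes, not a reconstruction of the asker's solution. It computes the three percentiles, the mean and the count of mpg per cyl in a single summarise() call; the output column names (q25, q50, q75, avg, n) are chosen here for illustration.

```r
library(dplyr)

data("mtcars")

# One row per cylinder count, with the percentiles, mean and group size.
mpg_summary <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    q25 = quantile(mpg, 0.25),
    q50 = quantile(mpg, 0.50),
    q75 = quantile(mpg, 0.75),
    avg = mean(mpg),
    n   = n()
  )
```

Writing each percentile out explicitly keeps the result wide and readable; for a longer vector of probabilities, a long-format result reshaped with tidyr may be more convenient.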

De-aggregate / reverse-summarise / expand a dataset in R

一曲冷凌霜 submitted on 2019-12-29 07:41:58
Question: My data looks like this:
data("Titanic")
df <- as.data.frame(Titanic)
How can I de-aggregate or reverse-summarise count/freq and expand the data set back to its original non-count observation state? For instance, I want 3rd, Male, Child, No repeated 35 times and 1st, Female, Adult, Yes repeated 140 times, etc., in the dataframe. Thanks in advance.
Answer 1: Without packages we can repeat each row according to the frequencies given:
df2 <- df[rep(1:nrow(df), df[,5]),-5]
Answer 2: You can do this
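Separately from the truncated answer above, the same de-aggregation can be sketched with tidyr::uncount(), which repeats each row according to a weights column and then drops that column. This assumes a tidyr version that includes uncount(); the name `expanded` is chosen here for illustration.

```r
library(tidyr)

data("Titanic")
df <- as.data.frame(Titanic)

# Repeat each row Freq times; the Freq column is removed from the result.
expanded <- uncount(df, Freq)
```

Rows with a frequency of zero simply disappear, which matches the base R rep() approach in Answer 1.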

Tidyr how to spread into count of occurrence [duplicate]

喜欢而已 submitted on 2019-12-28 06:48:52
Question: This question already has answers here: How do I get a contingency table? (6 answers); Faster ways to calculate frequencies and cast from long to wide (4 answers). Closed last year.
I have a data frame like this:
other = data.frame(name = c("a", "b", "a", "c", "d"), result = c("Y", "N", "Y", "Y", "N"))
How can I use the spread function in tidyr, or another function, to get the counts of result Y and N as column headers, like this?
name Y N
a    2 0
b    0 1
Thanks
Answer 1: These are a few ways of many to go about it: 1) With library
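Independently of the truncated answer, one way to get this shape is to count the name/result pairs and then widen the counts. This is a sketch that uses pivot_wider(), the successor to spread(), and assumes a tidyr version where values_fill accepts a single value.

```r
library(dplyr)
library(tidyr)

other <- data.frame(
  name   = c("a", "b", "a", "c", "d"),
  result = c("Y", "N", "Y", "Y", "N")
)

# Count each name/result pair, then spread the counts into Y and N columns,
# filling combinations that never occur with 0.
counts <- other %>%
  count(name, result) %>%
  pivot_wider(names_from = result, values_from = n, values_fill = 0)
```

In base R, table(other$name, other$result) produces the same counts as a contingency table.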

Unlisting columns by groups

孤街浪徒 submitted on 2019-12-28 04:15:13
Question: I have a dataframe in the following format:
id | name               | logs
---+--------------------+-----------------------------------------
84 | "zibaroo"          | "C47931038"
12 | "fabien kelyarsky" | c("C47331040", "B19412225", "B18511449")
96 | "mitra lutsko"     | c("F19712226", "A18311450")
34 | "PaulSandoz"       | "A47431044"
65 | "BeamVision"       | "D47531045"
As you can see, each cell of the column "logs" holds a vector of strings. Is there an efficient way to convert the data frame to the long format (one observation
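Assuming "logs" is stored as a list-column, tidyr::unnest() gives one row per log entry. The sketch below rebuilds only the first three rows of the question's table for illustration, and assumes a tidyr version where unnest() takes a cols argument.

```r
library(tidyr)

df <- tibble::tibble(
  id   = c(84, 12, 96),
  name = c("zibaroo", "fabien kelyarsky", "mitra lutsko"),
  logs = list(
    "C47931038",
    c("C47331040", "B19412225", "B18511449"),
    c("F19712226", "A18311450")
  )
)

# Expand the list-column so that each log gets its own row;
# id and name are repeated as needed.
long <- unnest(df, cols = logs)
```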

tidyr pivot_longer: handling multiple observations and values per row [duplicate]

江枫思渺然 submitted on 2019-12-25 18:29:00
Question: This question already has answers here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed 14 days ago.
I have an Excel file I need to read that has multiple observations and values per row, with complicated names. It looks something like this when you load it in:
library(tidyverse)
library(janitor)
# An input table read from xlsx, with a format similar to this
input
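The question's input table is cut off, so the data below is entirely hypothetical: a wide table with paired columns obs_1/val_1 and obs_2/val_2. Under that assumption, the ".value" sentinel in pivot_longer() keeps each measurement type in its own column while stacking the numbered sets into rows.

```r
library(tidyr)

# Hypothetical wide input; the real column names in the question are not shown.
wide <- data.frame(
  id    = 1:2,
  obs_1 = c("a", "b"),
  val_1 = c(1, 2),
  obs_2 = c("c", "d"),
  val_2 = c(3, 4)
)

# Split each column name at "_": the first part becomes an output column
# (".value"), the second part becomes the "set" identifier.
long <- pivot_longer(
  wide,
  cols      = -id,
  names_to  = c(".value", "set"),
  names_sep = "_"
)
```

For messier names, names_pattern with a regular expression can replace names_sep, after cleaning the headers (for example with janitor::clean_names()).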