tidyr | 易学教程

Excluding multiple columns based on unquote-splicing (!!!)

Question: I am trying to exclude multiple columns in a call to tidyr::gather(). The column names are passed to my function programmatically as a character vector argument (the output of shiny::selectInput) rather than via ... . How would I do that with tidy eval functionality? Since I pass multiple column names through a single function argument, I thought I needed to use !!! (unquote-splicing) instead of !!, as laid out in Programming with dplyr. But that doesn't seem to play nicely with tidyselect::vars_select(
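One possible way to approach this, sketched under assumptions rather than taken from the original thread: use pivot_longer() (the successor to gather()) and negate tidyselect::all_of(), which accepts a character vector directly, so no unquote-splicing is needed. The helper name reshape_except and the use of mtcars are hypothetical, chosen only for illustration.

```r
library(tidyr)

# Hypothetical helper: `exclude` is a character vector of column names,
# for example the value of a shiny::selectInput() with multiple = TRUE.
reshape_except <- function(data, exclude) {
  pivot_longer(
    data,
    cols = -tidyselect::all_of(exclude),  # every column except those named
    names_to = "key",
    values_to = "value"
  )
}

# Keep cyl and gear as identifier columns; stack the other nine columns.
long <- reshape_except(mtcars, c("cyl", "gear"))
head(long)
```

all_of() raises an error if a requested column is missing; tidyselect::any_of() is the lenient variant if missing names should be ignored.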

Combining Multiple Columns with Tidyr's Unite by Referencing Similar Column Names

Question:

library(tidyr)
library(dplyr)
library(tidyverse)

Below is the code for a simple data frame. I have some messy data that was exported with the categories of each factor spread out across different columns.

Client <- c("Client1", "Client2", "Client3", "Client4", "Client5")
Sex_M <- c("Male", "NA", "Male", "NA", "Male")
Sex_F <- c(" ", "Female", " ", "Female", " ")
Satisfaction_Satisfied <- c("Satisfied", " ", " ", "Satisfied", "Satisfied")
Satisfaction_VerySatisfied <- c(" ", "VerySatisfied", "VerySatisfied", " ", " ")

How do I select all unique combinations of two columns in an R data frame?

Question: I have a correlation matrix that I put in a data frame like so:

row | var1 | var2 | cor
1   | A    | B    | 0.6
2   | B    | A    | 0.6
3   | A    | C    | 0.4
4   | C    | A    | 0.4

Each result appears in two rows, once for each ordering of "var1" and "var2". I only need one of them, preferably the one with the lower variable first (e.g. rows 1 and 3). I've been playing with dplyr for two hours and reading old threads, but I'm not finding what I need.

# get correlation of every concept versus every concept
data.cor <- data.jobs
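A minimal base R sketch of one way to keep a single row per pair, using the four example rows from the question. It assumes every pair appears in both orders and that var1 and var2 are character columns; the object names cors and unique_pairs are made up for illustration.

```r
# Example data copied from the question's table.
cors <- data.frame(
  var1 = c("A", "B", "A", "C"),
  var2 = c("B", "A", "C", "A"),
  cor  = c(0.6, 0.6, 0.4, 0.4),
  stringsAsFactors = FALSE
)

# Keep only the ordering in which var1 sorts before var2.
# This also drops any self-pairs (var1 == var2), which is an assumption.
unique_pairs <- cors[cors$var1 < cors$var2, ]
unique_pairs
```

With dplyr attached, the same condition can be written as filter(cors, var1 < var2).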

group_by() into fill() not working as expected

Question: I'm trying to do a Last Observation Carried Forward operation on some poorly formatted data using dplyr and tidyr. It isn't working as I'd expect.

library(dplyr)
library(tidyr)

df <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                 email = c('bob@email.com', NA, 'joe@email.com', NA, NA, NA))
df2 <- df %>% group_by(id) %>% fill(email)

This results in:

Source: local data frame [6 x 2]
Groups: id [3]

     id         email
  (dbl)        (fctr)
1     1 bob@email.com
2     1 bob@email.com
3     2 joe@email.com
4     2 joe@email.com
5     3 joe@email.com
6     3
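For comparison, here is a sketch of the same pipeline under a recent tidyr release, where fill() is documented to operate within each group of a grouped data frame. The excerpt does not say which package versions produced the output above, so treat the version difference as an assumption; stringsAsFactors = FALSE is added here so that email is a character column.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  id = c(1, 1, 2, 2, 3, 3),
  email = c("bob@email.com", NA, "joe@email.com", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Fill downward within each id, then drop the grouping.
df2 <- df %>%
  group_by(id) %>%
  fill(email, .direction = "down") %>%
  ungroup()

df2
```

Because id 3 has no observed email, both of its rows should stay NA instead of inheriting the value from id 2.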

Using dplyr window functions to calculate percentiles

Question: I have a working solution but am looking for a cleaner, more readable one that perhaps takes advantage of some of the newer dplyr window functions. Using the mtcars dataset, if I want to look at the 25th, 50th, and 75th percentiles and the mean and count of miles per gallon ("mpg") by the number of cylinders ("cyl"), I use the following code:

library(dplyr)
library(tidyr)

# load data
data("mtcars")

# Percentiles used in calculation
p <- c(.25, .5, .75)

# old dplyr solution
mtcars %>% group_by

De-aggregate / reverse-summarise / expand a dataset in R

Question: My data looks like this:

data("Titanic")
df <- as.data.frame(Titanic)

How can I de-aggregate or reverse-summarise the count/frequency column and expand the data set back to its original, one-row-per-observation state? For instance, I want 3rd, Male, Child, No repeated 35 times and 1st, Female, Adult, Yes repeated 140 times, and so on, in the data frame. Thanks in advance.

Answer 1: Without packages, we can repeat each row according to the frequencies given:

df2 <- df[rep(1:nrow(df), df[,5]),-5]

Answer 2: You can do this
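The row-repetition idea from Answer 1 can be checked against the counts quoted in the question. The sketch below is a lightly restated version that refers to the Freq column by name instead of by position; the variable name n_3rd_male_child_no is introduced only for illustration.

```r
data("Titanic")
df <- as.data.frame(Titanic)

# Repeat each row index Freq times, then drop the Freq column.
df2 <- df[rep(seq_len(nrow(df)), df$Freq), names(df) != "Freq"]

# One row per passenger: the row count equals the total frequency.
nrow(df2) == sum(df$Freq)

# Spot-check the combination mentioned in the question.
n_3rd_male_child_no <- sum(
  df2$Class == "3rd" & df2$Sex == "Male" &
    df2$Age == "Child" & df2$Survived == "No"
)
n_3rd_male_child_no
```

Rows with a frequency of zero disappear, which is the intended behaviour when expanding counts back to observations.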

Tidyr how to spread into count of occurrence [duplicate]

Question: This question already has answers here: How do I get a contingency table? (6 answers); Faster ways to calculate frequencies and cast from long to wide (4 answers). Closed last year.

I have a data frame like this:

other = data.frame(name = c("a", "b", "a", "c", "d"), result = c("Y", "N", "Y", "Y", "N"))

How can I use the spread function in tidyr, or another function, to get the counts of result Y and N as column headers, like this?

name Y N
a    2 0
b    0 1

Thanks.

Answer 1: These are a few ways of many to go about it: 1) With library
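Since the answer text is cut off, here is an independent base R sketch (not necessarily one of the approaches the answer goes on to list) that builds the requested counts as a contingency table. The object name counts is an assumption made for illustration.

```r
other <- data.frame(
  name   = c("a", "b", "a", "c", "d"),
  result = c("Y", "N", "Y", "Y", "N")
)

# Cross-tabulate name against result; absent combinations count as 0.
counts <- table(name = other$name, result = other$result)

# Reorder the columns to match the Y, N layout asked for in the question.
counts <- counts[, c("Y", "N")]
counts
```

If a plain data frame is needed instead of a table object, as.data.frame.matrix(counts) converts it while keeping the wide layout.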

Unlisting columns by groups

tidyr pivot_longer: handling multiple observations and values per row [duplicate]

Question: This question already has answers here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed 14 days ago.

I have an Excel file I need to read that has multiple observations and values per row, with complicated names. It looks something like this when you load it in:

library(tidyverse)
library(janitor)

# An input table read from xlsx, with a format similar to this
input