tidyverse | 易学教程

Estimating multiple `lm` models and returning output in one table, with map()

Question: I need to estimate a number of linear models on the same dataset and put all the regression results into one table. For a reproducible example, here's a simplification using `mtcars`: `formula_1 = "mpg ~ disp"`, `formula_2 = "mpg ~ log(disp)"`, `formula_3 = "mpg ~ disp + hp"`. Currently, my approach has been to: (1) create a list that contains all of the formulae; (2) use `purrr::map()` to estimate all of the `lm` models; (3) use `stargazer::` to produce output tables. `library(tidyverse)` `library(stargazer)` `formula_1 =`

Bootstrapping by multiple groups in the tidyverse: rsample vs. broom

Question: In this SO question, bootstrapping by several groups and subgroups seemed easy using the `broom::bootstrap` function with the `by_group` argument set to `TRUE`. My desired output is a nested tibble with n rows, where the data column contains the bootstrapped data generated by each bootstrap call (and each group and subgroup has the same number of cases as in the original data). In broom I did the following: `# packages` `library(dplyr)` `library(purrr)` `library(tidyr)` `library(tibble)` `library`

How to manipulate (aggregate) the data in R?

Question: I have a data set, shown below: `df <- tribble( ~id, ~price, ~number_of_book, "1", 10, 3, "1", 5, 1, "2", 7, 4, "2", 6, 2, "2", 3, 4, "3", 4, 1, "4", 5, 1, "4", 6, 1, "5", 1, 2, "5", 9, 3, )` As you can see, for id "1" there are 3 books that cost 10 dollars each and 1 book that costs 5 dollars. Basically, I want to see the share (%) of the number of books in each price bin. Here is my desired data set: `df <- tribble( ~id, ~less_than_three, ~three-five, ~five-six,`

Summarize data at different aggregate levels - R and tidyverse

Question: I'm creating a bunch of basic status reports, and one of the things I'm finding tedious is adding a total row to all my tables. I'm currently using the tidyverse approach, and this is an example of my current code. What I'm looking for is an option to have a few different levels included by default. `#load into RStudio viewer (not required)` `iris = iris` `#summary at the group level` `summary_grouped = iris %>% group_by(Species) %>% summarize(mean_s_length = mean(Sepal.Length), max_s_width = max(Sepal`

tidyr: using mutate inside a function

Question: I'd like to use the `mutate` function from the tidyverse to create a new column based on an old column, using only a data frame and strings that represent column headers as inputs. I can get this to work without the tidyverse (see function `f` below), but I'd like to get it to work with the tidyverse (see function `f.tidy` below). Can someone please post a solution for adding this column using `mutate` called from inside a function? `df <- data.frame('test' = 1:3, 'tcy' = 4:6)` `# test tcy # 1 4 #`

How to create a variable that indicates where the data comes from in every list element in R

Question: My question is what I said in the title, and I found that Duck's question is the same as mine (How to create in R new variable for each element in a list of data frames with the name of data frame and its value equal to position of the element). With my limited knowledge of R, I can't fully understand the code there, though it does produce what I wanted. I know my code can't run, but I thought it should look like this: `# create a fake data` `df1 <- split(mtcars,mtcars$cyl)` `# add a new variable that indicate where the`

Show percent of total on top of geom_bar in ggplot2 while showing counts on y axis

Question: I'm trying to create a bar plot with ggplot2 that shows counts on the y axis and also the percent of total on top of each bar. I've calculated the counts and percents of total, but can't figure out how to add the percents on top of the bars. I'm trying to use `geom_text`, but haven't been able to get it to work. A minimal example: `iris %>% group_by(Species) %>% summarize(count = n()) %>% mutate(percent = count/sum(count)) %>% ggplot(aes(x=Species, y=count)) + geom_bar(stat="identity") + geom_text(aes`
