tidyverse

Estimating multiple `lm` models and returning output in one table, with map()

ぃ、小莉子 submitted on 2020-01-24 12:32:55
Question: I need to estimate a number of linear models on the same dataset and put all the regression results into one table. For a reproducible example, here's a simplification using mtcars: formula_1 = "mpg ~ disp" formula_2 = "mpg ~ log(disp)" formula_3 = "mpg ~ disp + hp" Currently, my approach has been to: (1) create a list that contains all of the formulae; (2) use purrr::map() to estimate all of the lm models; (3) use stargazer:: to produce output tables. library(tidyverse) library(stargazer) formula_1 =
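The approach described in this question can be sketched as follows. This is a minimal illustration, not the poster's truncated code: the object names `formulas`, `models`, and `r_squared` are hypothetical, and the stargazer call is left as a comment because it assumes that package is installed.

```r
library(purrr)

# The three formulae from the question, kept as strings in a named vector
formulas <- c(
  formula_1 = "mpg ~ disp",
  formula_2 = "mpg ~ log(disp)",
  formula_3 = "mpg ~ disp + hp"
)

# Fit one lm per formula; map() returns a named list of fitted models
models <- map(formulas, function(f) lm(as.formula(f), data = mtcars))

# Pull one summary statistic per model into a named numeric vector
r_squared <- map_dbl(models, function(m) summary(m)$r.squared)
print(r_squared)

# If stargazer is installed, a list of models can be passed in one call:
# stargazer::stargazer(unname(models), type = "text")
```

Converting each string with `as.formula()` before calling `lm()` keeps the stored formula readable in each fitted model.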

Bootstrapping by multiple groups in the tidyverse: rsample vs. broom

前提是你 submitted on 2020-01-24 12:10:14
Question: In this SO question, bootstrapping by several groups and subgroups seemed easy using the broom::bootstrap function with the by_group argument set to TRUE. My desired output is a nested tibble with n rows, where the data column contains the bootstrapped data generated by each bootstrap call (and each group and subgroup has the same number of cases as in the original data). In broom I did the following: # packages library(dplyr) library(purrr) library(tidyr) library(tibble) library
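One way to get a nested tibble of group-wise resamples is sketched below. It uses neither broom nor rsample; it swaps in `dplyr::slice_sample()` (dplyr 1.0.0 or later) as a plain alternative. The helper `boot_by_group()` and the grouping columns `cyl` and `am` from mtcars are hypothetical stand-ins, since the question's own data is cut off.

```r
library(dplyr)
library(purrr)
library(tibble)

set.seed(1)

# Hypothetical helper: resample with replacement inside each group,
# so every group keeps its original number of rows
boot_by_group <- function(data, ...) {
  data %>%
    group_by(...) %>%
    slice_sample(prop = 1, replace = TRUE) %>%
    ungroup()
}

# Nested tibble: one row per bootstrap replicate, resampled data in `data`
n_boot <- 5
boots <- tibble(
  replicate = seq_len(n_boot),
  data = map(replicate, function(i) boot_by_group(mtcars, cyl, am))
)
print(boots)
```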

How to manipulate (aggregate) the data in R?

我与影子孤独终老i submitted on 2020-01-24 11:53:21
Question: I have a data set, shown below: df <- tribble( ~id, ~price, ~number_of_book, "1", 10, 3, "1", 5, 1, "2", 7, 4, "2", 6, 2, "2", 3, 4, "3", 4, 1, "4", 5, 1, "4", 6, 1, "5", 1, 2, "5", 9, 3, ) As you can see in the data set, for id "1" there are 3 books that cost 10 dollars each and 1 book that costs 5 dollars. Basically, I want to see the share (%) of the number of books in each price bin. Here is my desired data set: df <- tribble( ~id, ~less_than_three, ~three-five, ~five-six,
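A possible way to compute such shares is sketched here. The desired output in the question is cut off, so the bin edges (3, 5, 6) and the bin labels are assumptions made only for illustration; `shares` is a hypothetical name.

```r
library(dplyr)
library(tidyr)
library(tibble)

df <- tribble(
  ~id, ~price, ~number_of_book,
  "1", 10, 3,
  "1",  5, 1,
  "2",  7, 4,
  "2",  6, 2,
  "2",  3, 4,
  "3",  4, 1,
  "4",  5, 1,
  "4",  6, 1,
  "5",  1, 2,
  "5",  9, 3
)

shares <- df %>%
  # Assumed bins: (-Inf, 3], (3, 5], (5, 6], (6, Inf)
  mutate(bin = cut(
    price,
    breaks = c(-Inf, 3, 5, 6, Inf),
    labels = c("up_to_three", "three_to_five", "five_to_six", "over_six")
  )) %>%
  # Books per id and bin
  group_by(id, bin) %>%
  summarise(books = sum(number_of_book), .groups = "drop") %>%
  # Percentage share of books within each id
  group_by(id) %>%
  mutate(share = 100 * books / sum(books)) %>%
  ungroup() %>%
  select(-books) %>%
  # One column per bin; bins with no books get 0
  pivot_wider(names_from = bin, values_from = share, values_fill = 0)

print(shares)
```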

How to manipulate (aggregate) the data in R?

↘锁芯ラ submitted on 2020-01-24 11:51:39
Question: I have a data set, shown below: df <- tribble( ~id, ~price, ~number_of_book, "1", 10, 3, "1", 5, 1, "2", 7, 4, "2", 6, 2, "2", 3, 4, "3", 4, 1, "4", 5, 1, "4", 6, 1, "5", 1, 2, "5", 9, 3, ) As you can see in the data set, for id "1" there are 3 books that cost 10 dollars each and 1 book that costs 5 dollars. Basically, I want to see the share (%) of the number of books in each price bin. Here is my desired data set: df <- tribble( ~id, ~less_than_three, ~three-five, ~five-six,

Summarize data at different aggregate levels - R and tidyverse

烈酒焚心 submitted on 2020-01-24 03:21:06
Question: I'm creating a bunch of basic status reports, and one of the things I'm finding tedious is adding a total row to all my tables. I'm currently using the Tidyverse approach, and this is an example of my current code. What I'm looking for is an option to have a few different levels included by default. #load into RStudio viewer (not required) iris = iris #summary at the group level summary_grouped = iris %>% group_by(Species) %>% summarize(mean_s_length = mean(Sepal.Length), max_s_width = max(Sepal
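A total row can be appended by summarising twice and stacking the results, as in the sketch below. The helper `summarise_stats()` is hypothetical, and because the original code is cut off after `max(Sepal`, the use of `Sepal.Width` there is an assumption.

```r
library(dplyr)

# Hypothetical helper so the same statistics are used at every level;
# Sepal.Width is assumed, the original excerpt is truncated
summarise_stats <- function(data) {
  summarise(
    data,
    mean_s_length = mean(Sepal.Length),
    max_s_width = max(Sepal.Width)
  )
}

# Summary at the group level
summary_grouped <- iris %>%
  group_by(Species) %>%
  summarise_stats() %>%
  mutate(Species = as.character(Species))

# Summary over the whole data set, labelled as the total row
summary_total <- iris %>%
  summarise_stats() %>%
  mutate(Species = "Total")

summary_with_total <- bind_rows(summary_grouped, summary_total)
print(summary_with_total)
```

Converting `Species` to character before binding avoids mixing a factor column with the plain string "Total".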

tidyr: using mutate inside a function

落花浮王杯 submitted on 2020-01-23 06:47:05
Question: I'd like to use the mutate function from the tidyverse to create a new column based on an old column, using only a data frame and strings that represent column headers as inputs. I can get this to work without the tidyverse (see function f below), but I'd like to get it to work with the tidyverse (see function f.tidy below). Can someone please post a solution for adding this column using mutate called from inside a function? df <- data.frame('test' = 1:3, 'tcy' = 4:6) # test tcy # 1 4 #
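One common pattern for this is the `.data` pronoun together with `:=`, sketched below. The function `add_doubled()` and the doubling rule are hypothetical; the question's own functions f and f.tidy are cut off, so this does not reproduce them.

```r
library(dplyr)

# Hypothetical function: column names arrive as strings.
# .data[[old_col]] looks up the existing column by name;
# !!new_col := names the new column from a string.
add_doubled <- function(data, old_col, new_col) {
  data %>%
    mutate(!!new_col := .data[[old_col]] * 2)
}

df <- data.frame(test = 1:3, tcy = 4:6)
out <- add_doubled(df, "test", "test_x2")
print(out)
```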

How to create a variable that indicate where the data from in every list elements in r

a 夏天 submitted on 2020-01-22 02:16:05
Question: My question is the one in the title, and I found that Duck's question is the same as mine (How to create in R new variable for each element in a list of data frames with the name of data frame and its value equal to position of the element). With my limited knowledge of R, I can't fully understand the code there, though it does exactly what I want. I know my code doesn't run, but I thought it should look like this: # create a fake data df1 <- split(mtcars,mtcars$cyl) # add a new variable that indicate where the
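A short sketch of one way to do this uses `purrr::imap()`, which passes each list element together with its name. The column name `source` and the object `df1_tagged` are hypothetical; the question's own attempt is cut off.

```r
library(purrr)
library(dplyr)

# Fake data: a list of data frames named "4", "6" and "8"
df1 <- split(mtcars, mtcars$cyl)

# imap() calls the function with (element, name); the name is stored
# in a new column so each row records which list element it came from
df1_tagged <- imap(df1, function(d, nm) mutate(d, source = nm))

# Combine the tagged pieces back into a single data frame
combined <- bind_rows(df1_tagged)
print(table(combined$source))
```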

Show percent of total on top of geom_bar in ggplot2 while showing counts on y axis

佐手、 submitted on 2020-01-21 10:09:12
Question: I'm trying to create a bar plot with ggplot2 that shows counts on the y axis and also the percent of total on top of each bar. I've calculated the counts and percents of total, but can't figure out how to add the percent of total on top of the bars. I'm trying to use geom_text, but I can't get it to work. A minimal example: iris %>% group_by(Species) %>% summarize(count = n()) %>% mutate(percent = count/sum(count)) %>% ggplot(aes(x=Species, y=count)) + geom_bar(stat="identity") + geom_text(aes
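A minimal sketch of one way to label the bars follows. The question's `geom_text` call is cut off, so the `label` column, the `sprintf()` formatting, and the `vjust` value are assumptions chosen for illustration.

```r
library(dplyr)
library(ggplot2)

# Counts per species plus share of the total, formatted as a text label
plot_data <- iris %>%
  count(Species, name = "count") %>%
  mutate(
    percent = count / sum(count),
    label = sprintf("%.1f%%", 100 * percent)
  )

# Bar height is the count; the label sits just above each bar
p <- ggplot(plot_data, aes(x = Species, y = count)) +
  geom_col() +
  geom_text(aes(label = label), vjust = -0.5)

print(p)
```

`geom_col()` is equivalent to `geom_bar(stat = "identity")`, and a negative `vjust` moves the text above the top of each bar.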

Show percent of total on top of geom_bar in ggplot2 while showing counts on y axis

六月ゝ 毕业季﹏ submitted on 2020-01-21 10:09:07
Question: I'm trying to create a bar plot with ggplot2 that shows counts on the y axis and also the percent of total on top of each bar. I've calculated the counts and percents of total, but can't figure out how to add the percent of total on top of the bars. I'm trying to use geom_text, but I can't get it to work. A minimal example: iris %>% group_by(Species) %>% summarize(count = n()) %>% mutate(percent = count/sum(count)) %>% ggplot(aes(x=Species, y=count)) + geom_bar(stat="identity") + geom_text(aes

Show percent of total on top of geom_bar in ggplot2 while showing counts on y axis

一世执手 submitted on 2020-01-21 10:09:05
Question: I'm trying to create a bar plot with ggplot2 that shows counts on the y axis and also the percent of total on top of each bar. I've calculated the counts and percents of total, but can't figure out how to add the percent of total on top of the bars. I'm trying to use geom_text, but I can't get it to work. A minimal example: iris %>% group_by(Species) %>% summarize(count = n()) %>% mutate(percent = count/sum(count)) %>% ggplot(aes(x=Species, y=count)) + geom_bar(stat="identity") + geom_text(aes