purrr | 易学教程

Row-wise iteration like apply with purrr

Question: How do I achieve row-wise iteration using purrr::map? Here's how I'd do it with a standard row-wise apply: df <- data.frame(a = 1:10, b = 11:20, c = 21:30) lst_result <- apply(df, 1, function(x){ var1 <- (x[['a']] + x[['b']]) var2 <- x[['c']]/2 return(data.frame(var1 = var1, var2 = var2)) }) However, this is not very elegant, and I would rather do it with purrr. It may (or may not) be faster, too.

Answer 1: You can use pmap for row-wise iteration. The columns are used as the arguments of whatever

Using purrr::map to iterate linear model over columns in data frame

I am trying to do an exercise to become more familiar with how to use the map function in purrr. I am creating some random data (10 columns of 10 data points) and then want to use map to perform a series of regressions (i.e. lm(y ~ x, data = )) over the resulting columns in the data frame. If I just repeatedly use the first column as 'y', I want to perform 10 regressions with each column from 1 to 10 as 'x'. Obviously the results are unimportant; it's just the method. I want to end up with a list of 10 linear model objects. list_of_vecs <- list() for (i in 1:10){ list_of_vecs[[paste('vec_'

map a vector of characters to lm formula in r

I'm trying to make a list of lm objects using purrr::map. Using mtcars as an example: vars <- c('hp', 'wt', 'disp') map(vars, ~lm(mpg~.x, data=mtcars)) This gives the error: Error in model.frame.default(formula = mpg ~ .x, data = mtcars, drop.unused.levels = TRUE) : variable lengths differ (found for '.x') I also tried: map(vars, function(x) {x=sym(x); lm(mpg~!!x, data=mtcars)}) and got the error message: Error in !x : invalid argument type Can anyone tell me what I did wrong? Thanks in advance.

The usual way is to paste together formulas as strings, convert them by mapping as.formula (you can't make a vector of

Function for Tidy chisq.test Output for Visualizing or Filtering P-Values

Question: For data... library(productplots) library(ggmosaic) For code... library(tidyverse) library(broom) I'm trying to create tidy chisq.test output so that I can easily filter or visualize p-values. I'm using the "happy" dataset (which is included with either of the packages listed above). For this example, if I wanted to condition the "happy" variable on all other variables, I would isolate the categorical variables (I'm not going to create factor groupings out of age, year, etc., for this example),

Why is split inefficient on large data frames with many groups?

Question: df %>% split(.$x) becomes slow for a large number of unique values of x. If we instead split the data frame manually into smaller subsets and then perform split on each subset, we reduce the time by at least an order of magnitude. library(dplyr) library(microbenchmark) library(caret) library(purrr) N <- 10^6 groups <- 10^5 df <- data.frame(x = sample(1:groups, N, replace = TRUE), y = sample(letters, N, replace = TRUE)) ids <- df$x %>% unique folds10 <- createFolds(ids, 10) folds100 <-

Advice on usage of dplyr::do vs purrr::map, tidyr::nest, for predictions

I just came across the purrr package and I think it would help me with what I want to do, but I just can't put it together. I think this is going to be a long post, but it goes over a common use case that I think many others run into, so hopefully it is of use to them as well. This is what I'm aiming for: From one big dataset, run multiple models on each of the different subgroups. Have these models readily available so I can examine them for coefficients, accuracy, etc. From this saved model list for each of the different groupings, be able to apply the corresponding model to the

Handling vectors of different lengths in purrr

I currently have the following R code that runs multiple regression models with different predictors, across different subsets, and returns tidied output using the broom package. library(dplyr) library(purrr) library(broom) cars <- mtcars preds<-c("disp", "drat", "wt") model_fits <- map_df(preds, function(pred) { model_formula <- sprintf("mpg ~ %s", pred) cars %>% group_by(cyl) %>% do(tidy(lm(model_formula, data = .), conf.int = T)) %>% filter(term == pred) %>% mutate(outcome = "mpg") %>% select(outcome, cyl:estimate, starts_with("conf.")) }) This results in the following data frame: > model

R - Parallelizing multiple model learning (with dplyr and purrr)

Question: This is a follow-up to a previous question about learning multiple models. The use case is that I have multiple observations for each subject, and I want to train a model for each of them. See Hadley's excellent presentation on how to do this. In short, this is possible to do using dplyr and purrr like so: library(purrr) library(dplyr) library(fitdistrplus) dt %>% split(dt$subject_id) %>% map( ~ fitdist(.$observation, "norm")) So since the model building is an embarrassingly parallel task, I

why does map_if() not work within a list

Please help me: 1) Why does map_if not work within a list? 2) Is there a way to make it work? 3) If not, what are the alternatives? Thanks in advance. library(dplyr) library(purrr) cyl <- split(mtcars, mtcars$cyl) # This works map_if(mtcars, is.numeric, mean) # This does not work map_if(cyl, is.numeric, mean)

Because you need to map one level lower: the columns are at level 2. So you can do: map(cyl, ~map_if(., is.numeric, mean)) Or: map(cyl, map_if, is.numeric, mean) Without the if, one could do map_depth(cyl, 2, mean)

You can try lapply: lapply(cyl, function(x) map_if(x, is.numeric,

Use filter() (and other dplyr functions) inside nested data frames with map()

I'm trying to use map() from the purrr package to apply the filter() function to data stored in a nested data frame. "Why wouldn't you filter first, and then nest?" you might ask. That will work (and I'll show my desired outcome using that process), but I'm looking for ways to do it with purrr. I want to have just one data frame with two list-columns, both being nested data frames: one full and one filtered. I can achieve it now by performing nest() twice, once on all data and once on filtered data: library(tidyverse) df <- tibble( a = sample(x = rep(c('x','y'),5), size = 10), b = sample(c(1