purrr

Saving deeply nested files to specific directories with specific filenames

风格不统一 submitted on 2019-12-23 04:35:07
Question: Given a three-level nested list:

    mylist <- list("1000"=list("cars"=list("fast"=mtcars[1:10,], "slow"=mtcars[11:15,]), "flower"=iris),
                   "2000"=list("tooth"=ToothGrowth, "air"=airquality,
                               "cars"=list("cruiser"=mtcars[5:12,], "fast"=mtcars[1:3,], "mild"=mtcars[9:18,])))

(i.e., mylist$`1000`$cars$fast, where fast is a data frame, and cars and 1000 are nested lists in mylist) I'd like to save each innermost data frame (e.g., fast) as a .csv with the data frame's name as its file name, e.g., fast.csv, and I want the
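The excerpt is cut off before it states the target directory layout, so the following is only a minimal sketch under one assumption: that the folders should mirror the list nesting (for example `1000/cars/fast.csv`). The helper name `save_nested` and the output folder under `tempdir()` are hypothetical.

```r
library(purrr)

mylist <- list(
  "1000" = list("cars" = list("fast" = mtcars[1:10, ], "slow" = mtcars[11:15, ]),
                "flower" = iris),
  "2000" = list("tooth" = ToothGrowth, "air" = airquality,
                "cars" = list("cruiser" = mtcars[5:12, ], "fast" = mtcars[1:3, ],
                              "mild" = mtcars[9:18, ]))
)

# Hypothetical helper: recurse through the list, creating one folder per
# nested list and one <name>.csv per data frame.
save_nested <- function(x, dir) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  iwalk(x, function(item, name) {
    if (is.data.frame(item)) {
      # Data frames are also lists, so this check must come first.
      write.csv(item, file.path(dir, paste0(name, ".csv")))
    } else {
      save_nested(item, file.path(dir, name))
    }
  })
}

out_dir <- file.path(tempdir(), "nested_csv")  # assumed output location
save_nested(mylist, out_dir)
```

`iwalk()` passes each element together with its name, which is what lets the name double as the file or folder name.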

Web scraping the data behind every url from a list of urls

房东的猫 submitted on 2019-12-23 02:25:20
Question: I am trying to gather a dataset from a site called ICObench. I've managed to extract the names of each ICO across the 91 pages using rvest and purrr, but I'm confused about how to extract the data behind each name in the list. All the names are clickable links. This is the code so far:

    url_base <- "https://icobench.com/icos?page=%d&filterBonus=&filterBounty=&filterTeam=&filterExpert=&filterSort=&filterCategory=all&filterRating=any&filterStatus=ended&filterCountry=any&filterRegistration=0

How to name a dataframe so that I can look for it within a list

偶尔善良 submitted on 2019-12-23 01:08:09
Question: I have a function that returns a data frame. I use this function with furrr::future_map2, so I get a list of several data frames. I want to use the name input of the function to name each data frame, so that I can look it up in the returned list by name. Example:

    test <- function(x, name){
      require(tidyverse)
      z <- data.frame(x+1) %>% stats::setNames(., "a")
      return(z)
    }
    furrr::future_map2(1:3, c("a", "b", "c"), ~test(.x, .y))

The first df within the list would be a , the second b and
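A minimal sketch of one possible approach: name the resulting list itself rather than the data frames inside it. It uses `purrr::map2()` so that it runs without a parallel backend; the assumption is that `furrr::future_map2()` can be swapped in unchanged. The names `add_one`, `nms` and `results` are illustrative only.

```r
library(purrr)

# Simplified stand-in for the question's test(): one column named "a".
add_one <- function(x, name) {
  stats::setNames(data.frame(x + 1), "a")
}

nms <- c("a", "b", "c")

# Name the list elements after the second input so they can be
# retrieved by name afterwards.
results <- set_names(map2(1:3, nms, add_one), nms)

results[["b"]]
```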

use pmap() to calculate row means of several columns

旧巷老猫 submitted on 2019-12-22 13:13:20
Question: I'm trying to better understand how pmap() works within data frames, and I get a surprising result when applying pmap() to compute means from several columns.

    mtcars %>%
      mutate(comp_var = pmap_dbl(list(vs, am, cyl), mean)) %>%
      select(comp_var, vs, am, cyl)

In the above example, comp_var is equal to the value of vs in its row, rather than the mean of the three variables in that row. I know that I could get accurate results for comp_var using ...

    mtcars %>% rowwise() %>% mutate(comp_var =
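A likely explanation is that `pmap_dbl()` passes each row's values as separate positional arguments, so the call becomes `mean(vs, am, cyl)`. Since `mean.default()` has the signature `(x, trim = 0, na.rm = FALSE, ...)`, only `vs` is averaged. A minimal sketch of one fix is to combine the arguments into a single vector first (the name `res` is illustrative):

```r
library(dplyr)
library(purrr)

res <- mtcars %>%
  # c(...) gathers the three per-row values into one vector for mean().
  mutate(comp_var = pmap_dbl(list(vs, am, cyl), function(...) mean(c(...)))) %>%
  select(comp_var, vs, am, cyl)

head(res)
```

For plain row means of numeric columns, base R's `rowMeans()` should give the same values without iterating row by row.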

R function using . and ~

淺唱寂寞╮ submitted on 2019-12-22 13:04:56
Question: I'm trying to learn to use ~ and . in R. The code below shows the same function written with and without ~ and . . I don't understand what causes the error in the first function.

    #FIRST FUNCTION
    col_summary2 <- function(.x, .f, ...){
      .x <- purrr::keep(.x, is.numeric)
      purrr::map_dbl(.x, ~.f(., ...))
    }
    col_summary2(mtcars,mean)
    #Error in mean.default(., ...) : 'trim' must be numeric of length one

    #SECOND FUNCTION
    col_summary2 <- function(.x, .f, ...){
      .x <- purrr::keep(.x
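A plausible reading of the error: inside a `~` lambda, `...` refers to the lambda's own arguments (which already include the current column), not to the `...` of the enclosing function, so the column reaches `mean()` a second time as `trim`. A minimal sketch that avoids this uses an ordinary anonymous function, which sees the outer `...` through lexical scoping. The name `col_summary` is illustrative.

```r
library(purrr)

col_summary <- function(.x, .f, ...) {
  .x <- keep(.x, is.numeric)
  # A regular anonymous function has no dots of its own, so `...` here
  # is the `...` passed to col_summary().
  map_dbl(.x, function(col) .f(col, ...))
}

col_summary(mtcars, mean)
col_summary(airquality, mean, na.rm = TRUE)
```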

How do pipes work with purrr map() function and the “.” (dot) symbol

萝らか妹 submitted on 2019-12-22 06:29:37
Question: When using both pipes and the map() function from purrr, I am confused about how data and variables are passed along. For instance, this code works as I expect:

    library(tidyverse)
    cars %>% select_if(is.numeric) %>% map(~hist(.))

Yet, when I try something similar using ggplot, it behaves in a strange way.

    cars %>% select_if(is.numeric) %>% map(~ggplot(cars, aes(.)) + geom_histogram())

I'm guessing this is because the "." in this case is passing a vector to aes(), which is expecting a column
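One way to sidestep the issue, sketched here under the assumption that a reasonably recent ggplot2 is installed (one that supports the `.data` pronoun inside `aes()`), is to iterate over column names instead of column vectors. The name `plots` and the `bins = 10` setting are illustrative choices.

```r
library(ggplot2)
library(purrr)

# Iterate over column names; .data[[col]] looks the column up in the
# data frame handed to ggplot().
plots <- names(cars) %>%
  set_names() %>%
  map(function(col) {
    ggplot(cars, aes(x = .data[[col]])) +
      geom_histogram(bins = 10) +
      labs(x = col)
  })

plots$speed
```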

Double nesting in the tidyverse

China☆狼群 submitted on 2019-12-21 05:17:13
Question: Using the examples from Wickham's introduction to purrr in R for Data Science, I am trying to create a double nested list.

    library(gapminder)
    library(purrr)
    library(tidyr)
    gapminder
    nest_data <- gapminder %>% group_by(continent) %>% nest(.key = by_continent)

How can I further nest the countries so that nest_data contains by_continent and a new level of nesting, by_country, that ultimately includes the tibble by_year? Furthermore, after creating this data structure for the gapminder data - how
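A minimal sketch of one way to build two nesting levels, assuming tidyr 1.0 or later (where `nest()` takes `new_column = columns` and `.key` is deprecated). The column names `by_year` and `by_country` follow the question.

```r
library(gapminder)
library(tidyr)

nest_data <- gapminder %>%
  # First level: one row per continent/country, yearly data in by_year.
  nest(by_year = -c(continent, country)) %>%
  # Second level: one row per continent, countries nested in by_country.
  nest(by_country = -continent)

nest_data
nest_data$by_country[[1]]
```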

Use filter() (and other dplyr functions) inside nested data frames with map()

走远了吗. submitted on 2019-12-21 03:55:26
Question: I'm trying to use map() from the purrr package to apply filter() to the data stored in a nested data frame. "Why wouldn't you filter first, and then nest?" you might ask. That will work (and I'll show my desired outcome using that process), but I'm looking for ways to do it with purrr. I want to have just one data frame with two list-columns, both being nested data frames: one full and one filtered. I can achieve it now by performing nest() twice: once on all data, and second on
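The excerpt ends before showing the asker's data, so the sketch below substitutes `mtcars` with a made-up condition (`mpg > 20`). It illustrates the general pattern only: nest once, then derive the filtered list-column with `map()` inside `mutate()`. The column names `full` and `filtered` are illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)

nested <- mtcars %>%
  nest(full = -cyl) %>%
  # Apply filter() to each nested data frame to build a second list-column.
  mutate(filtered = map(full, ~ filter(.x, mpg > 20)))

nested
```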

How to fork/parallelize process in purrr::pmap

若如初见. submitted on 2019-12-20 14:40:08
Question: I have the following code that does serial processing with purrr::pmap:

    library(tidyverse)
    set.seed(1)
    params <- tribble(
      ~mean, ~sd, ~n,
      5, 1, 1,
      10, 5, 3,
      -3, 10, 5
    )
    params %>% pmap(rnorm)
    #> [[1]]
    #> [1] 4.373546
    #>
    #> [[2]]
    #> [1] 10.918217 5.821857 17.976404
    #>
    #> [[3]]
    #> [1] 0.2950777 -11.2046838 1.8742905 4.3832471 2.7578135

How can I parallelize (fork) the process above so that it runs faster and produces an identical result? Here I use rnorm for illustration purposes; in reality I have
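A minimal sketch using the furrr package, assuming furrr and future are installed. `future_pmap()` mirrors `pmap()`, and a fixed `seed` in `furrr_options()` makes the parallel run reproducible from one run to the next. The values will generally not match the serial `set.seed(1)` output above, because furrr uses its own parallel-safe random number streams. The worker count and seed value are arbitrary.

```r
library(tibble)
library(future)
library(furrr)

params <- tribble(
  ~mean, ~sd, ~n,
  5, 1, 1,
  10, 5, 3,
  -3, 10, 5
)

# multisession starts background R sessions; on Unix-alikes,
# plan(multicore) would fork instead.
plan(multisession, workers = 2)

res <- future_pmap(params, rnorm, .options = furrr_options(seed = 123L))

plan(sequential)  # release the workers
res
```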