tidyverse

Applying yearwise segmented regression in R

随声附和 posted on 2020-07-10 10:26:33
Question: I have daily rainfall data, which I have converted to year-wise cumulative values using the following code:

    library(seas)
    library(data.table)
    library(ggplot2)

    # Loading data
    data(mscdata)
    dat <- mksub(mscdata, id = 1108447)
    dat$julian.date <- as.numeric(format(dat$date, "%j"))
    DT <- data.table(dat)
    DT[, Cum.Sum := cumsum(rain), by = list(year)]
    df <- cbind.data.frame(day = dat$julian.date, cumulative = DT$Cum.Sum)

Then I want to apply segmented regression year-wise to get year-wise breakpoints. I could able

Why is `str_extract` only catching some of these values?

血红的双手。 posted on 2020-07-07 09:56:29
Question: I have a table that has a "membership type" column that includes a zillion different membership levels that we've used over the years.

    example <- data.frame(membership = c(
      "Legacy Payment ID #3564, Payment Record #0, Period Paid: 1 Flag: N",
      "Legacy Payment ID #3611, Payment Record #0, Period Paid: 2 Flag: N",
      "Legacy Payment ID #4105, Payment Record #0, Period Paid: 1 Flag: G",
      "Legacy Payment ID #4136, Payment Record #0, Period Paid: 1 Flag: N",
      "Legacy Payment ID #5191, Payment Record #0,

How to avoid excessive lambda functions in pandas DataFrame assign and apply method chains

蓝咒 posted on 2020-07-06 12:11:14
Question: I am trying to translate a pipeline of manipulations on a dataframe in R over to its Python equivalent. A basic example of the pipeline is as follows, incorporating a few mutate and filter calls:

    library(tidyverse)

    calc_circle_area <- function(diam) pi / 4 * diam^2
    calc_cylinder_vol <- function(area, length) area * length

    raw_data <- tibble(cylinder_name = c('a', 'b', 'c'), length = c(3, 5, 9), diam = c(1, 2, 4))

    new_table <- raw_data %>%
      mutate(area = calc_circle_area(diam)) %>%
      mutate(vol = calc

Unnesting tibble columns: “Wide” data summaries with dplyr v1.0.0

别说谁变了你拦得住时间么 posted on 2020-07-06 01:56:40
Question: I'd like to produce "wide" summary tables of data in this sort of format:

                                ---- Centiles ----
    Param   Group  Mean  SD     25%   50%   75%
    Height  1      x.xx  x.xxx  x.xx  x.xx  x.xx
            2      x.xx  x.xxx  x.xx  x.xx  x.xx
            3      x.xx  x.xxx  x.xx  x.xx  x.xx
    Weight  1      x.xx  x.xxx  x.xx  x.xx  x.xx
            2      x.xx  x.xxx  x.xx  x.xx  x.xx
            3      x.xx  x.xxx  x.xx  x.xx  x.xx

I can do that in dplyr 0.8.x. I can do it generically, with a function that can handle arbitrary grouping variables with arbitrary numbers of levels and arbitrary statistics summarising

Creating a dynamic Group By

吃可爱长大的小学妹 posted on 2020-07-05 04:39:05
Question:

    df <- data.frame(
      A = c(1, 4, 5, 13, 2),
      B = c("Group 1", "Group 3", "Group 2", "Group 1", "Group 2"),
      C = c("Group 3", "Group 2", "Group 1", "Group 2", "Group 3")
    )

    df %>% group_by(B) %>% summarise(val = mean(A))
    df %>% group_by(C) %>% summarise(val = mean(A))

Instead of writing a new chunk of code for each unique set of group_by, I would like to create a loop that iterates through the df data frame and saves the results into a list or a data frame. I would like to see how the average
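One possible way to loop over the grouping columns, as a minimal sketch rather than a definitive answer. It assumes dplyr 1.0.0 or later (for `across()` and `all_of()`); the names `group_cols` and `results` are hypothetical and not part of the question.

```r
library(dplyr)

df <- data.frame(
  A = c(1, 4, 5, 13, 2),
  B = c("Group 1", "Group 3", "Group 2", "Group 1", "Group 2"),
  C = c("Group 3", "Group 2", "Group 1", "Group 2", "Group 3")
)

# Hypothetical vector naming the columns to group by, one at a time
group_cols <- c("B", "C")

# For each column name, group by that column and take the mean of A
results <- lapply(group_cols, function(col) {
  df %>%
    group_by(across(all_of(col))) %>%
    summarise(val = mean(A))
})

# Name each list element after its grouping column
names(results) <- group_cols
```

Each element of `results` is the summary for one grouping column, so `results$B` corresponds to the first hand-written pipeline and `results$C` to the second. Keeping the results in a named list avoids forcing differently named grouping columns into a single data frame.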

Combine two data frames with all possible combinations

♀尐吖头ヾ posted on 2020-07-01 07:35:18
Question: I have two data frames. How can I do something like tidyr::complete with them using the tidyverse? My data:

    df <- data.frame(a = letters[1:2])
    df1 <- data.frame(one = 1:2)

Expected result:

    a 1
    b 1
    a 2
    b 2

Thanks!

Answer 1: For this particular example, I think you can just use the merge function. Its all.x and all.y arguments default to FALSE, but because the two data frames have no column names in common, there is nothing to join by, and merge returns every combination of rows (the Cartesian product).

    df <- data.frame(a = letters[1:10])
    df1 <- data
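The merge-based approach from the answer can be sketched on the question's own two data frames. This is an illustrative sketch only: the name `combos` is hypothetical, and `by = NULL` is passed explicitly here to make the Cartesian product intentional rather than relying on the absence of shared column names.

```r
df  <- data.frame(a = letters[1:2])
df1 <- data.frame(one = 1:2)

# With an empty `by`, merge() returns the Cartesian product of the rows:
# every row of df paired with every row of df1
combos <- merge(df, df1, by = NULL)

print(combos)
```

For a tidyverse-only version, `tidyr::expand_grid(df, df1)` or `tidyr::crossing(df, df1)` should also give all combinations, although the rows come out in a different order than with `merge()`.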