dplyr

Many regressions using tidyverse and broom: Same dependent variable, different independent variables

ぐ巨炮叔叔 submitted on 2021-02-08 04:49:33
Question: This link shows how to answer my question in the case where we have the same independent variables but potentially many different dependent variables: Use broom and tidyverse to run regressions on different dependent variables. My question is how to apply the same approach (e.g., tidyverse and broom) to run many regressions in the reverse situation: the same dependent variable but different independent variables. In line with the code in the previous link, something like: mod
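
One possible approach, sketched here under assumptions rather than taken from the linked thread: build one formula per predictor, fit each model, and stack the tidied coefficients. The built-in mtcars data, the outcome mpg, and the predictor list are all stand-ins for the asker's data.

```r
library(dplyr)
library(purrr)
library(broom)

# Assumption: mtcars replaces the asker's data; mpg is the shared
# dependent variable and each name below is one independent variable.
predictors <- c("wt", "hp", "disp")

models <- predictors %>%
  set_names() %>%                                   # name each element after itself
  map(~ lm(reformulate(.x, response = "mpg"), data = mtcars)) %>%
  map(tidy) %>%                                     # one coefficient table per model
  bind_rows(.id = "predictor")                      # stack, keeping the predictor name

print(models)
```

Each model contributes two rows (intercept and slope), and the predictor column records which regression a row came from.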

compare communities from graphs with different number of vertices

限于喜欢 submitted on 2021-02-08 04:45:58
Question: I am calculating Louvain communities on graphs of communications data, where vertices represent performers on a big project. The graphs represent different communication methods (e.g., email, phone). We want to identify teams of performers from their communication data. Since performers prefer different communication methods, the graphs are of different sizes, and each may contain vertices that are not present in the other. When I try to compare the community objects
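
A minimal sketch of one way to make such a comparison possible, assuming the vertices carry names: restrict both membership vectors to the vertices the two graphs share, then compare those. The two toy graphs and the choice of normalized mutual information are assumptions made for illustration.

```r
library(igraph)

# Assumption: two small named graphs stand in for the email and phone networks.
g_email <- graph_from_literal(A-B, B-C, C-A, D-E, E-F, F-D, C-D)
g_phone <- graph_from_literal(A-B, B-C, C-A, D-E, E-G, G-D, C-D)

cl_email <- cluster_louvain(g_email)
cl_phone <- cluster_louvain(g_phone)

# Membership vectors keyed by vertex name
m_email <- setNames(cl_email$membership, V(g_email)$name)
m_phone <- setNames(cl_phone$membership, V(g_phone)$name)

# Keep only vertices present in both graphs, in the same order
common <- intersect(names(m_email), names(m_phone))

nmi <- compare(as.numeric(m_email[common]),
               as.numeric(m_phone[common]),
               method = "nmi")
print(nmi)
```

Vertices unique to one graph are simply ignored here, so the score only describes agreement on the shared performers.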

dplyr left_join with timeline and dates

不羁岁月 submitted on 2021-02-08 04:41:21
Question: I want to merge data from a filtered set into a timeline I created with the help of the timeline package. df1 looks like

Date        Label  Freq
2011-03-12  1      18
2011-03-14  1      16
2011-03-18  1      5

timeline produces a vector with dates from a specific starting date until a specified end date. What I want to achieve is a timeline with all days in a certain period. Then I want to merge df1 into timeline. Using left_join from dplyr I first get Error in UseMethod("left_join") : not applicable for 'left_join' for
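
left_join works on data frames, so a bare vector of dates has to be wrapped in one first. The sketch below assumes that this is the situation described; the date range and the use of seq() in place of the timeline package are assumptions.

```r
library(dplyr)

df1 <- data.frame(
  Date  = as.Date(c("2011-03-12", "2011-03-14", "2011-03-18")),
  Label = c(1, 1, 1),
  Freq  = c(18, 16, 5)
)

# Assumption: the period runs from 2011-03-10 to 2011-03-20.
# The date vector is wrapped in a data frame so left_join has a method for it.
timeline <- data.frame(
  Date = seq(as.Date("2011-03-10"), as.Date("2011-03-20"), by = "day")
)

# Every day of the period is kept; days missing from df1 get NA.
merged <- left_join(timeline, df1, by = "Date")
print(merged)
```

Both Date columns must have the same type (here Date) for the join keys to match.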

left_join R dataframes, merging two columns with NAs

安稳与你 submitted on 2021-02-08 03:32:33
Question: My problem is the following: Let's say I have an existing dataframe with the following columns: UID, foo, result. The result column is already partially filled. A second model now predicts additional rows, generating a second dataframe containing a UID and a result column: (Code to reproduce at bottom)

## df_main
##     UID foo   result
##   <dbl> <chr> <chr>
## 1     1 moo   Cow
## 2     2 rum   <NA>
## 3     3 oink  <NA>
## 4     4 woof  Dog
## 5     5 hiss  <NA>

## new_prediction
##     UID result
##   <dbl> <chr>
## 1     3 Pig
## 2     5 Snake

I
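
The question is cut off, so the goal is assumed here to be filling the missing result values in df_main from new_prediction. Under that assumption, one common pattern is a left_join followed by coalesce; the suffix "_new" is a name chosen for this sketch.

```r
library(dplyr)

df_main <- tibble(
  UID    = c(1, 2, 3, 4, 5),
  foo    = c("moo", "rum", "oink", "woof", "hiss"),
  result = c("Cow", NA, NA, "Dog", NA)
)
new_prediction <- tibble(UID = c(3, 5), result = c("Pig", "Snake"))

merged <- df_main %>%
  # Keep the original column name on the left, tag the incoming one
  left_join(new_prediction, by = "UID", suffix = c("", "_new")) %>%
  # Prefer the existing value, fall back to the new prediction
  mutate(result = coalesce(result, result_new)) %>%
  select(-result_new)

print(merged)
```

Existing values win over new predictions in this version; swapping the arguments to coalesce would reverse that priority.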

left_join R dataframes, merging two columns with NAs

人走茶凉 submitted on 2021-02-08 03:31:31
Question: My problem is the following: Let's say I have an existing dataframe with the following columns: UID, foo, result. The result column is already partially filled. A second model now predicts additional rows, generating a second dataframe containing a UID and a result column: (Code to reproduce at bottom)

## df_main
##     UID foo   result
##   <dbl> <chr> <chr>
## 1     1 moo   Cow
## 2     2 rum   <NA>
## 3     3 oink  <NA>
## 4     4 woof  Dog
## 5     5 hiss  <NA>

## new_prediction
##     UID result
##   <dbl> <chr>
## 1     3 Pig
## 2     5 Snake

I

Passing column name as parameter to a function using dplyr

天大地大妈咪最大 submitted on 2021-02-08 02:12:06
Question: I have a dataframe like below:

transid <- c(1,2,3,4,5,6,7,8)
accountid <- c("a","a","b","a","b","b","a","b")
month <- c(1,1,1,2,2,3,3,3)
amount <- c(10,20,30,40,50,60,70,80)
transactions <- data.frame(transid,accountid,month,amount)

I am trying to write a function that computes the total monthly amount for each accountid using dplyr verbs.

my_sum <- function(df,col1,col2,col3){
  df %>% group_by_(col1,col2) %>% summarise_(total_sum = sum(col3))
}
my_sum(transactions, "accountid","month","amount")

To get the result like below:
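
The underscore verbs group_by_ and summarise_ are deprecated in current dplyr. A minimal sketch of the same function with column names passed as strings, using across(all_of()) for the grouping columns and the .data pronoun for the summed column, might look like this (it assumes dplyr 1.0 or later):

```r
library(dplyr)

transactions <- data.frame(
  transid   = c(1, 2, 3, 4, 5, 6, 7, 8),
  accountid = c("a", "a", "b", "a", "b", "b", "a", "b"),
  month     = c(1, 1, 1, 2, 2, 3, 3, 3),
  amount    = c(10, 20, 30, 40, 50, 60, 70, 80)
)

# col1 and col2 name the grouping columns, col3 the column to sum;
# all three are character strings.
my_sum <- function(df, col1, col2, col3) {
  df %>%
    group_by(across(all_of(c(col1, col2)))) %>%
    summarise(total_sum = sum(.data[[col3]]), .groups = "drop")
}

res <- my_sum(transactions, "accountid", "month", "amount")
print(res)
```

If the caller should pass bare column names instead of strings, the embrace operator `{{ }}` is the usual alternative.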

Passing column name as parameter to a function using dplyr

吃可爱长大的小学妹 submitted on 2021-02-08 02:04:49
Question: I have a dataframe like below:

transid <- c(1,2,3,4,5,6,7,8)
accountid <- c("a","a","b","a","b","b","a","b")
month <- c(1,1,1,2,2,3,3,3)
amount <- c(10,20,30,40,50,60,70,80)
transactions <- data.frame(transid,accountid,month,amount)

I am trying to write a function that computes the total monthly amount for each accountid using dplyr verbs.

my_sum <- function(df,col1,col2,col3){
  df %>% group_by_(col1,col2) %>% summarise_(total_sum = sum(col3))
}
my_sum(transactions, "accountid","month","amount")

To get the result like below:

compute pointwise distance by group in R with sf dplyr

怎甘沉沦 submitted on 2021-02-07 23:02:14
Question: I have 2 dataframes. I want to compute the distance between all POINT geometries in the first dataframe and a certain POINT in the second dataframe. The main feature of this problem is that I have a grouping variable in the first dataframe, and I would like to select the corresponding point to measure the distance to (in the second dataframe) according to this grouping indicator. I tried with group_by:

library(sf)
library(dplyr)
d = data.frame(x = 1:10,y = 1:10, g = rep(c("a","b")
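
One way to avoid group_by here, sketched under assumptions: look up each point's reference geometry by its group with match(), then compute element-wise distances. The code in the question is truncated, so the grouping layout (each = 5), the reference points, and the absence of a CRS are all assumptions made for illustration.

```r
library(sf)

# Assumption: ten points, the first five in group "a", the rest in "b".
d <- st_as_sf(
  data.frame(x = 1:10, y = 1:10, g = rep(c("a", "b"), each = 5)),
  coords = c("x", "y")
)

# Assumption: one hypothetical reference point per group.
ref <- st_as_sf(
  data.frame(x = c(0, 10), y = c(0, 10), g = c("a", "b")),
  coords = c("x", "y")
)

# For every row of d, the position of its group's reference point in ref
idx <- match(d$g, ref$g)

# by_element = TRUE pairs the i-th point with the i-th reference geometry
d$dist <- as.numeric(
  st_distance(st_geometry(d), st_geometry(ref)[idx], by_element = TRUE)
)
print(d)
```

Without a CRS the distances are plain Euclidean; with a geographic CRS, st_distance returns values with units instead.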

compute pointwise distance by group in R with sf dplyr

烈酒焚心 submitted on 2021-02-07 23:01:14
Question: I have 2 dataframes. I want to compute the distance between all POINT geometries in the first dataframe and a certain POINT in the second dataframe. The main feature of this problem is that I have a grouping variable in the first dataframe, and I would like to select the corresponding point to measure the distance to (in the second dataframe) according to this grouping indicator. I tried with group_by:

library(sf)
library(dplyr)
d = data.frame(x = 1:10,y = 1:10, g = rep(c("a","b")

Create t.test table with dplyr?

ⅰ亾dé卋堺 submitted on 2021-02-07 20:25:41
Question: Suppose I have data that looks like this:

set.seed(031915)
myDF <- data.frame(
  Name = rep(c("A", "B"), times = c(10,10)),
  Group = rep(c("treatment", "control", "treatment", "control"), times = c(5,5,5,5)),
  X = c(rnorm(n=5,mean = .05, sd = .001), rnorm(n=5,mean = .02, sd = .001), rnorm(n=5,mean = .08, sd = .02), rnorm(n=5,mean = .03, sd = .02))
)

I want to create a t.test table with one row for "A" and one for "B". I can write my own function that does that:

ttestbyName <- function(Name) { b <- t
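
As an alternative to a hand-written function, a table like this can be sketched with group_modify from dplyr and tidy from broom, which turns each t.test result into a one-row data frame. This is one possible approach, not necessarily the one given in the original thread; the name ttest_table is chosen for this sketch.

```r
library(dplyr)
library(broom)

set.seed(031915)
myDF <- data.frame(
  Name  = rep(c("A", "B"), times = c(10, 10)),
  Group = rep(c("treatment", "control", "treatment", "control"),
              times = c(5, 5, 5, 5)),
  X = c(rnorm(n = 5, mean = .05, sd = .001),
        rnorm(n = 5, mean = .02, sd = .001),
        rnorm(n = 5, mean = .08, sd = .02),
        rnorm(n = 5, mean = .03, sd = .02))
)

ttest_table <- myDF %>%
  group_by(Name) %>%
  # Run one treatment-vs-control t-test per Name and tidy it to one row
  group_modify(function(data, key) tidy(t.test(X ~ Group, data = data))) %>%
  ungroup()

print(ttest_table)
```

The result has one row per Name, with columns such as estimate, statistic, p.value, and the confidence bounds.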