dplyr | 易学教程

Many regressions using tidyverse and broom: Same dependent variable, different independent variables

Question: The linked question, "Use broom and tidyverse to run regressions on different dependent variables", shows how to handle the case where the independent variables are the same but there are potentially many different dependent variables. My question is how to apply the same approach (tidyverse and broom) to the reverse situation: the same dependent variable but different independent variables. In line with the code in the linked question, something like: mod
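One possible way to approach the reverse situation is sketched below. It is not the code from the linked question: the built-in mtcars data, the choice of mpg as the dependent variable, and the predictor names are all assumptions made for illustration. Each predictor gets its own one-variable model, and broom::tidy() stacks the coefficient tables.

```r
library(dplyr)
library(purrr)
library(broom)

# Hypothetical setup: mpg is the fixed dependent variable,
# and each of these columns is tried as the single predictor.
predictors <- c("wt", "hp", "disp")

results <- predictors %>%
  set_names() %>%                                   # name each element by itself
  map(~ lm(reformulate(.x, response = "mpg"),       # builds mpg ~ wt, mpg ~ hp, ...
           data = mtcars)) %>%
  map_dfr(tidy, .id = "predictor")                  # one row per model term

print(results)
```

The predictor column records which model each row came from, so the slopes can be filtered or compared afterwards.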

compare communities from graphs with different number of vertices

Question: I am calculating Louvain communities on graphs of communications data, where vertices represent performers on a big project. The graphs represent different communication methods (e.g., email, phone). We want to identify teams of performers from their communication data. Since performers have preferences for different communication methods, the graphs are of different sizes, and each may contain vertices that are absent from the others. When I try to compare the community objects
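A minimal sketch of one workaround, assuming the comparison fails because the two membership vectors have different lengths: restrict both partitions to the vertices the graphs share before comparing. The two toy graphs and the choice of normalized mutual information are assumptions, not details from the question.

```r
library(igraph)

# Hypothetical graphs with overlapping but not identical vertex sets.
g_email <- graph_from_literal(a-b, b-c, a-c, c-d, d-e, e-f, d-f)
g_phone <- graph_from_literal(a-b, b-c, a-c, c-d, d-e, e-g, d-g)

c_email <- cluster_louvain(g_email)
c_phone <- cluster_louvain(g_phone)

# Vertices present in both graphs.
common <- intersect(V(g_email)$name, V(g_phone)$name)

# Membership restricted to the common vertices, with community ids
# renumbered to 1..k so that no ids are skipped.
restrict <- function(comm, graph, keep) {
  m <- setNames(as.integer(membership(comm)), V(graph)$name)[keep]
  as.integer(factor(m))
}

m_email <- restrict(c_email, g_email, common)
m_phone <- restrict(c_phone, g_phone, common)

# Normalized mutual information between the two restricted partitions.
nmi <- igraph::compare(m_email, m_phone, method = "nmi")
print(nmi)
```

Dropping the non-shared vertices discards information, so this measures agreement only among performers who appear in both graphs.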

dplyr left_join with timeline and dates

Question: I want to merge data from a filtered set into a timeline I created with the help of the timeline package. df1 looks like:

    Date        Label  Freq
    2011-03-12  1      18
    2011-03-14  1      16
    2011-03-18  1      5

timeline produces a vector of dates from a specified start date to a specified end date. What I want to achieve is a timeline with all days in a certain period, and then to merge df1 into that timeline. Using left_join from dplyr I first get Error in UseMethod("left_join") : not applicable for 'left_join' for
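The error message above is cut off, so the cause is a guess: left_join() has no method for a bare vector, and wrapping the dates in a data frame is one way around that. The sketch below builds the date sequence with base seq() rather than the timeline package, and the start and end dates are hypothetical.

```r
library(dplyr)

# Data from the question.
df1 <- data.frame(
  Date  = as.Date(c("2011-03-12", "2011-03-14", "2011-03-18")),
  Label = 1,
  Freq  = c(18, 16, 5)
)

# Hypothetical period; every day becomes one row of a data frame,
# which gives left_join() something it can dispatch on.
timeline <- data.frame(
  Date = seq(as.Date("2011-03-10"), as.Date("2011-03-20"), by = "day")
)

# Days without a match in df1 get NA in Label and Freq.
merged <- left_join(timeline, df1, by = "Date")
print(merged)
```

Both Date columns must have the same class (here Date) for the join keys to match.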

left_join R dataframes, merging two columns with NAs

Question: My problem is the following. Let's say I have an existing dataframe with the columns UID, foo, and result, where result is already partially filled. A second model now predicts additional rows, generating a second dataframe containing a UID and a result column (code to reproduce at bottom):

    ## df_main
    ##     UID foo   result
    ##   <dbl> <chr> <chr>
    ## 1     1 moo   Cow
    ## 2     2 rum   <NA>
    ## 3     3 oink  <NA>
    ## 4     4 woof  Dog
    ## 5     5 hiss  <NA>

    ## new_prediction
    ##     UID result
    ##   <dbl> <chr>
    ## 1     3 Pig
    ## 2     5 Snake

I
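The question is cut off before it states the desired output, so the following is a sketch under the assumption that the goal is to fill the missing result values in df_main from new_prediction while keeping the existing ones. It joins on UID and uses coalesce() to take the first non-missing value; the result_new column name is introduced here for illustration.

```r
library(dplyr)

# Data rebuilt from the printed tables in the question.
df_main <- tibble(
  UID    = c(1, 2, 3, 4, 5),
  foo    = c("moo", "rum", "oink", "woof", "hiss"),
  result = c("Cow", NA, NA, "Dog", NA)
)
new_prediction <- tibble(UID = c(3, 5), result = c("Pig", "Snake"))

updated <- df_main %>%
  # Keep the original column name; the incoming one becomes result_new.
  left_join(new_prediction, by = "UID", suffix = c("", "_new")) %>%
  # Prefer the existing value, fall back to the new prediction.
  mutate(result = coalesce(result, result_new)) %>%
  select(-result_new)

print(updated)
```

Swapping the arguments to coalesce() would instead let new predictions overwrite existing values.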

Passing column name as parameter to a function using dplyr

Question: I have a dataframe like the one below:

    transid <- c(1, 2, 3, 4, 5, 6, 7, 8)
    accountid <- c("a", "a", "b", "a", "b", "b", "a", "b")
    month <- c(1, 1, 1, 2, 2, 3, 3, 3)
    amount <- c(10, 20, 30, 40, 50, 60, 70, 80)
    transactions <- data.frame(transid, accountid, month, amount)

I am trying to write a function that computes the total monthly amount for each accountid using dplyr verbs:

    my_sum <- function(df, col1, col2, col3) {
      df %>% group_by_(col1, col2) %>% summarise_(total_sum = sum(col3))
    }
    my_sum(transactions, "accountid", "month", "amount")

To get the result like below:
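The underscore verbs used in the attempt above (group_by_, summarise_) are deprecated in current dplyr. One way to pass column names as strings today, offered as a sketch rather than as the accepted answer, is to select grouping columns with across(all_of()) and to look up the summed column through the .data pronoun.

```r
library(dplyr)

transactions <- data.frame(
  transid   = 1:8,
  accountid = c("a", "a", "b", "a", "b", "b", "a", "b"),
  month     = c(1, 1, 1, 2, 2, 3, 3, 3),
  amount    = c(10, 20, 30, 40, 50, 60, 70, 80)
)

# col1 and col2 name the grouping columns, col3 the column to sum;
# all three are passed as character strings.
my_sum <- function(df, col1, col2, col3) {
  df %>%
    group_by(across(all_of(c(col1, col2)))) %>%
    summarise(total_sum = sum(.data[[col3]]), .groups = "drop")
}

res <- my_sum(transactions, "accountid", "month", "amount")
print(res)
```

This requires dplyr 1.0.0 or later, where across() was introduced.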

compute pointwise distance by group in R with sf dplyr

Question: I have two dataframes. I want to compute the distance between every POINT geometry in the first dataframe and a certain POINT in the second dataframe. The main feature of this problem is that the first dataframe has a grouping variable, and I would like to select the corresponding point to measure the distance to (in the second dataframe) according to this grouping indicator. I tried with group_by:

    library(sf)
    library(dplyr)
    d = data.frame(x = 1:10,y = 1:10, g = rep(c("a","b")
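The code in the question is truncated, so the sketch below uses its own hypothetical data (pts and ref, with made-up coordinates and no CRS) rather than completing the original. Instead of group_by, it uses match() to line up one reference point per row and then calls st_distance() with by_element = TRUE to get row-wise distances.

```r
library(sf)

# Hypothetical points with a grouping column g.
pts <- st_as_sf(
  data.frame(x = 1:10, y = 1:10, g = rep(c("a", "b"), each = 5)),
  coords = c("x", "y")
)

# Hypothetical reference points, one per group.
ref <- st_as_sf(
  data.frame(x = c(0, 10), y = c(0, 10), g = c("a", "b")),
  coords = c("x", "y")
)

# For every row of pts, pick the reference geometry of its group.
ref_geom <- st_geometry(ref)[match(pts$g, ref$g)]

# Element-wise distance: row i of pts against element i of ref_geom.
pts$dist <- as.numeric(
  st_distance(st_geometry(pts), ref_geom, by_element = TRUE)
)

print(pts)
```

Without a CRS the distances are plain Euclidean; with a geographic CRS set on both objects, st_distance() would return values with units.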

Create t.test table with dplyr?

Question: Suppose I have data that looks like this:

    set.seed(031915)
    myDF <- data.frame(
      Name = rep(c("A", "B"), times = c(10, 10)),
      Group = rep(c("treatment", "control", "treatment", "control"), times = c(5, 5, 5, 5)),
      X = c(rnorm(n = 5, mean = .05, sd = .001), rnorm(n = 5, mean = .02, sd = .001),
            rnorm(n = 5, mean = .08, sd = .02), rnorm(n = 5, mean = .03, sd = .02))
    )

I want to create a t.test table with one row for "A" and one for "B". I can write my own function that does that:

    ttestbyName <- function(Name) { b <- t
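As an alternative to a hand-written function, one possible sketch combines dplyr::group_modify() with broom::tidy(), which turns each t.test result into a one-row data frame. The broom dependency and the ttest_table name are assumptions; the data is the myDF from the question.

```r
library(dplyr)
library(broom)

set.seed(031915)
myDF <- data.frame(
  Name  = rep(c("A", "B"), times = c(10, 10)),
  Group = rep(c("treatment", "control", "treatment", "control"),
              times = c(5, 5, 5, 5)),
  X     = c(rnorm(n = 5, mean = .05, sd = .001),
            rnorm(n = 5, mean = .02, sd = .001),
            rnorm(n = 5, mean = .08, sd = .02),
            rnorm(n = 5, mean = .03, sd = .02))
)

ttest_table <- myDF %>%
  group_by(Name) %>%
  # Run a two-sample t-test within each Name and tidy the result.
  group_modify(~ tidy(t.test(X ~ Group, data = .x))) %>%
  ungroup()

print(ttest_table)
```

The resulting table has one row per Name, with columns such as estimate, statistic, p.value, conf.low, and conf.high.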