matching

Algorithm for minimum vertex cover in Bipartite graph

天涯浪子 submitted on 2020-08-08 09:20:41
Question: I am trying to figure out an algorithm for finding a minimum vertex cover of a bipartite graph. I was thinking about a solution that reduces the problem to maximum matching in a bipartite graph. It is known that a maximum matching can be found using max flow in a network created from the bipartite graph. The max matching M should determine the min vertex cover C, but I can't work out how to choose the vertices for the set C. Let's say the bipartite graph has parts X, Y, and the vertices that are endpoints of max matching edges are in set A, those
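One standard way to pick the cover from a maximum matching is the construction behind König's theorem: start from the unmatched vertices of X, follow alternating paths (non-matching edges from X to Y, matching edges from Y back to X), and take the unvisited X vertices plus the visited Y vertices. The Python sketch below illustrates that idea under some assumptions: the graph is given as a dictionary from X vertices to their Y neighbours, X and Y use distinct labels, and the matching is found with simple augmenting paths rather than max flow. The function name `min_vertex_cover` is made up for this example.

```python
from typing import Dict, Hashable, List, Set


def min_vertex_cover(adj: Dict[Hashable, List[Hashable]]) -> Set[Hashable]:
    """Return a minimum vertex cover of a bipartite graph.

    adj maps every vertex of part X to the list of its neighbours in part Y.
    Assumes X and Y labels are distinct and the graph is small enough for
    recursive augmenting paths.
    """
    match_x = {}  # X vertex -> matched Y vertex
    match_y = {}  # Y vertex -> matched X vertex

    def augment(u, seen):
        # Try to find an augmenting path starting at the X vertex u.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match_y or augment(match_y[v], seen):
                match_x[u] = v
                match_y[v] = u
                return True
        return False

    # Step 1: build a maximum matching M.
    for u in adj:
        augment(u, set())

    # Step 2: alternating search from the unmatched X vertices.
    visited_x = {u for u in adj if u not in match_x}
    visited_y = set()
    stack = list(visited_x)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in visited_y or match_x.get(u) == v:
                continue  # only non-matching edges go from X to Y
            visited_y.add(v)
            w = match_y.get(v)  # follow the matching edge back to X
            if w is not None and w not in visited_x:
                visited_x.add(w)
                stack.append(w)

    # Step 3: C = (X not visited) union (Y visited).
    return (set(adj) - visited_x) | visited_y


example = {"x1": ["y1", "y2"], "x2": ["y1"], "x3": ["y1"]}
print(sorted(min_vertex_cover(example)))  # ['x1', 'y1']
```

The cover has the same size as the matching, which is what König's theorem guarantees for bipartite graphs.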

“Index Match” In R Studio (multiple columns, across rows)

六月ゝ 毕业季﹏ submitted on 2020-06-28 04:04:29
Question: I'm working with a fairly large data set (100k rows) and want to replicate the Excel Index Match function in R Studio. I'm looking for a way to create a new column ("1994_Number") that pulls a value from an existing column ("1995_Number") when three values from three columns for one year match three values from three columns for another year, regardless of which rows they are in. Dataframe as example: dat <- data.frame(`1994_Address` = c("1234 Road", "123 Road", "321
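The lookup described here amounts to a keyed join: build a table from the three 1995 key values to `1995_Number`, then look up each row's three 1994 key values in it. The sketch below shows that logic in plain Python rather than R, purely as an illustration. Only `1994_Address`, `1995_Number`, and `1994_Number` come from the question; the other column names and all data values are hypothetical, as is the helper name `index_match`.

```python
def index_match(rows, source_keys, target_keys, value_col, new_col):
    """Add new_col to every row by matching target_keys against source_keys.

    rows is a list of dicts (one per data-frame row). The match is made
    across all rows, not only within the same row.
    """
    lookup = {}
    for row in rows:
        key = tuple(row[c] for c in source_keys)
        # Keep the first occurrence, similar to how Excel MATCH behaves.
        lookup.setdefault(key, row[value_col])

    result = []
    for row in rows:
        key = tuple(row[c] for c in target_keys)
        # Rows without a match get None (the analogue of NA).
        result.append({**row, new_col: lookup.get(key)})
    return result


# Hypothetical example data: City and Zip columns are invented.
rows = [
    {"1994_Address": "1234 Road", "1994_City": "A", "1994_Zip": "1",
     "1995_Address": "123 Road", "1995_City": "B", "1995_Zip": "2",
     "1995_Number": 10},
    {"1994_Address": "123 Road", "1994_City": "B", "1994_Zip": "2",
     "1995_Address": "1234 Road", "1995_City": "A", "1995_Zip": "1",
     "1995_Number": 20},
]
out = index_match(
    rows,
    source_keys=["1995_Address", "1995_City", "1995_Zip"],
    target_keys=["1994_Address", "1994_City", "1994_Zip"],
    value_col="1995_Number",
    new_col="1994_Number",
)
print([r["1994_Number"] for r in out])  # [20, 10]
```

A dictionary lookup keeps the work linear in the number of rows, which matters at 100k rows.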

stringdist_join results in NAs

烂漫一生 submitted on 2020-06-27 12:22:12
Question: I am experimenting with the stringdist package in order to make fuzzy joins, and I have run into a problem that I do not understand and cannot find an answer for. I want to join these 2 data tables with the "dl" method, and it produces an NA, which I do not understand at all. Maybe one of you has an explanation for this. The code: library(fuzzyjoin) test1<-as.data.frame(test1<-c("techniker")) test2<-as.data.frame(test2<-c("technician")) setnames(test2,1,"label") setnames(test1,1,"label") x <-
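One possible explanation, offered as an assumption because the call itself is cut off: `stringdist_join` only pairs rows whose distance is at most `max_dist` (2 by default, as far as the fuzzyjoin documentation describes it), and in a join mode that keeps unmatched rows, such as a left or full join, those rows come back with NA. The Python sketch below computes the restricted Damerau-Levenshtein (optimal string alignment) distance, a simplified stand-in for the "dl" method, to show how far apart the two strings are.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance.

    Counts insertions, deletions, substitutions, and transpositions of
    adjacent characters. This is the restricted form of Damerau-Levenshtein,
    not necessarily identical to stringdist's "dl" method in every case.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


print(osa_distance("techniker", "technician"))  # 4
```

A distance of 4 is above a threshold of 2, so under that assumption the pair would not be joined unless `max_dist` were raised.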

stringdist_join results in NAs

隐身守侯 submitted on 2020-06-27 12:21:43
Question: I am experimenting with the stringdist package in order to make fuzzy joins, and I have run into a problem that I do not understand and cannot find an answer for. I want to join these 2 data tables with the "dl" method, and it produces an NA, which I do not understand at all. Maybe one of you has an explanation for this. The code: library(fuzzyjoin) test1<-as.data.frame(test1<-c("techniker")) test2<-as.data.frame(test2<-c("technician")) setnames(test2,1,"label") setnames(test1,1,"label") x <-

stringdist_join results in NAs

耗尽温柔 submitted on 2020-06-27 12:21:30
Question: I am experimenting with the stringdist package in order to make fuzzy joins, and I have run into a problem that I do not understand and cannot find an answer for. I want to join these 2 data tables with the "dl" method, and it produces an NA, which I do not understand at all. Maybe one of you has an explanation for this. The code: library(fuzzyjoin) test1<-as.data.frame(test1<-c("techniker")) test2<-as.data.frame(test2<-c("technician")) setnames(test2,1,"label") setnames(test1,1,"label") x <-

Copy approximate string matching from excel to another excel file using python

强颜欢笑 submitted on 2020-05-31 04:01:25
Question: Hi, I would like to ask how to copy some of the rows from one Excel file to another Excel file. Using a Python fuzzy matching method or ANY other feasible way, I hope to match entire rows by name and copy them into a new Excel file. Here is the input data from the first Excel file; there are 13 rows and 6 columns in total, as shown below: -----------------------------------------------------|-----|-----|-----|-----|-----| | name | no1 | no2 | no3 | no4 | no5 | ---------------
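The matching step on its own can be sketched with the standard library's `difflib`, leaving the Excel reading and writing aside. The example below assumes each row is a list whose first element is the name; the function name `copy_matching_rows`, the cutoff of 0.8, and the sample data are all invented for illustration.

```python
import difflib


def copy_matching_rows(source_rows, wanted_names, cutoff=0.8):
    """Return the source rows whose name best matches each wanted name.

    source_rows: list of rows, each row a list with the name in column 0.
    wanted_names: names to look for, possibly spelled slightly differently.
    cutoff: minimum similarity ratio (0 to 1) accepted as a match.
    """
    by_name = {row[0]: row for row in source_rows}
    copied = []
    for name in wanted_names:
        # Ask for the single closest candidate above the cutoff.
        hits = difflib.get_close_matches(name, list(by_name), n=1, cutoff=cutoff)
        if hits:
            copied.append(by_name[hits[0]])
    return copied


# Hypothetical rows in the shape: name, no1, no2, no3, no4, no5.
source = [
    ["John Smith", 1, 2, 3, 4, 5],
    ["Jane Doe", 6, 7, 8, 9, 10],
]
print(copy_matching_rows(source, ["Jon Smith", "Bob"]))
# [['John Smith', 1, 2, 3, 4, 5]]
```

The returned rows could then be written to the second workbook with whichever Excel library is in use.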

“Perfect separation” error when using Matcher from pymatch (Propensity score matching)

六月ゝ 毕业季﹏ submitted on 2020-03-03 07:02:07
Question: I am trying to use the pymatch package, but I keep getting the error Error: Perfect separation detected, results not available. I checked multiple times; my dataset is not equal. It contains 260k rows for control and 50k for treatment, and the groups have different averages. I only have 5 variables, all integers or floats rounded to 2 decimals. My goal is to match some treated customers to non-treated customers for further analysis based on propensity score matching. I already removed outliers as
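Perfect separation generally means that some covariate, or combination of covariates, predicts group membership exactly, so the logistic regression behind the propensity score has no finite solution. A quick first check, sketched below in plain Python, is to look for a single variable whose value ranges do not overlap between the two groups. This only catches the single-variable case, and the function name and sample data are hypothetical.

```python
def separating_columns(treated, control):
    """List columns whose values do not overlap between the two groups.

    treated and control are non-empty lists of dicts with the same keys.
    A column is flagged when every treated value lies strictly above, or
    strictly below, every control value.
    """
    flagged = []
    for col in treated[0]:
        t_vals = [row[col] for row in treated]
        c_vals = [row[col] for row in control]
        if max(t_vals) < min(c_vals) or max(c_vals) < min(t_vals):
            flagged.append(col)
    return flagged


# Hypothetical data: "age" separates the groups, "spend" does not.
treated = [{"age": 50, "spend": 10.0}, {"age": 60, "spend": 3.5}]
control = [{"age": 20, "spend": 5.0}, {"age": 30, "spend": 12.25}]
print(separating_columns(treated, control))  # ['age']
```

If no single column is flagged, separation may still arise from a combination of variables, which this check does not detect.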