Creating an edgelist from Patent data in R

Anonymous (unverified), submitted 2019-12-03 01:45:01

Question:

I am trying to create an edgelist out of patent data of the form:

PatentID    InventorIDs    CoinventorIDs
1           A ; B          C,D,E ; F,G,H,C
2           J ; K ; L      M,O ; N ; P, Q
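For anyone who wants to reproduce this, the table can be typed in as a data frame. This is only a sketch: the name `dat` matches the one the answer's code calls, but the column types (numeric ID, character strings for the two ID lists) are an assumption, since the question does not say how the data is stored. The check at the end just confirms that each patent has one co-inventor group per primary inventor.

```r
# Sample data copied from the table above. stringsAsFactors = FALSE keeps the
# two ID-list columns as plain character vectors (assumed storage format).
dat <- data.frame(
  PatentID      = c(1, 2),
  InventorIDs   = c("A ; B", "J ; K ; L"),
  CoinventorIDs = c("C,D,E ; F,G,H,C", "M,O ; N ; P, Q"),
  stringsAsFactors = FALSE
)

# Count the semicolon-separated entries in each column, per patent.
n_primary <- lengths(strsplit(dat$InventorIDs, ";", fixed = TRUE))
n_groups  <- lengths(strsplit(dat$CoinventorIDs, ";", fixed = TRUE))

# Every primary inventor should have exactly one co-inventor group.
stopifnot(n_primary == n_groups)
```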

What I would like is the edgelist below, showing the connections between inventors and patents (the semicolons separate the coinventors associated with each primary inventor):

1  A  B
1  A  C
1  A  D
1  A  E
1  B  F
1  B  G
1  B  H
1  B  C
2  J  K
2  J  L
2  J  M
2  J  O
2  K  N
2  L  P
2  L  Q

Is there an easy way to do this with igraph in R?

Answer 1:

I'm confused by the edges going between the InventorIDs. Still, here is a brute-force function that you can apply row by row. igraph is a massive library, so there may be a better way built into it, but once you have the data in this form it should be simple to convert it to an igraph data structure.

Note that this leaves out the edges between primary inventors.

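## A function to make the edges for each row
## (dat is assumed to be a data frame holding the table from the question)
rowFunc <- function(row) {
    ## Split both ID columns on ';', ignoring surrounding whitespace
    tmp <- lapply(row[2:3], strsplit, '\\s*;\\s*')
    ## Split each coinventor group on ','; the \\s* avoids entries like " Q"
    tmp2 <- lapply(tmp[[2]], strsplit, '\\s*,\\s*')
    ## SIMPLIFY=FALSE keeps a list even when all groups have the same size
    do.call(rbind, mapply(cbind, row[[1]], unlist(tmp[[1]]),
                          unlist(tmp2, recursive=FALSE), SIMPLIFY=FALSE))
}

## Apply the function by row
do.call(rbind, apply(dat, 1, rowFunc))
#       [,1] [,2] [,3]
#  [1,] "1"  "A"  "C"
#  [2,] "1"  "A"  "D"
#  [3,] "1"  "A"  "E"
#  [4,] "1"  "B"  "F"
#  [5,] "1"  "B"  "G"
#  [6,] "1"  "B"  "H"
#  [7,] "1"  "B"  "C"
#  [8,] "2"  "J"  "M"
#  [9,] "2"  "J"  "O"
# [10,] "2"  "K"  "N"
# [11,] "2"  "L"  "P"
# [12,] "2"  "L"  "Q"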
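If the primary-inventor edges are wanted as well, one possible variation is sketched below. It assumes, going only by the desired output in the question, that the first primary inventor of each patent is linked to every other primary inventor (A to B, J to K, J to L). The helper name `patentEdges` and the column names `from`, `to` and `patent` are made up for this illustration. The endpoints are put in the first two columns because `igraph::graph_from_data_frame()` reads the edge endpoints from there and treats the remaining columns as edge attributes. The igraph step is skipped if the package is not installed.

```r
# Sample data from the question, repeated so the sketch runs on its own.
dat <- data.frame(
  PatentID      = c("1", "2"),
  InventorIDs   = c("A ; B", "J ; K ; L"),
  CoinventorIDs = c("C,D,E ; F,G,H,C", "M,O ; N ; P, Q"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build all edges for a single patent.
patentEdges <- function(id, inventors, coinventors) {
  # Primary inventors, e.g. "A ; B" -> c("A", "B")
  prim <- trimws(strsplit(inventors, ";", fixed = TRUE)[[1]])
  # One co-inventor group per primary inventor, each split on commas
  groups <- strsplit(coinventors, ";", fixed = TRUE)[[1]]
  co <- lapply(strsplit(groups, ",", fixed = TRUE), trimws)

  # Assumed pattern: first primary inventor -> each other primary inventor,
  # followed by each primary inventor -> his or her own co-inventors.
  from <- c(rep(prim[1], length(prim) - 1), rep(prim, lengths(co)))
  to   <- c(prim[-1], unlist(co))
  data.frame(from = from, to = to, patent = id, stringsAsFactors = FALSE)
}

# One data frame per patent, stacked into a single edgelist.
edges <- do.call(rbind, unname(Map(patentEdges, dat$PatentID,
                                   dat$InventorIDs, dat$CoinventorIDs)))
rownames(edges) <- NULL
print(edges)

# Optional: hand the edgelist to igraph; 'patent' becomes an edge attribute.
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges, directed = FALSE)
  print(g)
}
```

Reordering the columns with `edges[, c("patent", "from", "to")]` gives the layout shown in the question.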

