Question:

I am trying to create an edgelist out of patent data of the form:

PatentID    InventorIDs    CoinventorIDs
1           A ; B          C,D,E ; F,G,H,C
2           J ; K ; L      M,O ; N ; P, Q

What I would like is the edgelist below, showing the connections between inventors and patents (the semicolons separate the coinventors associated with each primary inventor):

1  A  B
1  A  C
1  A  D
1  A  E
1  B  F
1  B  G
1  B  H
1  B  C
2  J  K
2  J  L
2  J  M
2  J  O
2  K  N
2  L  P
2  L  Q

Is there an easy way to do this with igraph in R?

Answer 1:

I'm confused by the edges between the InventorIDs, but here is a brute-force function that you can apply to each row. igraph is a massive library, so it may offer a better way, but once the data is in this form it should be simple to convert to an igraph data structure.

Note that this leaves out the edges between primary inventors.
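The code below expects the table in a data frame called dat, which is not shown. A minimal sketch of how it might be built, assuming all three columns are character strings (the column names are taken from the question; the construction itself is an assumption):

```r
# Hypothetical reconstruction of the question's table.
# PatentID is kept as character: apply() coerces the data frame to a
# character matrix, and numeric IDs of different widths (e.g. 1 and 10)
# could otherwise come out space-padded.
dat <- data.frame(
  PatentID      = c("1", "2"),
  InventorIDs   = c("A ; B", "J ; K ; L"),
  CoinventorIDs = c("C,D,E ; F,G,H,C", "M,O ; N ; P, Q"),
  stringsAsFactors = FALSE
)
str(dat)
```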

    ## A function to make the edges for each row
    rowFunc <- function(row) {
        ## Split both ID columns on semicolons, dropping surrounding whitespace
        tmp <- lapply(row[2:3], strsplit, '\\s*;\\s*')
        ## Split each coinventor group on commas, again dropping whitespace
        tmp2 <- lapply(tmp[[2]], strsplit, '\\s*,\\s*')
        ## Bind patent ID, primary inventor, and that inventor's coinventors
        do.call(rbind, mapply(cbind, row[[1]], unlist(tmp[[1]]),
                              unlist(tmp2, recursive=FALSE), SIMPLIFY=FALSE))
    }

    ## Apply the function by row
    do.call(rbind, apply(dat, 1, rowFunc))
    #      [,1] [,2] [,3]
    #  [1,] "1"  "A"  "C"
    #  [2,] "1"  "A"  "D"
    #  [3,] "1"  "A"  "E"
    #  [4,] "1"  "B"  "F"
    #  [5,] "1"  "B"  "G"
    #  [6,] "1"  "B"  "H"
    #  [7,] "1"  "B"  "C"
    #  [8,] "2"  "J"  "M"
    #  [9,] "2"  "J"  "O"
    # [10,] "2"  "K"  "N"
    # [11,] "2"  "L"  "P"
    # [12,] "2"  "L"  "Q"
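To also get the edges between primary inventors that the question asks for, something like the following might work. It is only a sketch: patent_edges is a hypothetical helper, not part of the answer above, and it assumes (as the desired output suggests) that the first listed inventor is linked to every other primary inventor on the same patent. The igraph step is skipped if the package is not installed.

```r
# Sketch: turn one patent row into a data frame of edges (patent, from, to).
patent_edges <- function(patent, inventors, coinventors) {
  # Primary inventors, split on ";" with surrounding whitespace removed
  prim <- trimws(strsplit(inventors, ";", fixed = TRUE)[[1]])
  # One coinventor group per primary inventor, each split on ","
  groups <- strsplit(trimws(strsplit(coinventors, ";", fixed = TRUE)[[1]]),
                     "\\s*,\\s*")
  stopifnot(length(prim) == length(groups))
  # Assumption: first primary inventor links to the other primaries,
  # then each primary inventor links to its own coinventors
  from <- c(rep(prim[1], length(prim) - 1),
            rep(prim, times = lengths(groups)))
  to   <- c(prim[-1], unlist(groups))
  data.frame(patent = rep(patent, length(from)), from = from, to = to,
             stringsAsFactors = FALSE)
}

# Example data, reconstructed from the question
dat <- data.frame(
  PatentID      = c("1", "2"),
  InventorIDs   = c("A ; B", "J ; K ; L"),
  CoinventorIDs = c("C,D,E ; F,G,H,C", "M,O ; N ; P, Q"),
  stringsAsFactors = FALSE
)

edges <- do.call(rbind, Map(patent_edges, dat$PatentID,
                            dat$InventorIDs, dat$CoinventorIDs))
rownames(edges) <- NULL
print(edges)

# Optional: build an undirected igraph graph, keeping the patent ID
# as an edge attribute (the first two columns define the edges)
if (requireNamespace("igraph", quietly = TRUE)) {
  g <- igraph::graph_from_data_frame(edges[, c("from", "to", "patent")],
                                     directed = FALSE)
  print(g)
}
```

Returning a data frame rather than a character matrix keeps the column names, so the result can be passed to graph_from_data_frame without further reshaping.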

Source: Creating an edgelist from Patent data in R
