Create Edge List From Ragged Data Frame in R (for network analysis)

Submitted by 自古美人都是妖i on 2021-01-28 19:45:08

Question


I have a ragged data frame with each row as an occurrence in time of one or more entities, like so:

(time1) entitya entityf entityz
(time2) entityg entityh
(time3) entityo entityp entityk entityL
(time4) entityM

I want to create an edge list for network analysis from a subset of entities found in a second vector (nodelist). My problem is that I don't know:

1). How to subset only the entities in the nodelist. I was considering

datanew<- subset(dataold, dataold %in% nodelist)

but it doesn't work.

2). How to turn the ragged data frame into a two-column edge list. In the above example, it would become:

entitya entityf
entitya entityz
entityz entityf
...

I have no idea how to do this. Any help is really appreciated!


Answer 1:


Try this:

# read your data 

dat <- strsplit(readLines(textConnection("(time1) entitya entityf entityz
(time2) entityg entityh
(time3) entityo entityp entityk entityL
(time4) entityM")), " ")

# remove the (time) label, i.e. the first token of each row

dat <- lapply(dat, `[`, -1)

# filter: keep only the entities that appear in nodelist

nodelist <- c("entitya", "entityf", "entityz", "entityg", "entityh",
              "entityo", "entityp", "entityk")

dat <- lapply(dat, intersect, nodelist)

# create an edge matrix

t(do.call(cbind, lapply(dat[sapply(dat, length) >= 2], combn, 2)))

This last step might be a lot to digest, so here is a breakdown:

  • sapply(dat, length) computes the lengths of your list elements
  • dat[... >= 2] only keeps the list elements with at least two items
  • lapply(..., combn, 2) creates all pairwise combinations within each element: a list of wide (two-row) matrices
  • do.call(cbind, ...) binds all the combinations into a wide matrix
  • t(...) transposes into a tall matrix
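
If you need this more than once, the steps above can be wrapped into a small function. This is only a sketch: the name ragged_to_edges and the column names from and to are made up for illustration and are not part of the original answer, and lengths() assumes R 3.2.0 or later. It returns a data frame rather than a matrix, which is the shape most network packages expect for an edge list.

```r
# Hypothetical helper (not from the original answer): turns a list of
# character vectors into a two-column edge list restricted to nodelist.
ragged_to_edges <- function(rows, nodelist) {
  # keep only the entities that appear in nodelist
  rows <- lapply(rows, intersect, nodelist)
  # a row with fewer than two entities cannot form an edge
  rows <- rows[lengths(rows) >= 2]
  if (length(rows) == 0) {
    return(data.frame(from = character(0), to = character(0),
                      stringsAsFactors = FALSE))
  }
  # combn(x, 2) gives one 2-row matrix per row; bind them side by side
  pairs <- do.call(cbind, lapply(rows, combn, 2))
  data.frame(from = pairs[1, ], to = pairs[2, ], stringsAsFactors = FALSE)
}

# Same data as in the question, already split and without the (time) labels
rows <- list(
  c("entitya", "entityf", "entityz"),
  c("entityg", "entityh"),
  c("entityo", "entityp", "entityk", "entityL"),
  c("entityM")
)
nodelist <- c("entitya", "entityf", "entityz", "entityg", "entityh",
              "entityo", "entityp", "entityk")

edges <- ragged_to_edges(rows, nodelist)
print(edges)
```

The early return covers the case where no row has two or more entities left after filtering, where do.call(cbind, ...) on an empty list would otherwise return NULL.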


Source: https://stackoverflow.com/questions/13782132/create-edge-list-from-ragged-data-frame-in-r-for-network-analysis
