I have a SQL table that maps, say, authors and books. I would like to group linked authors and books (books written by the same author, and authors who co-wrote a book) toge
Converting 500K nodes into an adjacency matrix was too much for my computer's memory, so I couldn't use igraph. The RBGL package isn't updated for R version 2.15.1, so that was out as well.
After writing a lot of dumb code that doesn't seem to work, I think the following gets me to the right answer.
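library(data.table)  # aubk is assumed to be a data.table with columns author_id and book_id
# Start by labelling every row with its author.
aubk[, grp := author_id]
iterations <- 0
repeat {
  # Push the smallest label across each author's rows, then across each book's rows.
  aubk[, grp := min(grp), by = author_id]
  aubk[, grp := min(grp), by = book_id]
  # Every book now carries a single label; the grouping is stable once every author does too.
  mixed <- aubk[, list(n = length(unique(grp))), by = author_id][, any(n > 1)]
  if (!mixed) {break}
  iterations <- iterations + 1
}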
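To sanity-check the idea on something small, here is a minimal sketch on a made-up seven-row table. It is not my real data: the `toy` data frame and the `label_groups` helper are hypothetical names introduced just for this example. Base R's `ave()` stands in for the grouped `:=` assignment so the snippet runs without data.table. The stopping rule (stop when a full pass changes no label) is a choice made for this sketch.

```r
# Made-up edge list: authors 1, 3, 2, 4 are chained together through
# books "b1", "b2", "b3"; author 7 wrote "b9" alone.
toy <- data.frame(
  author_id = c(1, 3, 3, 2, 2, 4, 7),
  book_id   = c("b1", "b1", "b2", "b2", "b3", "b3", "b9"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: spread the smallest label over authors and books
# until a full pass leaves every label unchanged.
label_groups <- function(author_id, book_id) {
  grp <- author_id
  repeat {
    old <- grp
    grp <- ave(grp, author_id, FUN = min)  # smallest label per author
    grp <- ave(grp, book_id, FUN = min)    # smallest label per book
    if (identical(grp, old)) break
  }
  grp
}

toy$grp <- label_groups(toy$author_id, toy$book_id)
print(toy)
```

On this table the four chained authors should all end up in group 1 and the lone author in group 7. It takes more than one pass for label 1 to reach the far end of the chain.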