Built Family nested tree parent / children relationship in R

余生颓废 提交于 2019-12-05 17:14:49

We build a recursive function to get the father line, from there everything is easy.

First we define the data with stringsAsFactors = FALSE for smoother reformatting.

family <- data.frame(person, father,stringsAsFactors = FALSE)

the function

father_line <- function(x){
dad <- subset(family,person==x)$father
if(is.na(dad)) return(x)
c(x,father_line(dad))
}

father_line ("Guillou Alan")
# [1] "Guillou Alan"   "Guillou Eric"   "Guillou Arthur"

Use it to get level and other things

family$father_line <- lapply(family$person,father_line)
family$level       <- lengths(family$father_line)
family$patriarch   <- sapply(family$father_line,tail,1)

#             person         father                                          father_line level      patriarch
# 1   Guillou Arthur           <NA>                                       Guillou Arthur     1 Guillou Arthur
# 2      Cleach Marc           <NA>                                          Cleach Marc     1    Cleach Marc
# 3     Guillou Eric Guillou Arthur                         Guillou Eric, Guillou Arthur     2 Guillou Arthur
# 4  Guillou Jacques Guillou Arthur                      Guillou Jacques, Guillou Arthur     2 Guillou Arthur
# 5    Cleach Franck    Cleach Marc                           Cleach Franck, Cleach Marc     2    Cleach Marc
# 6       Cleach Leo    Cleach Marc                              Cleach Leo, Cleach Marc     2    Cleach Marc
# 7    Cleach Herbet     Cleach Leo               Cleach Herbet, Cleach Leo, Cleach Marc     3    Cleach Marc
# 8     Cleach Adele  Cleach Herbet Cleach Adele, Cleach Herbet, Cleach Leo, Cleach Marc     4    Cleach Marc
# 9     Guillou Jean   Guillou Eric           Guillou Jean, Guillou Eric, Guillou Arthur     3 Guillou Arthur
# 10    Guillou Alan   Guillou Eric           Guillou Alan, Guillou Eric, Guillou Arthur     3 Guillou Arthur

For example to get stated expected output:

subset(family,patriarch == "Guillou Arthur",select=c(person,father,level))
#             person         father level
# 1   Guillou Arthur           <NA>     1
# 3     Guillou Eric Guillou Arthur     2
# 4  Guillou Jacques Guillou Arthur     2
# 9     Guillou Jean   Guillou Eric     3
# 10    Guillou Alan   Guillou Eric     3 

The tidyverse way it would look like this:

library(tidyverse)
family %>%
  mutate(family_line = map(person,father_line),
         level = lengths(family_line),
         patriarch = map(family_line,last)) %>%
  filter(patriarch == "Guillou Arthur") %>%
  select(person,father,level)

#            person         father level
# 1  Guillou Arthur           <NA>     1
# 2    Guillou Eric Guillou Arthur     2
# 3 Guillou Jacques Guillou Arthur     2
# 4    Guillou Jean   Guillou Eric     3
# 5    Guillou Alan   Guillou Eric     3

You can probably do this using graph tools. so using igraph, you can get neighbours using ego functions.

A quick sketch (which needs checking!)

library(igraph)

family[] = lapply(family, factor, levels=unique(unlist(family)))

g = graph_from_adjacency_matrix(table(family))

cg = connect.neighborhood(g, order=length(V(g)), mode="out")

cbind( V(cg)$name, 
       sapply(ego(g, mode="out", mindist=1), function(x) replace(names(x), length(names(x))==0, NA)),
       ego_size(cg, mode="out") )[grep("Guillou", V(cg)$name),]

[,1]                   [,2]             [,3]
[1,] "Guillou Arthur"  NA               "1" 
[2,] "Guillou Eric"    "Guillou Arthur" "2" 
[3,] "Guillou Jacques" "Guillou Arthur" "2" 
[4,] "Guillou Jean"    "Guillou Eric"   "3" 
[5,] "Guillou Alan"    "Guillou Eric"   "3"

In fact maybe you dont need to create a neighbourhood graph and can get by with:

cbind( V(g)$name, 
       sapply(ego(g, mode="out", mindist=1), function(x) replace(names(x), length(names(x))==0, NA)),
       ego_size(g, mode="out", order=length(V(g))) )[grep("Cleach", V(g)$name),]
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!