How to create new columns in a data.frame based on row values in R?

邮差的信 submitted on 2021-01-28 19:25:46

Question


Hi,

I have a data.frame of family trios, and I would like to add columns listing the full sibs of every "id" (= offspring).

My data:

df
         id    dam    sire
1:    83295  67606   79199
2:    83297  67606   79199
3:    89826  67606   79199

What I would like to retrieve:

df2
         id    dam    sire     fs1     fs2
1:    83295  67606   79199   83297   89826  
2:    83297  67606   79199   83295   89826  
3:    89826  67606   79199   83295   83297  

What I’ve tried:

(similar to: How to transform a dataframes row into columns in R?)

library(dplyr)
library(splitstackshape)

df2 <- df %>%
  group_by(dam,sire) %>%
  summarise(id = toString(id)) %>%
  cSplit("id") %>%
  setNames(paste0("fs_", 1:ncol(.)))

colnames(df2) <- c("dam", "sire", "id", "fs1", "fs2")

This gives me only one row per parent pair, instead of one row for every "id":

df2
     dam    sire       id      fs1     fs2
1: 67606   79199    83295    83297    89826  

In some cases there will be no full sibs, and in some cases there will be 15.

Thanks in advance for your advice! :)


Answer 1:


We can group by dam and sire, use setdiff to get all ids in the group except the current one, and then use cSplit to split the comma-separated values into separate columns.

library(splitstackshape)
library(dplyr)

df %>%
  group_by(dam, sire) %>%
  mutate(fs = purrr::map_chr(id, ~toString(setdiff(id, .x)))) %>%
  cSplit("fs")

#      id   dam  sire  fs_1  fs_2
#1: 83295 67606 79199 83297 89826
#2: 83297 67606 79199 83295 89826
#3: 89826 67606 79199 83295 83297

data

df <- structure(list(id = c(83295L, 83297L, 89826L), dam = c(67606L, 
67606L, 67606L), sire = c(79199L, 79199L, 79199L)), class = "data.frame",
row.names = c("1:", "2:", "3:"))
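For comparison, here is a minimal base R sketch of the same idea (group on dam and sire, then drop the current id with setdiff) that avoids the extra packages. It is not part of the original answer. The fourth row (id 90001 with its own dam and sire) is a hypothetical offspring added only to illustrate the "no full sibs" case, and the helper names family, sibs, max_sibs and fs are made up for this example. The sketch assumes that at least one family in the data has two or more offspring.

```r
# Data from the question, plus one hypothetical offspring (id 90001,
# an assumption for illustration) that has no full sibs.
df <- data.frame(
  id   = c(83295L, 83297L, 89826L, 90001L),
  dam  = c(67606L, 67606L, 67606L, 70000L),
  sire = c(79199L, 79199L, 79199L, 71000L)
)

# One key per dam/sire pair.
family <- paste(df$dam, df$sire)

# For each row, collect the ids of the other offspring in the same family.
sibs <- lapply(seq_len(nrow(df)), function(i) {
  setdiff(df$id[family == family[i]], df$id[i])
})

# Pad every sibling vector with NA up to the largest number of sibs,
# then stack the padded vectors into a matrix with one row per offspring.
# Assumes max_sibs >= 1, i.e. at least one family has full sibs.
max_sibs <- max(lengths(sibs))
fs <- do.call(rbind, lapply(sibs, function(s) s[seq_len(max_sibs)]))
colnames(fs) <- paste0("fs", seq_len(max_sibs))

# Attach the sibling columns to the original data.
df2 <- cbind(df, fs)
print(df2)
```

With this input, id 83295 should end up with 83297 and 89826 in fs1 and fs2, and the hypothetical id 90001 should get NA in both columns. The number of fs columns follows the largest family, so a family with 15 full sibs would produce fs1 to fs15.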


Source: https://stackoverflow.com/questions/58906178/how-to-create-new-columns-in-a-data-frame-based-on-row-values-in-r
