In R: find all unique values in a comma-separated column

Posted by 北城以北 on 2021-02-13 17:27:04

Question


I have multiple observations of one species with different observers / groups of observers and want to create a list of all unique observers. My data look like this:

data <- read.table(text="species observer
1 A,B
1 A,B
1 B,E
1 B,E
1 D,E,A,C,C
1 F", header = TRUE, stringsAsFactors = FALSE)

My output should return a list of all unique observers - so:

A,B,C,D,E,F

I tried splitting the observer column with the following command, but that only returns the unique combinations of observers, not the unique individual observers:

all_observers <- unique(strsplit(as.character(data$observer), ","))

all_observers
[[1]]
[1] "A" "B"

[[2]]
[1] "B" "E"

[[3]]
[1] "D" "E" "A" "C" "C"

[[4]]
[1] "F"

Answer 1:


You're almost there. strsplit() returns a list with one character vector per row, so you need to unlist() it into a single vector before calling unique():

all_observers <- unique(unlist(strsplit(as.character(data$observer), ",")))
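To get from that vector to the comma-separated string shown in the question, one option is to sort the unique values and collapse them with paste(). This is only a sketch on the sample data; the trimws() call is an added assumption for inputs that have spaces after the commas (the sample has none), and observer_string is just an illustrative name:

```r
# Sample data from the question
data <- read.table(text = "species observer
1 A,B
1 A,B
1 B,E
1 B,E
1 D,E,A,C,C
1 F", header = TRUE, stringsAsFactors = FALSE)

# Split every entry on commas, flatten the resulting list into one vector,
# trim stray whitespace (assumption: not needed for the sample data),
# and keep each observer once
observers <- unique(trimws(unlist(strsplit(data$observer, ",", fixed = TRUE))))

# Sort alphabetically and collapse into a single comma-separated string
observer_string <- paste(sort(observers), collapse = ",")
print(observer_string)
# [1] "A,B,C,D,E,F"
```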



Answer 2:


We can use separate_rows() to split 'observer' into one row per observer, keep the distinct rows, group by 'species', and collapse the observers back into a single string with toString():

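library(tidyverse)
data %>% 
   # one row per observer (splits on the commas)
   separate_rows(observer) %>% 
   # drop duplicated species/observer pairs
   distinct() %>% 
   group_by(species) %>% 
   # collapse the observers of each species into one ", "-separated string
   summarise(observer = toString(observer))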
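If the tidyverse is not available, roughly the same per-species result can be sketched in base R with split() and sapply(). The data frame obs below is made up for illustration (it adds a hypothetical second species so the grouping is visible), and per_species is just an illustrative name. Unlike the pipeline above, this sketch also sorts the observers:

```r
# Hypothetical data with two species, in the same layout as the question
obs <- data.frame(
  species  = c(1, 1, 2),
  observer = c("A,B", "B,E", "F,A,F"),
  stringsAsFactors = FALSE
)

# Group the observer strings by species, then for each group:
# split on commas, flatten, deduplicate, sort, and join with ", "
per_species <- sapply(
  split(obs$observer, obs$species),
  function(x) toString(sort(unique(unlist(strsplit(x, ",", fixed = TRUE)))))
)
print(per_species)
```

The result is a named character vector with one element per species.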



Answer 3:


You could also use scan()

unique(scan(text=data$observer, what="", sep=","))
# Read 14 items
# [1] "A" "B" "E" "D" "C" "F"
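As a small variation, scan() also accepts quiet = TRUE to suppress the "Read 14 items" message and strip.white = TRUE to drop blanks around each field. The sketch below rebuilds the observer column as a plain vector so it runs on its own; strip.white is an added assumption for inputs with spaces after the commas:

```r
# The observer column from the question, as a plain character vector
observer <- c("A,B", "A,B", "B,E", "B,E", "D,E,A,C,C", "F")

# Read all comma-separated fields as character strings, silently,
# trimming surrounding whitespace, then keep each value once
all_observers <- unique(
  scan(text = observer, what = "", sep = ",", quiet = TRUE, strip.white = TRUE)
)
print(all_observers)
# [1] "A" "B" "E" "D" "C" "F"
```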


Source: https://stackoverflow.com/questions/54078407/in-r-find-all-unique-values-in-column-separated-by-comma
