Question
I have a dataframe like so:
data = read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)
I want to compare each level of plot with the others to count the unique species shared by each pair of plots. However, I do not want comparisons of a plot with itself (i.e. exclude 1A_1A, 1B_1B, 2C_2C, etc.). The output for this example should look like this:
output<-
region plot freq
1 1A_1B 0
1 1A_1C 1
1 1A_1D 0
1 1B_1C 0
1 1B_1D 0
1 1C_1D 0
2 2A_2B 2
2 2A_2C 1
2 2A_2D 1
2 2B_2C 1
2 2B_2D 1
2 2C_2D 0
3 3A_3B 1
I have adapted the following code from @HubertL (Convert list of matrices to a single data frame), but I am struggling to incorporate an appropriate if/else statement to meet this condition:
library(tidyverse)
data %>% group_by(region, species) %>%
filter(n() > 1) %>%
summarize(y = list(combn(plot, 2, paste, collapse="_"))) %>%
unnest %>%
group_by(region, y) %>%
summarize(ifelse(plot[i] = plot[i], freq =
length(unique((species),)
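Answer 1:
You can drop repeated plots within each region/species group by adding filter(!duplicated(plot)) before building the combinations, so that combn() never pairs a plot with itself: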
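data %>%
  group_by(region, species) %>%
  filter(!duplicated(plot)) %>%   # keep each plot once per region/species
  filter(n() > 1) %>%             # a species must occur in at least two plots
  summarize(y = list(combn(plot, 2, paste, collapse = "_"))) %>%
  unnest(y) %>%                   # one row per plot pair
  group_by(region, y) %>%
  summarize(freq = n())           # number of species shared by each pair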
  region y      freq
   <int> <chr> <int>
1      1 1A_1C     1
2      2 2A_2B     2
3      2 2A_2C     1
4      2 2A_2D     1
5      2 2B_2C     1
6      2 2B_2D     1
7      3 3A_3B     1
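Note that this result only lists pairs that share at least one species, whereas the desired output in the question also includes pairs with a count of 0 (e.g. 1A_1B). If those rows are needed, one possible approach is to build every pair of distinct plots per region first and then count the intersecting species. The following is a minimal base R sketch, not part of the original answer; count_shared is a hypothetical helper name, and it assumes every region contains at least two plots (combn() fails otherwise).

```r
# Sample data from the question
data <- read.table(text = "region plot species
1 1A A_B
1 1A A_B
1 1B B_C
1 1C A_B
1 1D C_D
2 2A B_C
2 2A B_C
2 2A E_F
2 2B B_C
2 2B E_F
2 2C E_F
2 2D B_C
3 3A A_B
3 3B A_B", stringsAsFactors = FALSE, header = TRUE)

# Hypothetical helper: for one region, count the species shared by
# every pair of distinct plots (assumes the region has >= 2 plots).
count_shared <- function(d) {
  species_by_plot <- split(d$species, d$plot)   # species vector per plot
  pairs <- combn(names(species_by_plot), 2)     # 2 x k matrix, no self-pairs
  data.frame(
    region = d$region[1],
    plot   = paste(pairs[1, ], pairs[2, ], sep = "_"),
    # intersect() ignores duplicates, so repeated rows are counted once
    freq   = apply(pairs, 2, function(p) {
      length(intersect(species_by_plot[[p[1]]], species_by_plot[[p[2]]]))
    }),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to each region and stack the results
result <- do.call(rbind, lapply(split(data, data$region), count_shared))
rownames(result) <- NULL
result
```

Because combn() is applied to the unique plot names, same-plot comparisons such as 1A_1A never arise, so no if/else condition is required.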
Source: https://stackoverflow.com/questions/44808793/conditional-statement-in-dplyr-tidyverse-function-to-exclude-comparisons-among-s