Question
I have a character vector, including some elements that are duplicates e.g.
v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
And another vector that includes single counts of those characters e.g.
x <- c("d10", "d11", "d13")
I want to remove only the first occurrence of each element in x from the vector v. In this example, d13 occurs once in x and twice in v, but only the first match is removed from v and the duplicate is kept. Thus, I want to end up with:
"d09", "d01", "d02", "d13"
I've been trying various things, e.g. z <- v[!(v %in% x)], but it keeps removing all instances of the characters in x, not just the first, so I end up with this instead:
"d09", "d01", "d02"
What can I do to only remove one instance of a duplicated element?
Answer 1:
You can use match and negative indexing.
v[-match(x, v)]
produces
[1] "d09" "d01" "d02" "d13"
match only returns the location of the first match of a value, which we use to our advantage here.
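One caveat with the negative-indexing approach: if some element of x is absent from v, match returns NA and the negative index fails, and if nothing matches at all, v[-integer(0)] returns an empty vector rather than v. A minimal sketch of a guarded version follows; remove_first is a hypothetical helper name, not something from base R, and "zzz" is a made-up value used only to show the absent case.

```r
# Hypothetical helper: remove the first occurrence in v of each element of x.
remove_first <- function(v, x) {
  idx <- match(x, v)                 # position of the first match, NA if absent
  idx <- idx[!is.na(idx)]            # ignore elements of x that are not in v
  if (length(idx) == 0L) return(v)   # v[-integer(0)] would give an empty vector
  v[-idx]
}

v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
x <- c("d10", "d11", "d13")

remove_first(v, x)                 # "d09" "d01" "d02" "d13"
remove_first(v, c("d10", "zzz"))   # "d09" "d11" "d13" "d01" "d02" "d13"
remove_first(v, character(0))      # v unchanged
```

When every element of x is known to be in v, as in the question, the plain v[-match(x, v)] gives the same result.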
Note that %in% and is.element are degenerate versions of match. Compare:
match(x, v) # [1] 6 2 3
match(x, v) > 0 # [1] TRUE TRUE TRUE
x %in% v # [1] TRUE TRUE TRUE
is.element(x, v) # [1] TRUE TRUE TRUE
The last three are all the same, and are basically the first coerced to logical (in fact, see the code for %in% and is.element). In doing so you lose key information, namely the location of the first match of each x value in v, and are left knowing only that the x values exist in v.
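To make the relationship concrete, here is a short sketch of how %in% relates to match when a value is missing. It relies on the nomatch argument of match; the vector y and the value "zzz" are made up for illustration.

```r
v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
y <- c("d10", "zzz")   # "zzz" is a made-up value that is absent from v

match(y, v)                      # 6 NA        (positions, NA when absent)
match(y, v, nomatch = 0L) > 0L   # TRUE FALSE  (positions collapsed to logical)
y %in% v                         # TRUE FALSE  (same result)
```

The position 6 is exactly the information that the logical versions throw away.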
The converse, v %in% x, answers a different question from the one you want: "which values in v are in x?" That won't meet your requirement, since every duplicate value in v satisfies that condition.
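The difference is easy to see by running both directions on the vectors from the question:

```r
v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
x <- c("d10", "d11", "d13")

which(v %in% x)   # 2 3 6 7  (every position in v holding a value from x, both "d13"s)
match(x, v)       # 6 2 3    (only the first position in v for each element of x)

v[!(v %in% x)]    # "d09" "d01" "d02"        (both "d13"s dropped)
v[-match(x, v)]   # "d09" "d01" "d02" "d13"  (second "d13" kept)
```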
Source: https://stackoverflow.com/questions/30129684/remove-first-occurrence-of-elements-in-a-vector-from-another-vector