I'm using R and I have a data.frame with nearly 2,000 entries that looks as follows:
> head(PVs,15)
LogFreq Word PhonCV FreqDev
1593 140 w
Method 1
You can use grepl with an appropriate regular expression. Consider the following:
x <- c("blank","wade","waste","rubbish","dedekind","bated")
grepl("^.+(de|te)$",x)
[1] FALSE TRUE TRUE FALSE FALSE FALSE
The regular expression says: begin (^) with any character, one or more times (.+), then find either de or te ((de|te)), then end ($).
So for your data.frame, try:
subset(PVs,grepl("^.+(de|te)$",Word))
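One detail of the pattern is worth checking against your data: because .+ needs at least one character before the ending, a word that is exactly "de" or "te" is not matched. Here is a minimal sketch, assuming you would want such words included as well, that anchors only the end of the string. The vector y is made up for illustration.

```r
# y is a made-up vector, not taken from the question's data
y <- c("de", "te", "wade", "deep")

# .+ requires at least one character before the de/te ending
with_prefix <- grepl("^.+(de|te)$", y)

# anchoring only the end also accepts the bare two-letter strings
end_only <- grepl("(de|te)$", y)
```

Here with_prefix is TRUE only for "wade", while end_only is TRUE for "de", "te" and "wade". Which of the two you want depends on whether two-letter words can occur in Word at all.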
Method 2
To avoid regular expressions, you can use substr instead.
# substr the last two characters and test
substr(x,nchar(x)-1,nchar(x)) %in% c("de","te")
[1] FALSE TRUE TRUE FALSE FALSE FALSE
So try:
subset(PVs,substr(Word,nchar(Word)-1,nchar(Word)) %in% c("de","te"))
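One caveat, in case it applies to your data: if Word was read in as a factor (the default for strings in data frames before R 4.0.0), nchar() stops with an error, whereas grepl coerces the factor itself. The sketch below assumes that situation and uses a small made-up data frame called toy; only the column names Word and LogFreq come from the question, and the values are invented.

```r
# toy is a made-up data frame; the values are not from the question
toy <- data.frame(
  Word = c("blank", "wade", "waste", "rubbish"),
  LogFreq = c(1, 2, 3, 4),
  stringsAsFactors = TRUE  # mimic the pre-4.0.0 default
)

# Method 1: grepl accepts a factor and coerces it to character
by_regex <- subset(toy, grepl("^.+(de|te)$", Word))

# Method 2: nchar() rejects factors, so convert to character first
w <- as.character(toy$Word)
by_substr <- toy[substr(w, nchar(w) - 1, nchar(w)) %in% c("de", "te"), ]

# both methods keep the same rows
same_rows <- identical(rownames(by_regex), rownames(by_substr))
```

If Word is already a character column, the as.character() call is harmless and the Method 2 line above works unchanged.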