I want to extract the part of a string that comes before a certain word. E.g. I want to get everything before ", useless".
a <- "Experiment A, useless (03/
We can use sub to match the comma, followed by zero or more whitespace characters (\\s*), then the word 'useless' (\\b marks the word boundary) and any characters that follow (.*), and replace the whole match with an empty string ("").
sub(",\\s*useless\\b.*", "", a)
#[1] "Experiment A"
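As a rough sketch of how the same pattern behaves on a vector, the snippet below uses made-up sample strings (the vector `x_sub` and its contents are assumptions for illustration, not the data from the question). It shows that the optional whitespace is handled, that the word boundary keeps longer words such as "uselessness" from matching, and that strings without a match come back unchanged.

```r
# Hypothetical sample data, not taken from the question
x_sub <- c("Experiment A, useless (03/05)",
           "Experiment B,useless",
           "Experiment C",
           "Experiment D, uselessness")

# Remove the comma, optional whitespace, the word "useless", and the rest
cleaned <- sub(",\\s*useless\\b.*", "", x_sub)
cleaned
# [1] "Experiment A"              "Experiment B"
# [3] "Experiment C"              "Experiment D, uselessness"
```

Because sub() returns the input untouched when nothing matches, the result always has the same length as the input, which is convenient when cleaning a data frame column.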
sub('(.*),.*', '\\1', a, perl=TRUE)
#[1] "Experiment A"
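One caveat with the generic comma pattern: (.*) is greedy, so it captures everything up to the last comma in the string. A small sketch with a hypothetical string containing two commas (the variable `x_comma` is an assumption for illustration) shows the difference, along with a lazy variant that stops at the first comma.

```r
# Hypothetical string with a second comma after "useless"
x_comma <- "Experiment A, useless (03/05, rerun)"

# Greedy: (.*) runs up to the LAST comma
greedy <- sub("(.*),.*", "\\1", x_comma, perl = TRUE)
greedy
# [1] "Experiment A, useless (03/05"

# Lazy: (.*?) stops at the FIRST comma
lazy <- sub("(.*?),.*", "\\1", x_comma, perl = TRUE)
lazy
# [1] "Experiment A"
```

If the text before the keyword can itself contain commas, neither variant is safe, and anchoring the pattern on the keyword is the more robust choice.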
Lookahead is made for this:
b <- regexpr(".*(?=, useless)", a, perl=TRUE)
regmatches(a, b)
## [1] "Experiment A"
.* matches any sequence of characters, but the lookahead (?=, useless) restricts the match to text that is immediately followed by the string ", useless". The lookahead only checks what comes next; it is not included in the match itself.
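One difference from sub() is worth keeping in mind: regmatches() drops elements that did not match, so the result can be shorter than the input. The following sketch, using hypothetical sample strings (`x_look` is an assumption for illustration), shows this and one possible way to keep the result aligned with the input by filling non-matches with NA.

```r
# Hypothetical sample data: the second string has no ", useless"
x_look <- c("Experiment A, useless (03/05)", "Experiment B")

m <- regexpr(".*(?=, useless)", x_look, perl = TRUE)

# Only matching elements are returned
matched <- regmatches(x_look, m)
matched
# [1] "Experiment A"

# Keep the original length: regexpr() returns -1 where there is no match
aligned <- rep(NA_character_, length(x_look))
aligned[m != -1] <- regmatches(x_look, m)
aligned
# [1] "Experiment A" NA
```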
# (\\w*) captures only the word right before ", useless"; sub() keeps the text before the match
sub("(\\w*), useless.*", "\\1", a)