I have a string in R as
x <- "The length of the word is going to be of nice use to me"
I want the first 10 words of the above specified string.
Regular expression (regex) answer using \w (word character) and its negation \W:
gsub("^((\\w+\\W+){9}\\w+).*$","\\1",x)
^ Beginning of the token (zero-width)
((\\w+\\W+){9}\\w+) Ten words separated by not-words.
    (\\w+\\W+){9} A word followed by not-a-word, 9 times
        \\w+ One or more word characters (i.e. a word)
        \\W+ One or more non-word characters (e.g. a space)
        {9} Nine repetitions
    \\w+ The tenth word
.* Anything else, including other following words
$ End of the token (zero-width)
\\1 When this token is found, replace it with the first captured group (the 10 words)
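To check the pattern against the string from the question, here is a minimal sketch. The variable names first10, short, short_out, words and first10_split are made up for illustration, and the strsplit() cross-check is a different technique from the regex answer, included only to compare results.

```r
x <- "The length of the word is going to be of nice use to me"

# Regex approach from the answer: keep the first ten words, drop the rest.
first10 <- gsub("^((\\w+\\W+){9}\\w+).*$", "\\1", x)
print(first10)
# [1] "The length of the word is going to be of"

# With fewer than ten words the pattern does not match,
# so gsub() returns the input unchanged.
short <- "only three words"
short_out <- gsub("^((\\w+\\W+){9}\\w+).*$", "\\1", short)
print(short_out)
# [1] "only three words"

# Cross-check (not part of the answer): split on runs of non-word
# characters, take the first ten pieces, and rejoin with single spaces.
words <- strsplit(x, "\\W+")[[1]]
first10_split <- paste(head(words, 10), collapse = " ")
print(identical(first10, first10_split))
# [1] TRUE
```

The two approaches agree here because the words are separated by single spaces. The regex version keeps the original separators between the words, while the split-and-paste version normalises them to one space, so the results can differ on input with punctuation or repeated spaces.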