Question
I would like to replace spaces with linebreaks (\n) in a fairly long character vector in R. However, I don't want to replace every space, only those where the preceding substring exceeds a certain number of characters (n).
Example:
mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks"
Now I want to insert linebreaks in mystring at every space on the condition that each substring has a length greater than 20 characters (nchar > 20).
Hence, the resulting string is supposed to look like this:
"this string is annoyingly\nlong and therefore I would\nlike to insert linebreaks"
Linebreaks (\n) were inserted after 25, 26 and 25 characters.
How can I achieve this?
Maybe something combining gsub and strsplit?
Answer 1:
You may use .{21,}?\s regex to match any 21 (since nchar > 20) chars or more, but as few as possible, up to the nearest whitespace:
> gsub("(.{21,}?)\\s", "\\1\n", mystring)
[1] "this string is annoyingly\nlong and therefore I would\nlike to insert linebreaks"
Details:
(.{21,}?) - Group 1, capturing any 21 chars or more, but as few as possible (as {21,}? is a lazy quantifier)
\\s - a whitespace
The replacement contains the backreference to Group 1 to reinsert the text before the whitespace, and the newline char (feel free to add CR, too, if needed).
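To reuse this with a different threshold, the pattern can be built from n instead of hard-coding 21. The following is only a minimal sketch: the function name insert_linebreaks and its arguments n and eol are hypothetical and not part of the original answer, and it assumes n is a single non-negative whole number.

```r
# Hypothetical helper (not from the original answer): replace the first
# whitespace that follows more than `n` characters with `eol`.
insert_linebreaks <- function(x, n = 20, eol = "\n") {
  # nchar > n means at least n + 1 characters before the whitespace
  min_chars <- as.integer(n) + 1L
  pattern <- sprintf("(.{%d,}?)\\s", min_chars)
  # \\1 reinserts the captured text; eol replaces the matched whitespace
  gsub(pattern, paste0("\\1", eol), x)
}

mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks"
wrapped <- insert_linebreaks(mystring, n = 20)

# Length of each resulting line
line_lengths <- nchar(strsplit(wrapped, "\n", fixed = TRUE)[[1]])
print(line_lengths)  # 25 26 25
```

Passing eol = "\r\n" would insert CR+LF instead of a bare newline, as suggested above.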
Source: https://stackoverflow.com/questions/41102417/replace-substring-every-n-characters-conditionally-insert-linebreaks-for-space