Replace substring every >n characters (conditionally insert linebreaks for spaces)

Submitted by 妖精的绣舞 on 2019-12-10 04:33:23

Question


I would like to replace spaces with linebreaks (\n) in a fairly long character vector in R. However, I don't want to replace every space, only those where the preceding substring exceeds a certain number of characters (n).

Example:

mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks" 

Now I want to replace a space in mystring with a linebreak only where the substring before it is longer than 20 characters (nchar > 20).

Hence, the resulting string is supposed to look like this:

"this string is annoyingly\nlong and therefore I would\nlike to insert linebreaks"

Linebreaks (\n) were inserted after 25 and 26 characters, leaving a final piece of 25 characters.

How can I achieve this? Maybe something combining gsub and strsplit?


Answer 1:


You can use the regex (.{21,}?)\s to match 21 or more characters (since nchar > 20), but as few as possible, up to the next whitespace character:

> gsub("(.{21,}?)\\s", "\\1\n", mystring)
[1] "this string is annoyingly\nlong and therefore I would\nlike to insert linebreaks"
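As a quick check of the line lengths quoted in the question, the result can be split at the inserted linebreaks and measured. This is only a sketch; the variable names wrapped and pieces are made up for illustration.

```r
mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks"

# Apply the regex from the answer.
wrapped <- gsub("(.{21,}?)\\s", "\\1\n", mystring)

# Split at the inserted linebreaks and count the characters in each piece.
pieces <- strsplit(wrapped, "\n", fixed = TRUE)[[1]]
nchar(pieces)
# [1] 25 26 25
```

Every piece before a linebreak is longer than 20 characters, as requested.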

Details:

  • (.{21,}?) - Group 1, capturing 21 or more characters, but as few as possible ({21,}? is a lazy quantifier)
  • \\s - a whitespace character (\s, with the backslash doubled inside the R string literal)
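To see why the lazy quantifier matters, it may help to compare it with the greedy form on the same input. The sketch below assumes the default regex engine used by gsub; the variable names lazy and greedy are illustrative only.

```r
mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks"

# Lazy: stop at the first whitespace after at least 21 characters, then repeat.
lazy <- gsub("(.{21,}?)\\s", "\\1\n", mystring)

# Greedy: consume as much as possible, so only the last whitespace is replaced.
greedy <- gsub("(.{21,})\\s", "\\1\n", mystring)

lazy
# [1] "this string is annoyingly\nlong and therefore I would\nlike to insert linebreaks"
greedy
# [1] "this string is annoyingly long and therefore I would like to insert\nlinebreaks"
```

With the greedy quantifier the first match swallows almost the whole string, so only one linebreak ends up in the result.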

The replacement contains the backreference to Group 1 to reinsert the text before the whitespace, and the newline char (feel free to add CR, too, if needed).
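If the threshold or the line ending needs to vary, the same substitution could be wrapped in a small function. This is a minimal sketch, not part of the original answer: insert_linebreaks and its parameters n and eol are hypothetical names, and the pattern is simply built with sprintf.

```r
# insert_linebreaks() is a hypothetical helper name, not a base R function.
# x:   character vector to process
# n:   a space is replaced only after more than n characters
# eol: line ending to insert, e.g. "\n" or "\r\n"
insert_linebreaks <- function(x, n = 20L, eol = "\n") {
  # At least n + 1 characters, matched lazily, followed by one whitespace character.
  pattern <- sprintf("(.{%d,}?)\\s", as.integer(n) + 1L)
  # Reinsert the captured text and put the line ending in place of the whitespace.
  gsub(pattern, paste0("\\1", eol), x)
}

mystring <- "this string is annoyingly long and therefore I would like to insert linebreaks"

writeLines(insert_linebreaks(mystring))
# this string is annoyingly
# long and therefore I would
# like to insert linebreaks

# Same breaks, but with CR + LF line endings.
crlf <- insert_linebreaks(mystring, eol = "\r\n")
```

Because gsub is vectorised over x, the helper also works on a character vector with several elements.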



Source: https://stackoverflow.com/questions/41102417/replace-substring-every-n-characters-conditionally-insert-linebreaks-for-space
