How to backreference in Ruby regular expression (regex) with gsub when I use grouping?

Posted by 你说的曾经没有我的故事 on 2019-12-29 03:53:06

Question


I would like to patch some text data extracted from web pages. Sample:

t="First sentence. Second sentence.Third sentence."

There is no space after the period at the end of the second sentence. This tells me that the third sentence was on a separate line (after a br tag) in the original document.

I want to use a regexp to insert a "\n" character at the proper places and patch my text. My regex:

t2=t.gsub(/([.\!?])([A-Z1-9])/,$1+"\n"+$2)

But unfortunately it doesn't work: "NoMethodError: undefined method `+' for nil:NilClass". How can I properly backreference the matched groups? It was so easy in Microsoft Word, where I just had to use the \1 and \2 symbols.


Answer 1:


In the substitution string, \1 refers to capture group 1 (\2 to group 2, and so on).

t = "First sentence. Second sentence.Third sentence!Fourth sentence?Fifth sentence."
t.gsub(/([.!?])([A-Z1-9])/, "\\1\n\\2") # => "First sentence. Second sentence.\nThird sentence!\nFourth sentence?\nFifth sentence."
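The doubled backslash matters because of how Ruby reads string literals. Here is a minimal sketch of the two quoting styles, reusing the sample string from the question; the variable names `pattern`, `single` and `double` are only illustrative:

```ruby
t = "First sentence. Second sentence.Third sentence."
pattern = /([.!?])([A-Z1-9])/

# Single quotes: \1 and \2 reach gsub untouched; only "\n" needs double quotes.
single = t.gsub(pattern, '\1' + "\n" + '\2')

# Double quotes: the backslash is doubled so that gsub still sees \1 and \2.
double = t.gsub(pattern, "\\1\n\\2")

puts single == double
```

Both calls should give the same string, with a newline inserted only before "Third".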



Answer 2:


  • If you are using gsub(regex, replacement), use '\1', '\2', ... to refer to the captured groups. Put the replacement in single quotes, or, if you use double quotes, escape the backslash as in Joshua's answer. The conversion from '\1' to the matched text is done inside gsub, not by the string literal.
  • If you are using gsub(regex){replacement}, use $1, $2, ...

But for your case, it is easier not to use capture groups at all and to match the position with lookarounds instead:

t2 = t.gsub(/(?<=[.\!?])(?=[A-Z1-9])/, "\n")
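The attempt in the question fails because `$1+"\n"+$2` is evaluated as an ordinary argument before gsub runs, when no match has happened yet and `$1` is still nil. In the block form the block is called once per match, so `$1` and `$2` are already set. A small sketch comparing the block form with the lookaround form on the question's sample string (the variable names are illustrative):

```ruby
t = "First sentence. Second sentence.Third sentence."

# Block form: the block runs once per match, after $1 and $2 have been set.
with_block = t.gsub(/([.!?])([A-Z1-9])/) { "#{$1}\n#{$2}" }

# Lookaround form: matches the empty position between the two characters,
# so nothing is consumed and no backreference is needed.
with_lookaround = t.gsub(/(?<=[.!?])(?=[A-Z1-9])/, "\n")

puts with_block == with_lookaround
```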



Answer 3:


If you got here because of Rubocop complaining "Avoid the use of Perl-style backrefs." about $1, $2, etc., you can do this instead:

some_id = $1
# or
some_id = Regexp.last_match[1] if Regexp.last_match

some_id = $5
# or
some_id = Regexp.last_match[5] if Regexp.last_match
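For the gsub from the question, a Rubocop-friendly variant might look like the sketch below. It uses `Regexp.last_match(n)`, which returns nil when there was no match, so the trailing `if Regexp.last_match` guard is not needed. The `"id=42"` string and its pattern are made up purely for illustration:

```ruby
t = "First sentence. Second sentence.Third sentence."

# Inside the block, Regexp.last_match(n) reads the same data as $n.
t2 = t.gsub(/([.!?])([A-Z1-9])/) do
  "#{Regexp.last_match(1)}\n#{Regexp.last_match(2)}"
end

# Outside gsub: only read the group when the match succeeded.
# "id=42" is a hypothetical input, not taken from the question.
some_id = Regexp.last_match(1) if "id=42" =~ /id=(\d+)/

puts t2
puts some_id
```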

It'll also want you to do

%r{//}.match(some_string)

instead of

some_string[//]

Lame (Rubocop)



Source: https://stackoverflow.com/questions/12065707/how-to-backreference-in-ruby-regular-expression-regex-with-gsub-when-i-use-gro
