Simple Java regex not working

坚强是说给别人听的谎言 posted on 2019-12-08 19:26:52

Question


I have this regex, which is supposed to remove sentence delimiters (. and ?):

sentence = sentence.replaceAll("\\.|\\?$","");

It works fine. It converts:

"I am Java developer." to "I am Java developer"

"Am I a Java developer?" to "Am I a Java developer"

But after deployment we found that it also removes every other dot in the sentence. For example:

"Hi.Am I a Java developer?" becomes "HiAm I a Java developer"

Why is this happening?


Answer 1:


The pipe (|) has the lowest precedence of all operators. So your regex:

\\.|\\?$

is being treated as:

(\\.)|(\\?$)

which matches a . anywhere in the string and matches a ? at the end of the string.
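To see this grouping in action, here is a minimal sketch that lists where each pattern matches in the question's sample sentence. The class name AlternationDemo and the helper matchStarts are made up for illustration and are not part of the original question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original post.
class AlternationDemo {
    // Returns the start index of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String input = "Hi.Am I a Java developer?";
        // Original pattern: the $ binds only to the ? branch,
        // so the dot at index 2 matches as well as the final ?.
        System.out.println(matchStarts("\\.|\\?$", input));      // prints [2, 24]
        // Grouped pattern: the $ applies to both branches.
        System.out.println(matchStarts("(?:\\.|\\?)$", input));  // prints [24]
    }
}
```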

To fix this you need to group the . and ? together as:

(?:\\.|\\?)$

You could also use:

[.?]$

Within a character class . and ? are treated literally so you need not escape them.
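As a quick check, the sketch below runs the original pattern and both fixes over the sentences from the question. The class and method names (StripTerminator, broken, grouped, charClass) are assumptions chosen for this example.

```java
// Hypothetical demo class, not from the original post.
class StripTerminator {
    // Original pattern: removes every dot, plus a trailing question mark.
    static String broken(String s) {
        return s.replaceAll("\\.|\\?$", "");
    }

    // Non-capturing group: the $ anchor now applies to both alternatives.
    static String grouped(String s) {
        return s.replaceAll("(?:\\.|\\?)$", "");
    }

    // Character class: . and ? are literal inside [], so no escaping is needed.
    static String charClass(String s) {
        return s.replaceAll("[.?]$", "");
    }

    public static void main(String[] args) {
        String s = "Hi.Am I a Java developer?";
        System.out.println(broken(s));    // prints HiAm I a Java developer
        System.out.println(grouped(s));   // prints Hi.Am I a Java developer
        System.out.println(charClass(s)); // prints Hi.Am I a Java developer
    }
}
```

Both fixes give the same result here; the character class is simply shorter to read.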




Answer 2:


What you're saying with "\\.|\\?$" is "either a period" or "a question mark as the last character".

I would recommend "[.?]$" instead in order to avoid the confusing escaping (and undesirable result, of course).




Answer 3:


Your problem is because of the low precedence of the alternation operator |. Your regular expression means match one of:

  • . anywhere in the string, or
  • ? at the end of the string.

Use a character class instead:

"[.?]$"



Answer 4:


You forgot to enclose the sentence-ending characters in parentheses:

sentence = sentence.replaceAll("(\\.|\\?)$","");

The better approach is to use [.?]$, as @Mark Byers suggested:

sentence = sentence.replaceAll("[.?]$","");
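One detail that may matter if the input ever spans several lines: by default, $ in Java anchors at the end of the whole input, not at the end of each line. The sketch below contrasts the default behaviour with Pattern.MULTILINE. The class name, method names, and sample text are assumptions for illustration only.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original post.
class DollarAnchorDemo {
    // Default mode: $ matches only at the end of the input,
    // so only the last terminator is removed.
    static String stripAtInputEnd(String text) {
        return text.replaceAll("[.?]$", "");
    }

    // MULTILINE mode: $ also matches just before each line terminator,
    // so the terminator at the end of every line is removed.
    static String stripAtEachLineEnd(String text) {
        return Pattern.compile("[.?]$", Pattern.MULTILINE)
                .matcher(text)
                .replaceAll("");
    }

    public static void main(String[] args) {
        String text = "Is it late?\nYes.";
        System.out.println(stripAtInputEnd(text));    // only the final . is removed
        System.out.println(stripAtEachLineEnd(text)); // the ? and the . are removed
    }
}
```

For single sentences, as in the question, the default mode is all that is needed.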


Source: https://stackoverflow.com/questions/4041266/simple-java-regex-not-working
