Java regex (java.util.regex). Search for dollar sign

Submitted by 不羁的心 on 2019-12-07 13:20:25

Question


I have a search string. When it contains a dollar symbol, I want to capture all the characters that follow it, up to but not including a dot or a subsequent dollar symbol. A subsequent dollar symbol would start the next match. So for either of these search strings:

"/bla/$V_N.$XYZ.bla";
"/bla/$V_N.$XYZ";

I would want to return:

  • V_N
  • XYZ

If the search string contains percent symbols, I also want to return what's between the pair of % symbols.

The following regex seems to do the trick for that:

 "%([^%]*?)%";

Meaning:

  • start and end with a %,
  • have a capture group, the (),
  • have a character class containing anything except a % symbol (the leading caret negates the class),
  • repeated, but not greedily: *?
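As a minimal sketch of how that pattern behaves, the class below collects group 1 from every match. The class name, method name, and sample input are illustrative assumptions, not part of the original post. Because the class [^%] already excludes %, the lazy *? and a greedy * should find the same matches here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PercentTokens {
    // Same pattern as in the post: a %, a lazily matched run of non-% characters, a closing %.
    private static final Pattern BETWEEN_PERCENTS = Pattern.compile("%([^%]*?)%");

    // Returns the text captured between each pair of % symbols, in order of appearance.
    static List<String> extract(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = BETWEEN_PERCENTS.matcher(input);
        while (m.find()) {
            tokens.add(m.group(1)); // group 1 is the part inside the parentheses
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical input, chosen only for illustration.
        System.out.println(extract("/bla/%FOO%/%BAR%.txt")); // prints [FOO, BAR]
    }
}
```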

Where some languages allow %1, %2 for capture groups, Java uses backslash-number syntax (\1, \2) instead. So, this string compiles and generates output.

I suspect the dollar symbol and dot need escaping, as they are special symbols:

  • $ usually anchors the end of the string
  • . is a metacharacter that matches any character.
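That suspicion can be checked with a small sketch that counts matches for the escaped and unescaped forms. The class and helper names are illustrative assumptions. Outside a character class, the escaped forms match the literal symbols, while the bare forms keep their special meaning.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeCheck {
    // Counts how many times the regex matches anywhere in the input.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";
        // Escaped: the Java string "\\$" is the regex \$, a literal dollar sign.
        System.out.println(countMatches("\\$", search)); // prints 2
        // Escaped: the regex \. is a literal dot.
        System.out.println(countMatches("\\.", search)); // prints 2
        // Unescaped dot: matches every character of this single-line input.
        System.out.println(countMatches(".", search) == search.length()); // prints true
        // Unescaped dollar: an anchor, so it matches once, at the end of the input.
        System.out.println(countMatches("$", search)); // prints 1
    }
}
```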

I have tried using double backslashes (\\):

  • both in character classes, e.g. [^\\.\\$%]
  • and using alternation, e.g. %|\\$

in attempts to combine this logic, but I can't seem to get anything to play ball.

I wonder if another pair of eyes can see how to solve this conundrum!

My attempts so far:

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main {
    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";
        /* Either % or $ in the first capture group: ([%\\$])
         * Second capture group: anything except %, dot or dollar sign,
         * matched non-greedily: ([^%\\.\\$]*?)
         * Then an optional backreference to the first capture group: \\1?
         * Two backslashes are needed, since \ must itself be escaped in a Java string.
         */
        String pattern = "([%\\$])([^%\\.\\$]*?)\\1?";
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(search);
        List<String> results = new ArrayList<String>();
        while (m.find()) {
            // Group 0 is the whole match; groups 1 and 2 are the capture groups.
            for (int i = 0; i <= m.groupCount(); i++) {
                results.add(m.group(i));
            }
        }
        for (String result : results) {
            System.out.println(result);
        }
    }
}

The following links may be helpful:

  • An interactive Java playground where you can experiment and copy/paste code.
  • Regex101
  • Java RegexTester
  • Java backreferences (The optional backreference \\1 in the Regex).
  • Link that summarises Regex special characters often found in languages
  • Java Regex book EPub link
  • Regex Info Website
  • Matcher class in the Javadocs

Answer 1:


You may use

String search = "/bla/$V_N.$XYZ.bla";
String pattern = "[%$]([^%.$]*)";
Matcher matcher = Pattern.compile(pattern).matcher(search);
while (matcher.find()){
    System.out.println(matcher.group(1)); 
} // => V_N, XYZ

See the Java demo and the regex demo.

NOTE

  • You do not need the optional \1? at the end of the pattern. Because it is optional, it does not restrict the match context, and it is redundant here, as the negated character class already cannot match either $ or %.
  • [%$]([^%.$]*) matches % or $, then captures into Group 1 any zero or more chars other than %, . and $. You only need the Group 1 value, hence matcher.group(1) is used.
  • In a character class, neither . nor $ is special, so they do not need escaping in [^%.$] or [%$].
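These notes can be checked side by side with a small sketch. The class and helper names are illustrative assumptions. It runs the question's pattern, the answer's pattern, and an escaped variant of the answer's pattern against the same input. In the question's pattern the lazy *? is followed only by an optional element, so the regex engine is never forced to expand it and group 2 stays empty.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarTokens {
    // Collects the given capture group from every match of regex in input.
    static List<String> groups(String regex, int group, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group(group));
        }
        return out;
    }

    public static void main(String[] args) {
        String search = "/bla/$V_N.$XYZ.bla";

        // Pattern from the question: lazy *? followed by optional \1?, so group 2 is empty each time.
        System.out.println(groups("([%\\$])([^%\\.\\$]*?)\\1?", 2, search)); // prints [, ]

        // Pattern from the answer: greedy class, no backreference, no escapes inside the brackets.
        System.out.println(groups("[%$]([^%.$]*)", 1, search)); // prints [V_N, XYZ]

        // Escaping . and $ inside the brackets is allowed but gives the same result.
        System.out.println(groups("[%\\$]([^%\\.\\$]*)", 1, search)); // prints [V_N, XYZ]
    }
}
```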


Source: https://stackoverflow.com/questions/58821727/java-regex-java-util-regex-search-for-dollar-sign
