Scanner without delimiter

Posted by 青春壹個敷衍的年華 on 2020-02-24 04:35:06

Question


I would like to be able to parse strings like the following: "123456abcd9876az45678". The BNF is like this:

number: ? definition of an int ?
word: letter { , letter }
expression: number { , word , number }

However, the class java.util.Scanner doesn't allow me to do the following:

Scanner s = new Scanner("-123456abcd9876az45678");
System.out.println(s.nextInt());
while (s.hasNext("[a-z]+")) {
    System.out.println(s.next("[a-z]+"));
    System.out.println(s.nextInt());
}

Ideally, this should yield:

-123456
abcd
9876
az
45678

I was really hoping that java.util.Scanner would help me, but it looks like I will have to create my own scanner. Is there anything already present in the Java API to help me?


The question is missing too much information, so all of the answers are valid for the question as asked but not for my actual problem.


Answer 1:


Unfortunately, as far as I know, the Scanner class cannot be used without a delimiter. To ignore delimiters, use one of the methods that does so, such as findInLine() or findWithinHorizon(). In your case, findWithinHorizon() would be appropriate.

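import java.util.Scanner;
import java.util.regex.Pattern;

Scanner s = new Scanner("-123456abcd9876az45678");
Pattern num = Pattern.compile("[+-]?\\d+");      // optionally signed integer
Pattern letters = Pattern.compile("[A-Za-z]+");  // run of letters
// A horizon of 0 means the search is not bounded; delimiters are ignored.
System.out.println(s.findWithinHorizon(num, 0));
String str;
// findWithinHorizon returns null once no further match exists.
while ((str = s.findWithinHorizon(letters, 0)) != null) {
    System.out.println(str);
    System.out.println(s.findWithinHorizon(num, 0));
}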



Answer 2:


To use the scanner as a tokenizer, call findWithinHorizon with a pattern that starts with \G. \G anchors the match at the end of the previous match, which here is the scanner's current position, so no input is skipped.

Example supporting whitespace (as requested in the comments):

Scanner scanner = new Scanner(input);  // input: the String to tokenize
while (true) {
  // \G pins each match to the current position; \s* allows leading whitespace.
  String letters = scanner.findWithinHorizon("\\G\\s*[a-zA-Z]+", 0);
  if (letters != null) {
    System.out.println("letters: " + letters.trim());
  } else {
    String number = scanner.findWithinHorizon("\\G\\s*[+-]?[0-9]+", 0);
    if (number != null) {
      System.out.println("number: " + number.trim());
    } else if (scanner.findWithinHorizon("\\G\\s*\\Z", 0) != null) {
      // Only optional whitespace remains before the end of input.
      System.out.println("end");
      break;
    } else {
      System.out.println("unrecognized input");
      break;
    }
  }
}

In real applications, you should probably compile the patterns up front.
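As a rough sketch of what compiling the patterns up front could look like, the variant below keeps the same \G-anchored approach but stores the patterns in static Pattern constants and collects the tokens in a list. The class name PrecompiledTokenizer, the method tokenize, and the choice to throw IllegalArgumentException on unrecognized input are assumptions made for illustration and are not part of the answer. Like the loop above, it only tokenizes and does not check that numbers and words alternate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class PrecompiledTokenizer {
    // Compiled once and reused; \G anchors each match at the scanner's current position.
    private static final Pattern LETTERS = Pattern.compile("\\G\\s*[a-zA-Z]+");
    private static final Pattern NUMBER = Pattern.compile("\\G\\s*[+-]?[0-9]+");
    private static final Pattern END = Pattern.compile("\\G\\s*\\Z");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Scanner scanner = new Scanner(input);
        while (true) {
            // Try a word first, then a number, both starting exactly at the current position.
            String token = scanner.findWithinHorizon(LETTERS, 0);
            if (token == null) {
                token = scanner.findWithinHorizon(NUMBER, 0);
            }
            if (token != null) {
                tokens.add(token.trim());
                continue;
            }
            // Neither matched: either only whitespace is left, or the input is invalid.
            if (scanner.findWithinHorizon(END, 0) != null) {
                return tokens;
            }
            throw new IllegalArgumentException("unrecognized input");
        }
    }

    public static void main(String[] args) {
        for (String token : tokenize("-123456abcd9876az45678")) {
            System.out.println(token);
        }
    }
}
```

Passing a compiled Pattern to findWithinHorizon avoids handing the scanner a pattern string on every loop iteration, which is the point of the remark above.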




Answer 3:


You can achieve this using the Pattern and Matcher classes. See this example.
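The example this answer linked to is not preserved here. As a stand-in, here is a minimal sketch of one way to do it with Pattern and Matcher alone: a single pattern matches either a signed integer or a run of letters, and lookingAt() is applied at successive positions so that no input is silently skipped. The class name MatcherTokenizer and the method tokenize are hypothetical, and this is not necessarily what the original example did.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
class MatcherTokenizer {
    // Either an optionally signed integer or a run of letters.
    private static final Pattern TOKEN = Pattern.compile("[+-]?\\d+|[A-Za-z]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Restrict matching to the unread part; lookingAt() must match at its start.
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unrecognized input at index " + pos);
            }
            tokens.add(m.group());
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize("-123456abcd9876az45678")) {
            System.out.println(token);
        }
    }
}
```

Using lookingAt() rather than find() means an unexpected character raises an error instead of being skipped. This sketch does not accept whitespace between tokens.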




Answer 4:


You could set the delimiter to a pattern that can't match anything, e.g.

Scanner s = ...
s.useDelimiter("(?!a)a");  // (?!a) forbids an 'a' here, then 'a' is required: never matches
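To illustrate what a never-matching delimiter does, the sketch below uses the pattern (?!a)a, a negative lookahead for 'a' immediately followed by a required 'a', which cannot match at any position. The class and method names are hypothetical. With such a delimiter the whole input becomes a single token, so nextInt() still cannot split it; the find methods would still be needed to pull out the individual pieces.

```java
import java.util.Scanner;

// Hypothetical demo class, named here only for illustration.
class NoDelimiterDemo {
    // Pattern that cannot match: the lookahead rejects 'a', then 'a' is required.
    private static final String NEVER = "(?!a)a";

    // Returns the first token when nothing can act as a delimiter.
    static String firstToken(String input) {
        Scanner s = new Scanner(input);
        s.useDelimiter(NEVER);
        return s.next();
    }

    // Reports whether the first token, i.e. the whole input, parses as an int.
    static boolean firstTokenIsInt(String input) {
        Scanner s = new Scanner(input);
        s.useDelimiter(NEVER);
        return s.hasNextInt();
    }

    public static void main(String[] args) {
        String input = "-123456abcd9876az45678";
        System.out.println(firstToken(input));       // the entire input as one token
        System.out.println(firstTokenIsInt(input));  // false: the token is not an integer
    }
}
```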


Source: https://stackoverflow.com/questions/4798902/scanner-without-delimiter
