Question
I would like to be able to parse strings like the following: "123456abcd9876az45678". The BNF is like this:
number: ? definition of an int ?
word: letter { , letter }
expression: number { , word , number }
However, the class java.util.Scanner doesn't allow me to do the following:
Scanner s = new Scanner("-123456abcd9876az45678");
System.out.println(s.nextInt());
while (s.hasNext("[a-z]+")) {
System.out.println(s.next("[a-z]+"));
System.out.println(s.nextInt());
}
Ideally, this should yield:
-123456
abcd
9876
az
45678
I was really hoping that java.util.Scanner would help me, but it looks like I will have to create my own scanner. Is there anything already present in the Java API to help me?
The question is missing too much information, so all the answers are valid for the question as asked but not for my actual problem.
Answer 1:
Unfortunately, as far as I know, the Scanner class cannot be used without delimiters. If you wish to ignore delimiters, you need to use the methods that do so, such as findInLine() or findWithinHorizon(). In your case, findWithinHorizon() would be appropriate.
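import java.util.Scanner;
import java.util.regex.Pattern;

Scanner s = new Scanner("-123456abcd9876az45678");
Pattern num = Pattern.compile("[+-]?\\d+");
Pattern letters = Pattern.compile("[A-Za-z]+");
// A horizon of 0 means the search is not bounded.
System.out.println(s.findWithinHorizon(num, 0));
String str;
// Alternate between a word and a number until no more letters are found.
while ((str = s.findWithinHorizon(letters, 0)) != null) {
    System.out.println(str);
    System.out.println(s.findWithinHorizon(num, 0));
}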
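As a rough sketch, the same loop can be wrapped in a helper that collects the tokens into a list instead of printing them. The class name HorizonTokenizer and the method tokenize are made up for illustration and are not part of the Java API. Note that findWithinHorizon without an anchor skips over characters that do not match, so this version silently ignores malformed input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Java API.
final class HorizonTokenizer {
    private static final Pattern NUM = Pattern.compile("[+-]?\\d+");
    private static final Pattern LETTERS = Pattern.compile("[A-Za-z]+");

    // Collects tokens in the order number, word, number, ... as in the question's grammar.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Scanner s = new Scanner(input);
        String num = s.findWithinHorizon(NUM, 0); // horizon 0: search without bound
        if (num == null) {
            return tokens;
        }
        tokens.add(num);
        String word;
        while ((word = s.findWithinHorizon(LETTERS, 0)) != null) {
            tokens.add(word);
            String next = s.findWithinHorizon(NUM, 0);
            if (next == null) {
                break; // a word with no trailing number; stop here
            }
            tokens.add(next);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-123456abcd9876az45678"));
    }
}
```

For the input from the question this should produce the five tokens -123456, abcd, 9876, az and 45678.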
Answer 2:
To use the scanner as a tokenizer, use findWithinHorizon with \G, which anchors each match at the current position (the end of the previous match), so no input is skipped.
Example supporting whitespace (as requested in the comments):
Scanner scanner = new Scanner(input);
while (true) {
    // \G anchors at the current position; \s* skips optional leading whitespace.
    String letters = scanner.findWithinHorizon("\\G\\s*[a-zA-Z]+", 0);
    if (letters != null) {
        System.out.println("letters: " + letters.trim());
    } else {
        String number = scanner.findWithinHorizon("\\G\\s*[+-]?[0-9]+", 0);
        if (number != null) {
            System.out.println("number: " + number.trim());
        } else if (scanner.findWithinHorizon("\\G\\s*\\Z", 0) != null) {
            // Only whitespace is left before the end of the input.
            System.out.println("end");
            break;
        } else {
            System.out.println("unrecognized input");
            break;
        }
    }
}
In real applications, you probably should compile the patterns upfront.
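A minimal sketch of what compiling the patterns upfront might look like, assuming the same three \G-anchored expressions. The class name AnchoredTokenizer and its tokenize method are hypothetical, and throwing IllegalArgumentException on unrecognized input is just one possible choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Java API.
final class AnchoredTokenizer {
    // Compiled once and reused; \G anchors each match at the scanner's current position.
    private static final Pattern LETTERS = Pattern.compile("\\G\\s*[a-zA-Z]+");
    private static final Pattern NUMBER = Pattern.compile("\\G\\s*[+-]?[0-9]+");
    private static final Pattern END = Pattern.compile("\\G\\s*\\Z");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Scanner scanner = new Scanner(input);
        while (true) {
            String token;
            if ((token = scanner.findWithinHorizon(LETTERS, 0)) != null) {
                tokens.add("letters: " + token.trim());
            } else if ((token = scanner.findWithinHorizon(NUMBER, 0)) != null) {
                tokens.add("number: " + token.trim());
            } else if (scanner.findWithinHorizon(END, 0) != null) {
                return tokens; // only whitespace remained
            } else {
                throw new IllegalArgumentException("unrecognized input");
            }
        }
    }

    public static void main(String[] args) {
        tokenize("-123456abcd9876az45678").forEach(System.out::println);
    }
}
```

Unlike an unanchored search, this variant stops at the first character that is neither a letter, a digit, a sign, nor whitespace instead of skipping it.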
Answer 3:
You can achieve this using the Pattern and Matcher classes. See this example.
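One way this could look with Pattern and Matcher is sketched below. The class name MatcherTokenizer and the single alternation pattern are assumptions made for illustration, not the example the answer originally linked to. Matcher.find() also skips characters that match neither alternative, and this sketch does not check that numbers and words alternate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Java API.
final class MatcherTokenizer {
    // One alternation: a signed integer or a run of lowercase letters.
    private static final Pattern TOKEN = Pattern.compile("[+-]?\\d+|[a-z]+");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // Each find() continues from the end of the previous match.
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("-123456abcd9876az45678"));
    }
}
```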
Answer 4:
You could set the delimiter to a pattern that can't match anything, e.g.
Scanner s = ...
s.useDelimiter("(?!a)a"); // negative lookahead: an 'a' that is not an 'a' never matches
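A small sketch of the effect, assuming a delimiter such as (?!a)a that can never match. The class NoDelimiterDemo and the method firstToken are hypothetical names. With such a delimiter the whole input becomes a single token, so nextInt() still cannot split it, and the individual pieces would have to be pulled out with something like findWithinHorizon.

```java
import java.util.Scanner;

// Hypothetical demo class, not part of the Java API.
final class NoDelimiterDemo {
    // Returns the first token when the delimiter pattern can never match.
    static String firstToken(String input) {
        Scanner s = new Scanner(input);
        s.useDelimiter("(?!a)a"); // lookahead forbids 'a', then requires 'a'
        return s.next();
    }

    public static void main(String[] args) {
        // Whitespace is no longer a delimiter, so the entire string comes back.
        System.out.println(firstToken("-123456 abcd 9876"));
    }
}
```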
Source: https://stackoverflow.com/questions/4798902/scanner-without-delimiter