Question
What's the recommended way to parse a shell-like command line in Java? By that I don't mean processing the options once they are already in array form (e.g. handling "-x" and such); there are loads of questions and answers about that already.
No, I mean the splitting of a full command string into "tokens". I need to convert a string such as:
user 123712378 suspend "They are \"bad guys\"" Or\ are\ they?
...to the list/array:
user
123712378
suspend
They are "bad guys"
Or are they?
I'm currently just doing a split on whitespace, but that obviously can't handle the quotes and escaped spaces.
(Quote handling is most important. Escaped spaces would be nice-to-have)
Note: My command string is the input from a shell-like web interface. It is not built from main(String[] args).
Answer 1:
What you need is a finite automaton (a state machine). Read the string character by character and choose the next state based on the current character and the one before it.
For example, a " indicates the start of a quoted string, but if it is preceded by a \ it leaves the current state unchanged, and reading continues until the next character that moves you to a different state.
In other words, your example essentially has these transitions:
read string -> read number
^ - - - |
You would of course need to define all the states and the special characters that do or do not change the state.
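The state machine described above can be sketched roughly as follows. This is a minimal illustration, not a full shell parser: the class name CommandLineSplitter and its split method are hypothetical names chosen for this example, only double quotes and backslash escapes are handled (no single quotes, no variable expansion), and a backslash is assumed to escape any following character, which is simpler than what a real Bourne shell does inside double quotes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration: splits a command line into tokens. */
final class CommandLineSplitter {

    private enum State { NORMAL, IN_QUOTES }

    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        State state = State.NORMAL;
        boolean escaped = false;      // previous character was a backslash
        boolean tokenStarted = false; // lets an empty "" still produce a token

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (escaped) {
                // Assumption: an escaped character is taken literally in either state.
                current.append(c);
                escaped = false;
            } else if (c == '\\') {
                escaped = true;
                tokenStarted = true;
            } else if (c == '"') {
                // A bare quote toggles the state and is not part of the token.
                state = (state == State.NORMAL) ? State.IN_QUOTES : State.NORMAL;
                tokenStarted = true;
            } else if (state == State.NORMAL && Character.isWhitespace(c)) {
                // Unquoted whitespace ends the current token, if there is one.
                if (tokenStarted) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    tokenStarted = false;
                }
            } else {
                current.append(c);
                tokenStarted = true;
            }
        }
        if (escaped || state == State.IN_QUOTES) {
            throw new IllegalArgumentException("Unterminated quote or trailing backslash: " + line);
        }
        if (tokenStarted) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The example string from the question, written as a Java literal.
        String input = "user 123712378 suspend \"They are \\\"bad guys\\\"\" Or\\ are\\ they?";
        split(input).forEach(System.out::println);
    }
}
```

Rejecting unterminated quotes with an exception is a design choice for this sketch. A web interface might prefer to report the error to the user or to treat the rest of the line as one token.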
To be honest, I am not sure why you would want to provide such functionality to the end user.
Traditionally, CLI programs accept input in a standard format such as -x, --x, or --x=s.
This format is well known to a typical user and is simple to implement and to test for correctness.
If we are required to provide more "flexible" input for the user, it is usually best to build a GUI. That is what I would suggest.
Answer 2:
ArgumentTokenizer from DrJava parses a command line the way the Bourne shell and its derivatives do.
It properly supports escapes, so bash -c 'echo "\"escaped '\''single'\'' quote\""' is tokenized into [bash, -c, echo "\"escaped 'single' quote\""].
Answer 3:
Build the args[] back into a string, then tokenize using regexp:
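    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        public static void main(String[] args) {
            // Rebuild a single command line from the already-split arguments.
            String commandline = String.join(" ", args);
            System.out.println(commandline);

            // A token is either a run of non-whitespace that does not start with a quote,
            // or a "quoted section". Backslash escapes are not handled by this pattern.
            List<String> list = new ArrayList<String>();
            Matcher m = Pattern.compile("([^\"]\\S*|\".+?\")\\s*").matcher(commandline);
            while (m.find()) {
                list.add(m.group(1)); // Add .replace("\"", "") to remove surrounding quotes.
            }
            System.out.println(list);
        }
    }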
The latter part I took from here.
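The pattern above does not understand backslash escapes, so it cannot handle the \" and "\ " sequences from the question. As a hedged sketch of how a regex approach might be extended, the following uses one pattern to find each token and a second to strip bare quotes and unescape backslash sequences. The class name RegexCommandSplitter and both patterns are assumptions made for this illustration and are not taken from the answer. Note that an unterminated quote is silently skipped rather than reported.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration: regex-based command line splitting. */
final class RegexCommandSplitter {

    // One token is a run of: plain characters, backslash escapes, or "quoted sections".
    private static final Pattern TOKEN = Pattern.compile(
            "(?:[^\\s\"\\\\]|\\\\.|\"(?:[^\"\\\\]|\\\\.)*\")+");

    // Either a backslash escape (group 1 is the escaped character) or a bare quote.
    private static final Pattern ESCAPE_OR_QUOTE = Pattern.compile("\\\\(.)|\"");

    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            // "$1" keeps the escaped character; for a bare quote group 1 did not
            // participate in the match, so nothing is appended and the quote is dropped.
            tokens.add(ESCAPE_OR_QUOTE.matcher(m.group()).replaceAll("$1"));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The example string from the question, written as a Java literal.
        String input = "user 123712378 suspend \"They are \\\"bad guys\\\"\" Or\\ are\\ they?";
        split(input).forEach(System.out::println);
    }
}
```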
Source: https://stackoverflow.com/questions/16722259/splitting-a-command-line-in-java