Question
I am currently building a small text editor for a custom file format. I have a GUI, but I also implemented a small output console. I want to add a very basic input field to execute commands and pass parameters to them. A command would look like this:
compile test.json output.bin -location "Paris, France" -author "Charles \"Demurgos\""
My problem is getting an array of the space-separated arguments while preserving the double-quoted parts, each of which may be a string generated by JSON.stringify and may therefore contain escaped double quotes.
To be clear, the expected array for the previous command is:
[
'compile',
'test.json',
'output.bin',
'-location',
'"Paris, France"',
'-author',
'"Charles \\"Demurgos\\""'
]
Then I can iterate over this array and apply JSON.parse to every element for which indexOf('"') == 0 to get the final result:
[
'compile',
'test.json',
'output.bin',
'-location',
'Paris, France',
'-author',
'Charles "Demurgos"'
]
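The unquoting step described above could be sketched as follows. This is only an illustration under the assumption that every token starting with a double quote is a valid JSON string; the helper name unquoteTokens is hypothetical and not part of the original code.

```javascript
// Hypothetical helper: convert raw tokens into their final values.
// Assumption: any token that starts with '"' is a valid JSON string.
function unquoteTokens(tokens) {
  return tokens.map(function (token) {
    // Same test as indexOf('"') == 0 in the text above.
    return token.charAt(0) === '"' ? JSON.parse(token) : token;
  });
}

var rawTokens = [
  'compile',
  'test.json',
  'output.bin',
  '-location',
  '"Paris, France"',
  '-author',
  '"Charles \\"Demurgos\\""'
];

var finalArgs = unquoteTokens(rawTokens);
console.log(finalArgs);
```

Note that JSON.parse throws a SyntaxError on a quoted token that is not valid JSON, so a real input field would probably want to catch that and report it.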
Thanks to this question, "Split a string by commas but ignore commas within double-quotes using Javascript", I was able to get what I need as long as the arguments do NOT contain any double quotes. Here is the regex I got:
/(".*?"|[^"\s]+)(?=\s*|\s*$)/g
But it ends the current parameter as soon as it encounters a double quote, even an escaped one. How can I adapt this regex to distinguish escaped from unescaped double quotes? And what about edge cases such as the input action "windowsDirectory\\" otherArg? Here the backslash is itself escaped, so the double quote that follows it is not escaped and should end the argument.
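To make the failure concrete, here is a minimal reproduction. It assumes the command string from the top of the question and uses illustrative variable names (command, naivePattern, naiveTokens) that do not come from the fiddle.

```javascript
// The command from the question. Each backslash is doubled in the JS
// literal so that the string contains \" exactly as typed by the user.
var command = 'compile test.json output.bin -location "Paris, France" ' +
  '-author "Charles \\"Demurgos\\""';

// The regex from the question. The lazy ".*?" stops at the first double
// quote it finds, whether or not a backslash precedes it.
var naivePattern = /(".*?"|[^"\s]+)(?=\s*|\s*$)/g;

var naiveTokens = command.match(naivePattern);

// The first six tokens are fine; the author value is split into three:
// '"Charles \\"', 'Demurgos\\' and '""'
console.log(naiveTokens.slice(6));
```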
This is a problem I have avoided for as long as possible in previous projects, but I feel it's time for me to learn how to handle escape characters properly.
Here is a JS-Fiddle: http://jsfiddle.net/GwY8Y/1/ You can see that the beginning is parsed correctly, but the last argument is split into pieces.
Thank you for any help.
Answer 1:
This regex will give you the strings you need (see demo):
"(?:\\"|\\\\|[^"])*"|\S+
Use it like this:
your_array = subject.match(/"(?:\\"|\\\\|[^"])*"|\S+/g);
Explanation of the regex:
"            # '"'
(?:          # group, but do not capture (0 or more times,
             # matching as many as possible):
  \\         #   '\'
  "          #   '"'
 |           #  OR
  \\\\       #   two backslashes
 |           #  OR
  [^"]       #   any character except '"'
)*           # end of grouping
"            # '"'
|            # OR
\S+          # non-whitespace (all but \n, \r, \t, \f,
             # and " "), 1 or more times, matching as
             # many as possible
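Putting the pieces together, a rough end-to-end sketch might look like this. The regex is the one given above; the names TOKEN_PATTERN and parseCommand are hypothetical, and the JSON.parse step assumes that every quoted token is a valid JSON string, as described in the question.

```javascript
// The regex from the answer: a double-quoted string (allowing \" and \\
// inside it), or otherwise a run of non-whitespace characters.
var TOKEN_PATTERN = /"(?:\\"|\\\\|[^"])*"|\S+/g;

// Hypothetical helper: split a command line into tokens, then unquote
// the tokens that start with a double quote.
function parseCommand(input) {
  var tokens = input.match(TOKEN_PATTERN) || []; // match() returns null on no match
  return tokens.map(function (token) {
    return token.charAt(0) === '"' ? JSON.parse(token) : token;
  });
}

// The command from the question (backslashes doubled for the JS literal).
var args = parseCommand(
  'compile test.json output.bin -location "Paris, France" ' +
  '-author "Charles \\"Demurgos\\""'
);
console.log(args);

// The edge case from the question: an escaped backslash right before the
// closing quote. The user-typed text is: action "windowsDirectory\\" otherArg
var edge = parseCommand('action "windowsDirectory\\\\" otherArg');
console.log(edge);
```

One caveat with this sketch: a quoted token containing a backslash sequence that JSON does not allow (for example "C:\dir") still matches the regex, but JSON.parse will throw on it, so the caller may want to wrap the call in try/catch.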
Source: https://stackoverflow.com/questions/24069344/split-spaces-avoiding-double-quoted-js-strings-from-a-b-c-d-to-a