Split spaces avoiding double-quoted JS strings: from 'a "b \\" c" d' to ['a','"b \\" c"','d']

Posted by 假如想象 on 2019-12-11 02:05:22

Question


I am currently building a small text editor for a custom file format. I have a GUI, but I also implemented a small output console. I want to add a very basic input field for executing commands and passing parameters. A command would look like this:

compile test.json output.bin -location "Paris, France" -author "Charles \"Demurgos\""

My problem is getting an array of the space-separated arguments while preserving the double-quoted parts, which may be strings generated by JSON.stringify and may therefore contain escaped double quotes.

To be clear, the expected array for the previous command is:

[
    'compile',
    'test.json',
    'output.bin',
    '-location',
    '"Paris, France"',
    '-author',
    '"Charles \\"Demurgos\\""'
]

Then I can iterate over this array and apply JSON.parse to every item where indexOf('"') == 0 to get the final result:

[
    'compile',
    'test.json',
    'output.bin',
    '-location',
    'Paris, France',
    '-author',
    'Charles "Demurgos"'
]
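The decoding step described above can be sketched as follows. This is only a minimal illustration: `decodeArgs` is a hypothetical helper name, and it assumes every token that starts with a double quote is a valid JSON string literal (JSON.parse throws otherwise).

```javascript
// Hypothetical helper: turn quoted tokens into their decoded string values.
function decodeArgs(tokens) {
  return tokens.map(function (token) {
    // Assumption: a token starting with '"' is a complete JSON string literal.
    return token.indexOf('"') === 0 ? JSON.parse(token) : token;
  });
}

// The already-split tokens from the expected array above.
var tokens = [
  'compile',
  'test.json',
  'output.bin',
  '-location',
  '"Paris, France"',
  '-author',
  '"Charles \\"Demurgos\\""'
];

var args = decodeArgs(tokens);
console.log(args[4]); // Paris, France
console.log(args[6]); // Charles "Demurgos"
```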

Thanks to this question: Split a string by commas but ignore commas within double-quotes using Javascript, I was able to get what I need as long as the arguments do NOT contain any double quotes. Here is the regex I got:

/(".*?"|[^"\s]+)(?=\s*|\s*$)/g

But it ends the current argument as soon as it encounters a double quote, even an escaped one. How can I adapt this regex to distinguish escaped double quotes from unescaped ones? There are also edge cases: if I enter action "windowsDirectory\\" otherArg, the backslash is itself escaped, so the double quote that follows it is not escaped and should end the argument. This is a problem I avoided for as long as possible in previous projects, but I feel it's time for me to learn how to handle escape characters properly.

Here is a JSFiddle: http://jsfiddle.net/GwY8Y/1/ You can see that the beginning is parsed correctly, but the last argument is split incorrectly.
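The same failure can be reproduced outside the fiddle. The snippet below is a minimal sketch using the regex from the question; the variable names `command`, `naive` and `parts` are only illustrative.

```javascript
// The example command, written as a JS string literal
// (each \\ below is a single backslash in the actual string).
var command = 'compile test.json output.bin -location "Paris, France" ' +
              '-author "Charles \\"Demurgos\\""';

// Regex from the question: the lazy ".*?" stops at the first double quote
// it sees, whether or not that quote is preceded by a backslash.
var naive = /(".*?"|[^"\s]+)(?=\s*|\s*$)/g;

var parts = command.match(naive);

console.log(parts.slice(0, 6)); // the first six arguments are split as expected
console.log(parts.slice(6));    // the quoted author is broken into three pieces:
                                // '"Charles \"', 'Demurgos\' and '""'
```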

Thank you for any help.


Answer 1:


This regex will give you the strings you need:

"(?:\\"|\\\\|[^"])*"|\S+

Use it like this:

your_array = subject.match(/"(?:\\"|\\\\|[^"])*"|\S+/g);
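As a quick check, here is the regex applied to the command from the question and to the "windowsDirectory" edge case. This is a sketch for illustration; the names `splitter`, `command` and `edgeCase` are not part of the original answer.

```javascript
// Regex from this answer: a double-quoted chunk (allowing \" and \\ inside),
// or otherwise a run of non-whitespace characters.
var splitter = /"(?:\\"|\\\\|[^"])*"|\S+/g;

// Example command from the question (each \\ is one backslash in the string).
var command = 'compile test.json output.bin -location "Paris, France" ' +
              '-author "Charles \\"Demurgos\\""';
var tokens = command.match(splitter);
console.log(tokens.length);         // 7
console.log(JSON.parse(tokens[6])); // Charles "Demurgos"

// Edge case from the question: the quoted part ends with an escaped backslash,
// so the double quote right after it closes the argument.
var edgeCase = 'action "windowsDirectory\\\\" otherArg';
var edgeTokens = edgeCase.match(splitter);
console.log(edgeTokens.length);         // 3
console.log(JSON.parse(edgeTokens[1])); // windowsDirectory\
```

The order of the alternatives matters here: the two escape sequences are tried before the catch-all [^"], so a backslash is consumed together with the character it escapes whenever possible.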

Regex explanation:

"                        # '"'
(?:                      # group, but do not capture (0 or more times
                         # (matching the most amount possible)):
  \\                     #   '\'
  "                      #   '"'
 |                       #  OR
  \\\\                   #   two backslashes
 |                       #  OR
  [^"]                   #   any character except: '"'
)*                       # end of grouping
"                        # '"'
|                        # OR
\S+                      # non-whitespace (all but \n, \r, \t, \f,
                         # and " ") (1 or more times (matching the
                         # most amount possible))


Source: https://stackoverflow.com/questions/24069344/split-spaces-avoiding-double-quoted-js-strings-from-a-b-c-d-to-a
