Parsing command arguments in PHP

萌比男神i 2020-12-11 01:19

Is there a native "PHP way" to parse command arguments from a string? For example, given the following string:

foo "bar \"baz


        
11 answers
  • Based on HamZa's answer:

    function parse_cli_args($cmd) {
        preg_match_all('#(?<!\\\\)("|\')(?<escaped>(?:[^\\\\]|\\\\.)*?)\1|(?<unescaped>\S+)#s', $cmd, $matches, PREG_SET_ORDER);
        $results = [];
        foreach($matches as $array){
            // 'unescaped' is only set when the \S+ branch matched; empty() would misread "" and "0".
            $results[] = isset($array['unescaped']) ? $array['unescaped'] : $array['escaped'];
        }
        return $results;
    }
    
  • 2020-12-11 01:39

    I would recommend going another way. There is already a "standard" way of handling command line arguments. It's called getopt:

    http://php.net/manual/en/function.getopt.php

    I would suggest that you change your script to use getopt. Anyone using your script will then pass parameters in a way that is familiar to them and more or less "industry standard", instead of having to learn your way of doing things.
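
    As a rough illustration of that suggestion, here is a minimal sketch of a script that reads one flag and one long option with getopt. The script name greet.php, the options -v and --name, and the helper greeting() are made up for this example. Note that getopt reads the script's own command line rather than an arbitrary string.

    ```php
    <?php
    // Hypothetical invocation: php greet.php -v --name=World
    // 'v' is a flag without a value; 'name:' is a long option that requires a value.
    $opts = getopt('v', ['name:']);

    // Hypothetical helper: builds a message from the array that getopt() returns.
    function greeting(array $opts): string
    {
        $name = $opts['name'] ?? 'stranger';
        if (is_array($name)) {
            // An option given several times arrives as an array; keep the last value.
            $name = end($name);
        }
        $text = "Hello, $name!";
        // Flags without a value are present as keys whose value is false.
        if (array_key_exists('v', $opts)) {
            $text .= ' (verbose mode)';
        }
        return $text;
    }

    // getopt() can return false on failure, so fall back to an empty array.
    echo greeting(is_array($opts) ? $opts : []), "\n";
    ```

    Run without any options, this sketch falls back to the default name.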

  • 2020-12-11 01:47

    Regexes are quite powerful: (?s)(?<!\\)("|')(?:[^\\]|\\.)*?\1|\S+. So what does this expression mean?

    • (?s) : set the s modifier so that the dot . also matches newlines
    • (?<!\\) : negative lookbehind, check that no backslash precedes the next token
    • ("|') : match a single or double quote and put it in group 1
    • (?:[^\\]|\\.)*? : match any character that is not a backslash, or a backslash together with the immediately following (escaped) character, as few times as possible
    • \1 : match the same quote character that was captured in group 1
    • | : or
    • \S+ : match anything except whitespace one or more times.

    The idea is to capture the opening quote in a group so we remember whether it is a single or a double one. The negative lookbehind makes sure we don't start a match at an escaped quote. \1 then matches the closing quote of the same kind. Finally, an alternation matches any run of non-whitespace characters. This solution is handy and applies to almost any language/flavor that supports lookbehinds and backreferences. Of course, it expects the quotes to be closed. The results are found in group 0.

    Let's implement it in PHP:

    $string = <<<INPUT
    foo "bar \"baz\"" '\'quux\''
    'foo"bar' "baz'boz"
    hello "regex
    
    world\""
    "escaped escape\\\\"
    INPUT;
    
    preg_match_all('#(?<!\\\\)("|\')(?:[^\\\\]|\\\\.)*?\1|\S+#s', $string, $matches);
    print_r($matches[0]);
    

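    If you wonder why I used 4 backslashes, take a look at my previous answer. In short: PHP turns \\\\ in a single-quoted string into \\, and the regex engine reads \\ as one literal backslash.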
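
    The two levels of escaping can be checked with a small sketch. The variable name $pattern and the sample subject string are chosen only for this illustration.

    ```php
    <?php
    // In PHP source, '\\\\' is a two-character string: two backslashes.
    var_dump(strlen('\\\\'));                 // int(2)

    // The regex engine receives #\\#, which means "one literal backslash".
    $pattern = '#\\\\#';

    // 'a\\b' is the three-character string a\b, so the pattern matches once.
    var_dump(preg_match($pattern, 'a\\b'));   // int(1)
    ```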

    Output

    Array
    (
        [0] => foo
        [1] => "bar \"baz\""
        [2] => '\'quux\''
        [3] => 'foo"bar'
        [4] => "baz'boz"
        [5] => hello
        [6] => "regex
    
    world\""
        [7] => "escaped escape\\"
    )
    



    Removing the quotes

    Quite simple using named groups and a simple loop:

    preg_match_all('#(?<!\\\\)("|\')(?<escaped>(?:[^\\\\]|\\\\.)*?)\1|(?<unescaped>\S+)#s', $string, $matches, PREG_SET_ORDER);
    
    $results = array();
    foreach($matches as $array){
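       // With PREG_SET_ORDER, 'unescaped' is only present when the \S+ branch matched,
       // so test for it instead of using empty(), which misreads "" and "0".
       if(isset($array['unescaped'])){
          $results[] = $array['unescaped'];
       }else{
          $results[] = $array['escaped'];
       }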
    }
    print_r($results);
    

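
    Note that the captured values still contain the backslashes used for escaping, for example bar \"baz\". If those should go too, one possible follow-up step is sketched below. The helper name unescape_arg is hypothetical, and the sketch assumes that a backslash always escapes the character after it, which may not be what you want for unquoted arguments such as Windows paths.

    ```php
    <?php
    // Hypothetical helper: replace every backslash-escaped character with the character itself.
    function unescape_arg(string $arg): string
    {
        // The regex engine sees \\(.) : a backslash followed by any one character (s: including newlines).
        return preg_replace('/\\\\(.)/s', '$1', $arg);
    }

    echo unescape_arg('bar \"baz\"'), "\n";   // prints: bar "baz"
    ```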

  • 2020-12-11 01:47

    If you want to follow the same parsing rules that a shell applies, there are some edge cases that I think aren't easy to cover with regular expressions, so you might want to write a method that does this instead (example):

    $string = 'foo "bar \"baz\"" \'\\\'quux\\\'\'';
    echo $string, "\n";
    print_r(StringUtil::separate_quoted($string));
    

    Output:

    foo "bar \"baz\"" '\'quux\''
    Array
    (
        [0] => foo
        [1] => bar "baz"
        [2] => 'quux'
    )
    

    I guess this pretty much matches what you're looking for. The function used in the example can be configured for the escape character as well as for the quotes; you can even use brackets such as [ ] to form a "quote" if you like.

    To handle input other than native byte-safe strings with one character per byte, you can pass an array instead of a string. The array needs to contain one character per value as a binary-safe string. For example, pass Unicode in NFC form as UTF-8 with one code point per array value, and this should do the job for Unicode.
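
    The StringUtil class itself is not shown here. As a rough idea of what a hand-written parser of this kind can look like, here is a minimal character-by-character sketch. The function name split_quoted is hypothetical and this is not the StringUtil implementation. It assumes byte-wise input, a backslash that escapes the next character everywhere, and " and ' as the only quote characters.

    ```php
    <?php
    // Hypothetical sketch, not StringUtil::separate_quoted: splits on whitespace,
    // honours "..." and '...' quoting, and treats a backslash as "take the next byte literally".
    function split_quoted(string $input): array
    {
        $args = [];
        $current = '';
        $inToken = false;   // true once an argument has started (so "" yields an empty argument)
        $quote = null;      // the quote character we are currently inside, or null
        $length = strlen($input);

        for ($i = 0; $i < $length; $i++) {
            $char = $input[$i];

            if ($char === '\\' && $i + 1 < $length) {
                $current .= $input[++$i];       // escaped byte, copied as-is
                $inToken = true;
            } elseif ($quote !== null) {
                if ($char === $quote) {
                    $quote = null;              // closing quote
                } else {
                    $current .= $char;
                }
            } elseif ($char === '"' || $char === "'") {
                $quote = $char;                 // opening quote
                $inToken = true;
            } elseif (strpos(" \t\r\n", $char) !== false) {
                if ($inToken) {                 // whitespace ends the current argument
                    $args[] = $current;
                    $current = '';
                    $inToken = false;
                }
            } else {
                $current .= $char;
                $inToken = true;
            }
        }

        if ($inToken) {
            $args[] = $current;                 // last argument (an unclosed quote is tolerated)
        }
        return $args;
    }

    print_r(split_quoted('foo "bar \"baz\"" \'\\\'quux\\\'\''));
    ```

    For the input used in this answer, the sketch yields the same three arguments as the output shown above. Real shells differ in details, for instance backslashes are not special inside single quotes there.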

  • 2020-12-11 01:48

    I wrote some packages for console interactions:

    Arguments parsing

    There is a package that does the whole arguments parsing thing: weew/php-console-arguments

    Example:

    $parser = new ArgumentsParser();
    $args = $parser->parse('command:name arg1 arg2 --flag="custom \"value" -f="1+1=2" -vvv');
    

    $args will be an array:

    ['command:name', 'arg1', 'arg2', '--flag', 'custom "value', '-f', '1+1=2', '-v', '-v', '-v']
    

    Arguments can be grouped:

    $args = $parser->group($args);
    

    $args will become:

    ['arguments' => ['command:name', 'arg1', 'arg2'], 'options' => ['--flag' => 1, '-f' => 1, '-v' => 1], '--flag' => ['custom "value'], '-f' => ['1+1=2'], '-v' => []]
    

    It can do much more; just check the readme.

    Output styling

    You might need a package for output styling: weew/php-console-formatter

    Console application

    The packages above can be used standalone or in combination with a fancy console application skeleton: weew/php-console

    Note: These solutions are not native but might still be useful to some people.
