Capturing all method arguments default values

Submitted by 天涯浪子 on 2019-12-13 11:28:10

Question


I'm working on reverse engineering PHP methods because the provided \ReflectionClass mechanism is insufficient for my current project.

Currently I want to extract method prototypes using regular expressions, and I am stuck on retrieving default argument values. I pass the contents of the method prototype's parentheses to the static method MethodArgs::createFromString(). Its goal is to extract every argument from the string, including its type, name and default value, and to create an instance of itself. So far I have been able to retrieve default values for both single-quoted and double-quoted strings, including edge cases like ' \' ' or " \" ". But the range of values that PHP accepts as a default argument value is larger than that. I'm having trouble extending my regexp to also match types such as booleans, integers, floats or arrays.

<?php
class MethodArgs
{
    static public function createFromString($str) {
        $str = "   Peer \$M = null, Template \$T='variable \'value', \BlaBla\Bla \$Bla = \" blablabla \\\" bleble \"   ";

        //$pat = '#(?:(?:\'|")(?<val>(?:[^\'"]|(?<=\\\)(?:\'|"))*)(?:\'|"))+#i';
        //$pat = '#(?:(?<type>[^\$\s,\(\)]+)\s)?\$(?<name>[^,.\s\)=]+)(?:\s*=\s*)?(?:\'(?<val>(?:[^\']|(?<=\\\)\')*)\')?#i';
        $pat = '#(?:(?<type>[^\$\s,\(\)]+)\s)?\$(?<name>[^,.\s\)=]+)(?:\s*=\s*)?(?:(?:\'|")(?<val>(?:[^\'"]|(?<=\\\)(?:\'|"))*)(?:\'|"))?#i';

        $a = preg_match_all($pat, $str, $match);
        var_dump(array('$a' => $a, '$pat' => $pat, '$str' => $str, '$match' => $match));
        die();

        /*$Args = new static();
        for($i=0; $i<count($match[0]); $i++) {
            $Arg = new MethodArg();
            $Arg->setType($match['type'][$i]);
            $Arg->setName($match['name'][$i]);
            $Arg->setDefaultValue($match['val'][$i]);
            $Args[] = $Arg;
        }

        return $Args;*/
    }
}

Output:

Array
(
    [$a] => 3
    [$pat] => #(?:(?<type>[^\$\s,\(\)]+)\s)?\$(?<name>[^,.\s\)=]+)(?:\s*=\s*)?(?:(?:'|")(?<val>(?:[^'"]|(?<=\\)(?:'|"))*)(?:'|"))?#i
    [$str] =>    Peer $M = null, Template $T='variable \'value', \BlaBla\Bla $Bla = " blablabla \" bleble "   
    [$match] => Array
        (
            [0] => Array
                (
                    [0] => Peer $M = 
                    [1] => Template $T='variable \'value'
                    [2] => \BlaBla\Bla $Bla = " blablabla \" bleble "
                )

            [type] => Array
                (
                    [0] => Peer
                    [1] => Template
                    [2] => \BlaBla\Bla
                )

            [1] => Array
                (
                    [0] => Peer
                    [1] => Template
                    [2] => \BlaBla\Bla
                )

            [name] => Array
                (
                    [0] => M
                    [1] => T
                    [2] => Bla
                )

            [2] => Array
                (
                    [0] => M
                    [1] => T
                    [2] => Bla
                )

            [val] => Array
                (
                    [0] => 
                    [1] => variable \'value
                    [2] =>  blablabla \" bleble 
                )

            [3] => Array
                (
                    [0] => 
                    [1] => variable \'value
                    [2] =>  blablabla \" bleble 
                )

        )

)

~ Thanks in advance for any advice


Answer 1:


If you are trying to parse single- or double-quoted strings, it should be done
in two steps: validate first, then parse for values.

You could probably do both in a single regex with the use of a \G anchor,
validating with \A\G and parsing with just the \G.
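
One possible reading of the \G idea, sketched under assumptions and not necessarily what was meant above: because \G only matches where the previous match ended, a loop of \G-anchored matches must be contiguous from offset 0, so comparing the consumed length with the string length doubles as validation. The function name parseArgs, the returned array shape and the handling of unquoted defaults (null, booleans, integers, floats, constants) are all hypothetical additions.

```php
<?php
// Hypothetical helper, not from the original post. Returns a list of
// ['type' => ?string, 'name' => string, 'default' => ?string] entries,
// or null if the argument list cannot be consumed completely.
function parseArgs(string $args): ?array
{
    // Nowdoc keeps the regex literal, so no PHP-level escaping is needed.
    $pattern = <<<'REGEX'
~\G \s*
  (?: (?<type> [^$\s,()]+ ) \s+ )?
  \$ (?<name> \w+ )
  (?: \s* = \s*
      (?<val>
          "[^"\\]*(?:\\.[^"\\]*)*"
        | '[^'\\]*(?:\\.[^'\\]*)*'
        | [^,\s'"]+
      )
  )?
  \s* (?: , | \z )
~sx
REGEX;

    $args = trim($args);
    if ($args === '') {
        return [];
    }
    if (!preg_match_all($pattern, $args, $matches, PREG_SET_ORDER)) {
        return null; // nothing matched at offset 0, or a regex error
    }

    $consumed = 0;
    $result = [];
    foreach ($matches as $m) {
        $consumed += strlen($m[0]);
        $result[] = [
            'type'    => $m['type'] !== '' ? $m['type'] : null,
            'name'    => $m['name'],
            // Raw source text of the default, quotes and escapes included.
            'default' => isset($m['val']) && $m['val'] !== '' ? $m['val'] : null,
        ];
    }

    // \G forces the matches to be contiguous from offset 0, so the input
    // is fully valid only if they add up to the whole string.
    return $consumed === strlen($args) ? $result : null;
}

var_export(parseArgs('Peer $M = null, $n = 1.5, $flag = true'));
```

This sketch does not handle array defaults such as array(1, 2), since they contain commas of their own. For those, a real tokenizer such as PHP's token_get_all() is probably a better fit than a single regex.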

If you are sure the input is valid, you can skip the validation.
Below are the two parts (they can be combined if needed).
Note that both parse the quoted strings using the unrolled-loop technique,
which is pretty quick.

Validation:

 # Validation:  '~^(?s)[^"\']*(?:"[^"\\\]*(?:\\\.[^"\\\]*)*"|\'[^\'\\\]*(?:\\\.[^\'\\\]*)*\'|[^"\'])*$~'

 ^
 (?s)
 [^"']*
 (?:
      "
      [^"\\]*
      (?: \\ . [^"\\]* )*
      "
   |
      '
      [^'\\]*
      (?: \\ . [^'\\]* )*
      '
   |
      [^"']
 )*
 $
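
As a rough usage sketch, the validation pattern can be wrapped in a small check that runs before any parsing. The function name hasBalancedQuotes is a hypothetical choice for illustration. The pattern itself is copied unchanged from the validation line above.

```php
<?php
// Hypothetical wrapper: true when every quoted string in $s is terminated.
function hasBalancedQuotes(string $s): bool
{
    // Validation pattern from the answer, as a single-quoted PHP string.
    $validate = '~^(?s)[^"\']*(?:"[^"\\\]*(?:\\\.[^"\\\]*)*"|\'[^\'\\\]*(?:\\\.[^\'\\\]*)*\'|[^"\'])*$~';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($validate, $s) === 1;
}

var_dump(hasBalancedQuotes('$a = "ok"'));     // terminated string
var_dump(hasBalancedQuotes('$a = "broken'));  // unterminated string
```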

Parsing:

 # Parsing:  '~(?s)(?|"([^"\\\]*(?:\\\.[^"\\\]*)*)"|\'([^\'\\\]*(?:\\\.[^\'\\\]*)*)\')~'

 (?s)                          # Dot all modifier
 (?|                           # Branch Reset
      "
      (                             # (1), double quoted string data
           [^"\\]*
           (?: \\ . [^"\\]* )*
      )
      "
   |                              # OR
      '
      (                             # (1), single quoted string data
           [^'\\]*
           (?: \\ . [^'\\]* )*
      )
      '
 )
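
To illustrate how the parsing pattern might be applied, here is a minimal sketch that runs it over the sample string from the question with preg_match_all(). The function name quotedValues is hypothetical. Thanks to the branch reset group, both quote styles land in capture group 1.

```php
<?php
// Hypothetical wrapper around the answer's parsing pattern.
function quotedValues(string $s): array
{
    $parse = '~(?s)(?|"([^"\\\]*(?:\\\.[^"\\\]*)*)"|\'([^\'\\\]*(?:\\\.[^\'\\\]*)*)\')~';
    preg_match_all($parse, $s, $m);

    // Group 1 holds the string contents without the surrounding quotes.
    return $m[1] ?? [];
}

// Sample argument list from the question; nowdoc avoids PHP-level escaping.
$sample = <<<'ARGS'
Peer $M = null, Template $T='variable \'value', \BlaBla\Bla $Bla = " blablabla \" bleble "
ARGS;

// Yields the two quoted defaults, with backslash escapes left in place.
print_r(quotedValues($sample));
```

Unquoted defaults such as null are skipped by this pattern, so it only covers the string part of the original problem.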


Source: https://stackoverflow.com/questions/28270832/capturing-all-method-arguments-default-values
