PHP: preg_match() not correct

Posted by 天涯浪子 on 2019-12-11 00:09:33

Question


I have the following string:

<w:pPr>
    <w:spacing w:line="240" w:lineRule="exact"/>
    <w:ind w:left="1890" w:firstLine="360"/>
    <w:rPr>
        <w:b/>
        <w:color w:val="00000A"/>
        <w:sz w:val="24"/>
    </w:rPr>
</w:pPr>

and I am trying to extract the w:val attribute of the w:sz element using preg_match().

So far, I've tried:

preg_match('/<w:sz w:val="(\d)"/', $p, $fonts);

but this does not match, and I'm not sure why.

Any ideas?

Thank you in advance!


Answer 1:


Your pattern captures only a single digit: \d matches exactly one digit, so it cannot match "24". Add a + to make it "one or more".

preg_match('/<w:sz w:val="(\d+)"/', $p, $fonts);

I prefer [0-9]+ because it is easier to read and because it sidesteps any need to double up backslashes inside string literals.

preg_match('/<w:sz w:val="([0-9]+)"/', $p, $fonts);
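
To see the difference side by side, here is a minimal sketch. The sample string is trimmed from the question, and the variable names $single, $multi, $m1 and $m2 are made up for illustration.

```php
<?php
// Sample input trimmed from the question.
$p = '<w:rPr><w:b/><w:color w:val="00000A"/><w:sz w:val="24"/></w:rPr>';

// \d matches exactly one digit, so the closing quote cannot follow "2".
$single = preg_match('/<w:sz w:val="(\d)"/', $p, $m1);

// \d+ matches one or more digits, so the whole value is captured.
$multi = preg_match('/<w:sz w:val="(\d+)"/', $p, $m2);

var_dump($single); // int(0)
var_dump($multi);  // int(1)
var_dump($m2[1]);  // string(2) "24"
```

preg_match() returns 0 when nothing matches and 1 on the first match, so checking the return value is a quick way to tell "no match" from "matched".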



Answer 2:


While you already have working code, there are two other options: DOMDocument and SimpleXML. Both are slightly tricky here because of the colon-prefixed names (namespace prefixes), but consider the following examples. I have added a container tag to declare the namespace; your real XML will certainly declare one as well.

Solution 1 (DOM) looks up the element by namespace and then reads the attribute. Solution 2 (SimpleXML) does the same, perhaps in a more intuitive and readable way.

The XML (using PHP heredoc syntax):

$xml = <<<EOF
<?xml version="1.0"?>
<container xmlns:w="http://example">
    <w:pPr>
        <w:spacing w:line="240" w:lineRule="exact"/>
        <w:ind w:left="1890" w:firstLine="360"/>
        <w:rPr>
            <w:b/>
            <w:color w:val="00000A"/>
            <w:sz w:val="24"/>
        </w:rPr>
    </w:pPr>
</container>
EOF;

Solution 1: Using DOMDocument

$dom = new DOMDocument();
$dom->loadXML($xml);

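$ns = 'http://example'; // must match the URI declared by xmlns:w

// Find the first <w:sz> element by namespace URI and local name.
$data = $dom->getElementsByTagNameNS($ns, 'sz')->item(0);
// Read the attribute by its qualified name.
$attr = $data->getAttribute('w:val');
echo $attr; // 24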

Solution 2: Using SimpleXML with Namespaces

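$simplexml = simplexml_load_string($xml);
// Map of prefix => URI for every namespace used in the document.
$namespaces = $simplexml->getNamespaces(true);
// Child elements of the root that belong to the "w" namespace.
$items = $simplexml->children($namespaces['w']);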

$val = $items->pPr->rPr->sz["val"]->__toString();
echo "val: $val"; // val: 24
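
A third option, not part of the answer above, is an XPath query through DOMXPath. The following is only a sketch: it reuses the placeholder namespace URI http://example from the sample XML (a real document declares its own URI, which you would register instead), and the XML is trimmed to the relevant elements.

```php
<?php
// Trimmed version of the sample XML; the namespace URI is a placeholder.
$xml = <<<EOF
<?xml version="1.0"?>
<container xmlns:w="http://example">
    <w:rPr>
        <w:sz w:val="24"/>
    </w:rPr>
</container>
EOF;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Bind the prefix "w" to the same URI the document declares.
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('w', 'http://example');

// string(...) converts the first matching attribute node to its value.
$val = $xpath->evaluate('string(//w:sz/@w:val)');
echo $val; // 24
```

The prefix registered with registerNamespace() only needs to map to the same URI as in the document; it does not have to be the same prefix the document uses.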



Answer 3:


You just need a little correction to your regex:

<w:sz w:val="(\d+)"

So it goes:

preg_match('/<w:sz w:val="(\d+)"/', $p, $fonts);

Why? Because \d on its own matches exactly one digit, whereas \d+ matches one or more.
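
One detail worth checking when adding the quantifier is where it sits relative to the parentheses. With (\d)+ the group itself is repeated and keeps only its last repetition, while (\d+) captures the full run of digits. A small sketch, with a sample string trimmed from the question and made-up variable names $outside and $inside:

```php
<?php
$p = '<w:sz w:val="24"/>';

// Quantifier outside the group: the group is overwritten on each repetition.
preg_match('/<w:sz w:val="(\d)+"/', $p, $outside);

// Quantifier inside the group: the group spans all the digits.
preg_match('/<w:sz w:val="(\d+)"/', $p, $inside);

var_dump($outside[1]); // string(1) "4"
var_dump($inside[1]);  // string(2) "24"
```

Both patterns match the string as a whole; only the captured group differs.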

EDIT:

In case you need it, there are some great regex online testing tools, like https://regex101.com/. Try your expressions there before using them, just in case. You never know ;)



Source: https://stackoverflow.com/questions/33679620/php-preg-match-not-correct
