Matching a multi-line pattern with PHP's preg_match()

轻奢々 2020-12-10 00:33

How can I match the subject with a PHP preg_match() regular expression in this HTML code:

      
6 answers
  • 2020-12-10 01:01

    You do not have to remove the line breaks first: put \s in the character class of the regular expression so that it also accepts spaces, tabs and line breaks:

    $str ="<ol>
             <li>Capable for unlimited product</li>
             <li>Two currency support</li>
             <li>Works with touch screens and click screen based systems</li>
             <li>Responsive design <b>shopping cart</b>, Specially design for Mac, iPhone, iPad, PC and Android</li>
             <li>VAT for countries that support a Value Added Tax</li>
             <li>Barcode scanner checkout option for POS</li>
             <li>mRSS</li>
           </ol>";
    
    // preg_match() returns 1 when $str contains only the whitelisted characters.
    $isValid = preg_match("/^([A-Za-z0-9\s\<\>\.\,\/\-\ ]+)$/", $str);
    
    // Sanitize the input before saving it to the database.
    
    function test_input($data) {
        $data = trim($data);
        $data = htmlspecialchars($data);
        $data = json_encode($data);
        $data = addslashes($data);
        return $data;
    }
    
    echo test_input($str);
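
What the character-class check in the answer above does can be sketched as follows. This is a minimal illustration, assuming the goal is to verify that a multi-line string contains only whitelisted characters; the helper name isAllowedMarkup and both sample strings are made up for the example.

```php
<?php
// Hypothetical helper (not from the answer): true when $text consists only
// of whitelisted characters. \s inside the class also matches "\n" and "\r",
// so multi-line input is accepted as it is.
function isAllowedMarkup(string $text): bool
{
    return preg_match('/^[A-Za-z0-9\s<>.,\/-]+$/', $text) === 1;
}

$ok  = "<ol>\n  <li>Two currency support</li>\n</ol>";
$bad = "<ol>\n  <li>50% off!</li>\n</ol>"; // "%" and "!" are not whitelisted

var_dump(isAllowedMarkup($ok));
var_dump(isAllowedMarkup($bad));
```

Comparing the return value with 1 keeps the two other outcomes of preg_match(), 0 for no match and false for an error, from being mistaken for success.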
    
  • 2020-12-10 01:05

    Catch a block of code delimited by four backticks (as in Markdown syntax).

    The example is easy to adapt.

    <?php
    
    $str = '
    # Some Text
    
    ```` 
        h5 {
          font-size: 1rem;
          font-weight: 600;
        }
    ````
    
    And some text.
    ';
    
    $reg = '/````[^>]*(.*?)````/';
    
    preg_match($reg, $str, $matches);
    echo $matches[0];
    
    /* OUTPUT
    ```` 
        h5 {
          font-size: 1rem;
          font-weight: 600;
        }
    ````
    */
    
    echo preg_replace($reg, "DELETED", $str);
    
    /* OUTPUT
    # Some Text
    
    DELETED
    
    And some text.
    */
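
The pattern above crosses the line breaks only through [^>]*, so a block that contains a > character would not match. A possible variant, sketched here under the assumption that only the text between the backticks is wanted, uses the s modifier and a capture group instead. The sample string is invented for the illustration.

```php
<?php
// Invented sample: the fenced block contains ">" on purpose.
$str = "# Some Text\n\n````\nh5 > span {\n  font-weight: 600;\n}\n````\n\nAnd some text.\n";

// The s modifier lets . match line breaks, so the lazy group can span
// several lines. \s* trims the whitespace next to the delimiters.
$reg = '/````\s*(.*?)\s*````/s';

if (preg_match($reg, $str, $matches) === 1) {
    echo $matches[1], "\n"; // inner text only, without the backticks
}

// The same pattern removes the whole block, delimiters included.
$replaced = preg_replace($reg, 'DELETED', $str);
echo $replaced;
```

$matches[0] still holds the full block with its delimiters, as in the original example.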
    
  • 2020-12-10 01:10

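    You can add a modifier to your regular expression. The m modifier only changes how ^ and $ behave, so for .*? to match across line breaks the pattern needs s:

    // Given your HTML content.
    $html = 'Your HTML content';
    preg_match('/<td[^>]*>(.*?)<\/td>/is', $html, $matches);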
    

    Hope this (still) helps, hahaha.

  • 2020-12-10 01:19

    If you're looking for (e.g.) an h2 tag nested within a td tag where there's only whitespace between the two, just use \s, which matches spaces, newlines, etc. For example:

    preg_match('#<td>\s*<h2>(.*?)</h2>\s*</td>#i',$str,$matches);
    // result is in $matches[1]
    

    See it in action here.
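
A runnable version of the call above might look like this. The markup in $str is a hypothetical stand-in, since the HTML from the question is not shown in this thread.

```php
<?php
// Hypothetical sample markup: the <h2> sits on its own line inside the <td>.
$str = "<table>\n  <tr>\n    <td>\n      <h2>subject</h2>\n    </td>\n  </tr>\n</table>";

// \s* absorbs the line breaks and indentation around the <h2> element.
$found = preg_match('#<td>\s*<h2>(.*?)</h2>\s*</td>#i', $str, $matches);

if ($found === 1) {
    echo $matches[1], "\n";
}
```

Using # as the delimiter avoids having to escape the slashes in the closing tags.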

    For your interest, the PHP manual has a list of the pattern modifiers you can pass to the preg_* functions. Flags that may interest you are:

    • s ("dotall") : this one makes . match every character, including newlines. So, say your <h2>.....</h2> was spread over multiple lines. Then you'd have to do

      preg_match('#<td>\s*<h2>(.*?)</h2>\s*</td>#is',$str,$matches);
      

      in order to have the .* go over multiple lines (see the extra s at the end of the regex?).

    • m ("multiline") : this one just lets ^ and $ match start/end of line instead of just the start/end of string. You only really need it if you're using ^ and $ in your pattern and want them to match the start/end of each individual line in your input.
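
The difference between the two modifiers can be checked with a short script. This is only a sketch; the input strings are invented for the comparison.

```php
<?php
// Invented input: the heading text is split over two lines.
$str = "<h2>first\nsecond</h2>";

// Without s, . stops at the line break, so the pattern finds nothing.
$plain = preg_match('#<h2>(.*?)</h2>#', $str, $m1);

// With s, . also matches "\n" and the group spans both lines.
$dotall = preg_match('#<h2>(.*?)</h2>#s', $str, $m2);

// m only changes the anchors: ^ and $ then match at every line.
$text  = "alpha\nbeta\ngamma";
$lines = preg_match_all('/^\w+$/m', $text, $m3);
$whole = preg_match_all('/^\w+$/', $text, $m4);

var_dump($plain, $dotall, $m2[1], $lines, $whole);
```

Without m the anchored pattern has to cover the whole three-line string, which \w+ cannot do, so only the m version counts the individual lines.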
  • 2020-12-10 01:20

    You shouldn't use regex to parse HTML content. It can cause a lot of issues if you cannot control what the user inputs. Every language has better solutions, and in most cases an HTML or XML parser does a better job. Check out DOMDocument, simplehtmldom or php-html-parser.

    For more answers on why you shouldn't use regex on HTML content, see: RegEx match open tags except XHTML self-contained tags
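
As a rough sketch of the parser route, the snippet below uses DOMDocument and DOMXPath from PHP's DOM extension. The markup is a hypothetical stand-in for the page in the question, and the XPath expression assumes the wanted h2 is a direct child of a td.

```php
<?php
// Hypothetical sample markup; line breaks need no special handling here.
$html = "<table>\n<tr><td>\n<h2>subject</h2>\n</td></tr>\n</table>";

// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// First <h2> that is a direct child of a <td>, or null when there is none.
$node = $xpath->query('//td/h2')->item(0);

$subject = $node !== null ? trim($node->textContent) : null;
var_dump($subject);
```

Because the parser works on the element tree, whitespace and line breaks between the tags do not affect the lookup.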

  • 2020-12-10 01:25

    Very simply with

    preg_match('/<h2>(.*?)<\\/h2>/', $str, $matches);
    print($matches[1]);
    

    The multi-line format has no effect on the regex unless you need to match a string that spans multiple lines.
