How to remove text between tags in PHP?

时光取名叫无心 2020-12-06 00:56

Despite using PHP for years, I've never really learnt how to use expressions to truncate strings properly... which is now biting me in the backside!

Can anyone prov

6 Answers
  • 2020-12-06 00:58

    Just use strip_tags(); it gets rid of the tags and leaves only the text between them.
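
    A minimal sketch of what strip_tags() does with the sample string used in the other answers (the variable names here are illustrative). It keeps the text and drops the markup, so it only fits when the link text is the part you want to keep:

    ```php
    <?php
    $str = '<a href="link.html">text</a>';

    // strip_tags() removes the markup and returns the inner text.
    $textOnly = strip_tags($str);
    echo $textOnly, PHP_EOL; // text
    ```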

  • 2020-12-06 01:12

    You could use substr() in combination with strpos(), even though this is not a very nice approach.

    Check: PHP Manual - String functions

    Another way would be to write a regular expression to match your criteria. But in order to get your problem solved quickly the string functions will do...

    EDIT: I underestimated the audience. ;) Go ahead with the regexes... ^^
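
    A rough sketch of the string-function approach with substr() and strpos(), assuming the string starts with a single anchor and has no ">" before the end of its opening tag. The variable names are made up for illustration:

    ```php
    <?php
    $str = '<a href="link.html">text</a>';

    // Position of the ">" that ends the opening tag.
    $openEnd = strpos($str, '>');
    // Position of the closing tag, searched from the end of the opening tag.
    $closeStart = ($openEnd === false) ? false : strpos($str, '</a>', $openEnd);

    if ($openEnd !== false && $closeStart !== false) {
        // Keep everything up to and including ">", then resume at "</a>".
        $result = substr($str, 0, $openEnd + 1) . substr($str, $closeStart);
    } else {
        $result = $str; // no complete anchor found: leave the string untouched
    }

    echo $result, PHP_EOL; // <a href="link.html"></a>
    ```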

  • 2020-12-06 01:15
    $str = preg_replace('#(<a.*?>).*?(</a>)#', '$1$2', $str);
    
  • 2020-12-06 01:15

    What about something like this, considering you might want to re-use it with other hrefs:

    $str = '<a href="link.html">text</a>';
    $result = preg_replace('#(<a[^>]*>).*?(</a>)#', '$1$2', $str);
    var_dump($result);
    

    Which will get you:

    string '<a href="link.html"></a>' (length=24)
    

    (I'm assuming you made a typo in the OP?)


    If you don't need to match any other href, you could use something like:

    $str = '<a href="link.html">text</a>';
    $result = preg_replace('#(<a href="link\.html">).*?(</a>)#', '$1$2', $str); // dot escaped so it only matches a literal "."
    var_dump($result);
    

    Which will also get you:

    string '<a href="link.html"></a>' (length=24)
    


    As a side note: for more complex HTML, don't try to use regular expressions. They work fine for this kind of simple situation, but for real-life HTML they generally don't help: HTML is not "regular" enough to be parsed by regexes.

  • 2020-12-06 01:21

    Using SimpleHTMLDom:

    <?php
    // example of how to modify anchor innerText
    include('simple_html_dom.php');
    
    // get DOM from URL or file
    $html = file_get_html('http://www.example.com/');
    
    // set innertext to an empty string for each anchor
    // (the Simple HTML DOM property is lowercase: innertext)
    foreach ($html->find('a') as $e) {
        $e->innertext = '';
    }
    
    // dump contents
    echo $html;
    ?>
    
  • 2020-12-06 01:21

    You don't need to capture the tags themselves. Just target the text between the tags and replace it with an empty string. Super simple.

    Demo of both techniques

    Code:

    $string = '<a href="link.html">text</a>';
    echo preg_replace('/<a[^>]*>\K[^<]*/', '', $string);
    // the opening tag--^^^^^^^^  ^^^^^-match everything before the end tag
    //                          ^^-restart fullstring match
    

    Output:

    <a href="link.html"></a>
    

    Or in fringe cases when the link text contains a <, use this: ~<a[^>]*>\K.*?(?=</a>)~

    This avoids the expense of capture groups by using a lazy quantifier, the \K full-match reset, and a lookahead.
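
    To see the difference between the two patterns, here is a small sketch with a made-up link text that contains a <:

    ```php
    <?php
    // Hypothetical input: the link text itself contains a "<".
    $string = '<a href="link.html">1 < 2</a>';

    // [^<]* stops at the first "<", so part of the text survives.
    $first = preg_replace('/<a[^>]*>\K[^<]*/', '', $string);
    echo $first, PHP_EOL;  // <a href="link.html">< 2</a>

    // The lazy .*? plus the lookahead consumes everything up to "</a>".
    $second = preg_replace('~<a[^>]*>\K.*?(?=</a>)~', '', $string);
    echo $second, PHP_EOL; // <a href="link.html"></a>
    ```

    If the link text can span several lines, the second pattern would also need the s modifier so that the dot matches newlines.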


    Older & wiser:

    If you are parsing valid HTML, you should use a DOM parser for stability and accuracy. Regex is DOM-ignorant, so if a tag attribute value contains a >, my snippet will fail.

    As a narrowly suited DOMDocument solution to offer some context:

    $dom = new DOMDocument;
    $dom->loadHTML($string, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD); // flags suppress the implied <html>/<body> wrappers and the DOCTYPE
    $dom->getElementsByTagName('a')[0]->nodeValue = '';
    echo $dom->saveHTML();
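
    A sketch of that failure case, and of the DOMDocument approach applied to every anchor rather than only the first. The input string, the wrapping <div> (added so the fragment has a single root element) and the variable names are assumptions for illustration:

    ```php
    <?php
    // Hypothetical input: the first anchor has a ">" inside an attribute value.
    $string = '<div><a href="link.html" title="a > b">text</a> <a href="other.html">more</a></div>';

    // The regex treats the ">" inside title="..." as the end of the tag.
    $regexResult = preg_replace('/<a[^>]*>\K[^<]*/', '', $string);
    echo $regexResult, PHP_EOL;
    // <div><a href="link.html" title="a ></a> <a href="other.html"></a></div>

    // The DOM parser reads the quoted attribute value as a whole.
    $dom = new DOMDocument;
    $dom->loadHTML($string, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $anchor->nodeValue = ''; // drop the link text, keep the attributes
    }

    echo $dom->saveHTML();
    ```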
    