Regex that extracts text between tags, but not the tags

梦谈多话 2020-12-12 00:16

I want to write a regex that extracts the content between two `<title>` tags in a string, but not the tags themselves. For example, I have the following: […]

4 Answers

星月不相逢 (OP) 2020-12-12 01:11

In your case, you could just use the second backreference from the regex, which would hold the text you are interested in.

Since you mention `preg_match` in your tags, I am assuming you want this for PHP.

```php
$matches = array();
$pattern = '#<title>(.*?)</title>#'; // note I changed the pattern a bit
preg_match($pattern, $string, $matches);
$title = $matches[1];
```

Note that this is actually the first backreference in my pattern, since I've omitted the parentheses around the tags themselves, which were not needed.
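To make the group numbering concrete, here is a small sketch comparing the two shapes of pattern. The sample string and the three-group pattern are assumptions made for illustration (a pattern with parentheses around each tag, as the remark about the "second backreference" suggests); they are not taken from the question.

```php
<?php
// Hypothetical sample input; not taken from the original question.
$string = '<head><title>Example page</title></head>';

// Assumed shape of a pattern with parentheses around the tags as well:
// group 1 = opening tag, group 2 = text, group 3 = closing tag.
$threeGroups = '#(<title>)(.*?)(</title>)#';

// Pattern with a single group around the text only.
$oneGroup = '#<title>(.*?)</title>#';

$a = array();
$b = array();
preg_match($threeGroups, $string, $a);
preg_match($oneGroup, $string, $b);

// Index 0 always holds the whole match, tags included.
$whole = $b[0];
// The text sits at index 2 with three groups, index 1 with one group.
$fromThree = $a[2];
$fromOne = $b[1];

var_dump($whole, $fromThree, $fromOne);
```

In both cases index 0 is the full match including the tags, which is why you read a numbered group rather than the match itself.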

Typically, you should not use regex to parse HTML documents, but I think this might be one of those exceptional cases where it is not so bad, since the title tag should only exist once on the page.
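If you would rather avoid regex here, one possible alternative is PHP's bundled DOM extension. This is only a sketch, assuming the `dom` extension is available and using a made-up sample document; the variable names are illustrative.

```php
<?php
// Hypothetical sample input, used only for illustration.
$html = '<html><head><title>Example page</title></head><body><p>Hi</p></body></html>';

// Collect libxml warnings internally instead of printing them;
// real-world HTML is often imperfect.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

libxml_clear_errors();
libxml_use_internal_errors($previous);

// item(0) returns null when the document has no <title> element.
$node = $doc->getElementsByTagName('title')->item(0);
$title = $node !== null ? trim($node->textContent) : null;

var_dump($title);
```

This costs a few more lines than `preg_match`, but it does not depend on how the tag happens to be written in the markup (attributes, line breaks, letter case).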
