Regex that extracts text between tags, but not the tags

﹥>﹥吖頭↗ submitted on 2019-11-29 14:55:38

You can use the following regex:

>([^<]*)<

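or >[^<]*<

With the second pattern, strip the unwanted '<' and '>' characters from the match afterwards. With the first pattern, capture group 1 already holds only the text between them.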
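As a rough sketch of both variants in PHP (the sample string and the variable names are made up for illustration and are not from the question):

```php
<?php
// Hypothetical sample input; any tagged string would do.
$html = '<title>My Page</title>';

// Variant 1: capture group 1 excludes the surrounding '>' and '<'.
$viaGroup = null;
if (preg_match('/>([^<]*)</', $html, $m)) {
    $viaGroup = $m[1];
}

// Variant 2: the whole match still includes '>' and '<',
// so drop the first and last character afterwards.
$viaStrip = null;
if (preg_match('/>[^<]*</', $html, $m2)) {
    $viaStrip = substr($m2[0], 1, -1);
}
```

Both variables end up holding the same text here. The capture-group variant avoids the extra clean-up step.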

The best way is to use lookaround assertions. For your case, the regex would be:

(?<=\<title\>).*?(?=\<\/title\>)

For more details, have a look here.
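A minimal sketch of how the lookaround pattern might be used with preg_match, assuming a made-up input string. The `~` delimiter is chosen here only so that the slash in `</title>` needs no escaping:

```php
<?php
// Hypothetical sample input, not taken from the question.
$html = '<html><head><title>My Page</title></head></html>';

// The lookbehind and lookahead assert the tags without consuming them,
// so the whole match ($m[0]) is just the text between the tags.
$pattern = '~(?<=<title>).*?(?=</title>)~';

$title = null;
if (preg_match($pattern, $html, $m)) {
    $title = $m[0];
}
```

Because the tags sit in zero-width assertions, no capture group is needed. Note that `.` does not match newlines by default, so a title spanning several lines would need the `s` modifier.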

In your case, you could just use the second capture group from your regex, which holds the text you are interested in.

Since you mention preg_match in your tags, I am assuming you want this for PHP.

$matches = array();
$pattern = '#<title>(.*?)</title>#'; // note I changed the pattern a bit
// preg_match returns 1 on a match; $matches[1] is only set in that case
$title = preg_match($pattern, $string, $matches) ? $matches[1] : null;

Note that this is actually the first capture group in my pattern, since I've omitted the parentheses around the tags themselves, which were not needed.

Typically, you should not use regex to parse HTML documents, but I think this is one of those exceptional cases where it is not so bad, since the title tag should appear only once on the page.

I used this regex with a replace function, replacing every tag it matches with an empty string so that only the text remains: (<.+?>)
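In PHP, that replace step could look roughly like this, assuming preg_replace is the replace function meant and using a made-up input string:

```php
<?php
// Hypothetical sample input, not taken from the question.
$html = '<p>Hello <b>world</b></p>';

// Lazily match each '<...>' tag and replace it with nothing,
// which leaves only the text that sat between the tags.
$text = preg_replace('/(<.+?>)/', '', $html);
```

This removes every tag, not just one specific element, so it fits when all markup should go and only the text content matters.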
