Regular expression to match all characters between <h1> tags

Posted by 安稳与你 on 2019-12-03 05:04:26

Question


I'm using the Sublime Text 2 editor. I would like to use a regex to match all characters between each pair of h1 tags.

Right now I'm using this:

<h1>.+</h1>

It works fine if the h1 element doesn't contain line breaks.

For example, for

<h1>Hello this is a header</h1>

it works fine.

But it doesn't work if the tag looks like this:

<h1>
   Hello this is a header
</h1>

Can someone help me with the syntax?


Answer 1:


By default, . matches every character except the newline character.

In this case you need the DOTALL option, which makes . match any character, including newlines. The DOTALL option can be specified inline as (?s). For example:

(?s)<h1>.+</h1>

However, this will not work as intended when the text contains more than one h1 element, because quantifiers are greedy by default (here, +): they consume as many characters as possible, so a single match runs from the first <h1> to the last </h1>. Make the quantifier lazy (consume as few characters as possible) by adding a ? after it, giving +?:

(?s)<h1>.+?</h1>
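
The difference between the three patterns can be checked with a small script. This is only a sketch: it uses Python's re module as a stand-in (an assumption, since the question is about the Sublime Text search box, whose engine may differ in details), and the sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample: one multi-line header and one single-line header.
html = "<h1>\n   First header\n</h1>\n<p>Some text</p>\n<h1>Second header</h1>"

# Without DOTALL, "." stops at newlines, so the multi-line header is missed.
no_dotall = re.findall(r"<h1>.+</h1>", html)

# DOTALL with a greedy "+": one match from the first <h1> to the last </h1>.
greedy = re.findall(r"(?s)<h1>.+</h1>", html)

# DOTALL with a lazy "+?": each header is matched separately.
lazy = re.findall(r"(?s)<h1>.+?</h1>", html)

print(no_dotall)    # only the single-line header
print(len(greedy))  # a single match covering the whole sample
print(lazy)         # two separate matches
```

The greedy result is a single match that swallows the <p> element in the middle, which is why the lazy form is the one to use here.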

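Alternatively, the regex can be <h1>[^<>]*</h1>. In this case you don't need to specify any option, because a negated character class also matches newline characters. Note that it only matches h1 elements that contain no other tags.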
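
A short check of the character-class variant, again using Python's re module as an assumed stand-in for the editor's engine and a made-up sample. It shows both that no option is needed for multi-line headers and that a header containing another tag is skipped.

```python
import re

# Hypothetical sample: multi-line header, plain header, header with a nested tag.
html = (
    "<h1>\n   First header\n</h1>\n"
    "<h1>Second header</h1>\n"
    "<h1>With <em>nested</em> tag</h1>"
)

# [^<>] matches any character except "<" and ">", newlines included,
# so no DOTALL flag is required.
pattern = re.compile(r"<h1>[^<>]*</h1>")
matches = pattern.findall(html)

print(matches)  # the first two headers; the one with <em> inside is not matched
```

If the headers may contain inline markup such as <em>, the lazy DOTALL pattern is the safer choice of the two.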




Answer 2:


Since this question is the top Google result for a regex that finds all the characters between h1 tags, and that was what I was looking for, I thought I would give that answer as well:

(?s)(?<=<h1>)(.+?)(?=</h1>)

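That regex, if used on a sample text like <h1>A title</h1> <p>Some content</p> <h1>Another title</h1>, returns only the text inside the tags, without the tags themselves: A title for the first match and Another title for the second.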
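
The lookaround pattern can be tried on the same sample text. This is a minimal sketch in Python's re module, which is an assumption on my part since the answer does not name an engine; the variable names are illustrative.

```python
import re

text = "<h1>A title</h1> <p>Some content</p> <h1>Another title</h1>"

# (?<=<h1>) and (?=</h1>) assert that the tags are present
# without including them in the match.
pattern = re.compile(r"(?s)(?<=<h1>)(.+?)(?=</h1>)")

first = pattern.search(text).group(0)  # first match only
all_titles = pattern.findall(text)     # every match, tags excluded

print(first)
print(all_titles)
```

Because the tags sit inside lookarounds, the match itself is just the title text, so no capture group post-processing is needed in a find-and-replace.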



Source: https://stackoverflow.com/questions/14525286/regular-expression-to-match-all-characters-between-h1-tag
