RegEx to return 'href' attribute of 'link' tags only?

有些话、适合烂在心里 posted on 2019-11-27 07:16:10

Question


I'm trying to craft a regex that returns only the href attributes of <link> tags.

Why does this regex return all hrefs, including those in <a> tags?

    (?<=<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+

    <link rel="stylesheet" rev="stylesheet" 
    href="idlecore-tidied.css?T_2_5_0_228" media="screen">
    <a href="anotherurl">Slash Boxes</a>

Thank you.


Answer 1:


Either

/(?<=<link\b[^<>]*?)\bhref\s*=\s*(?:"[^"]*"|'[^']*'|\S+)/

or

/<link\b[^<>]*?\b(href\s*=\s*(?:"[^"]*"|'[^']*'|\S+))/

The main difference from your pattern is [^<>]*? in place of .*?. This is because you don't want the match to run on into other tags.
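
To make the difference concrete, here is a minimal Python sketch using the capture-group form of the pattern. Python is only an assumption for illustration (the thread does not name a language), and the sample string `html` is made up: a `<link>` tag with no href, followed by an `<a>` tag that has one.

```python
import re

# Hypothetical input: the <link> tag has no href, the <a> tag does.
html = '<link rel="icon"><a href="anotherurl">Slash Boxes</a>'

# .*? is allowed to run past the end of the <link> tag.
leaky = re.compile(r'''<link\b.*?\b(href\s*=\s*(?:"[^"]*"|'[^']*'|\S+))''')

# [^<>]*? cannot cross a tag boundary.
bounded = re.compile(r'''<link\b[^<>]*?\b(href\s*=\s*(?:"[^"]*"|'[^']*'|\S+))''')

leaky_match = leaky.search(html)
bounded_match = bounded.search(html)

print(leaky_match.group(1))   # the href of the <a> tag is picked up
print(bounded_match)          # None: no href inside the <link> tag
```

The first pattern reports the `<a>` tag's href because nothing stops `.*?` at the closing `>`; the second finds nothing, which is the desired result for this input.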




Answer 2:


Avoid lookbehind for such a simple case: just match what you need, and capture what you want to get.

I got good results with <link\s+[^>]*(href\s*=\s*(['"]).*?\2) in The Regex Coach with the s and g options.
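
As a rough equivalent outside The Regex Coach, the sketch below runs the same pattern in Python against the sample from the question. It assumes `re.S` stands in for the s option and `finditer` for the g option; the variable names are illustrative only.

```python
import re

# Sample markup from the question.
html = '''<link rel="stylesheet" rev="stylesheet"
href="idlecore-tidied.css?T_2_5_0_228" media="screen">
<a href="anotherurl">Slash Boxes</a>'''

# re.S lets "." match newlines (the "s" option);
# finditer walks over every match (the "g" option).
pattern = re.compile(r'''<link\s+[^>]*(href\s*=\s*(['"]).*?\2)''', re.S)

# Group 1 holds the whole href attribute, group 2 the quote character.
link_hrefs = [m.group(1) for m in pattern.finditer(html)]
print(link_hrefs)
```

Only the `<link>` tag's attribute is returned; the `<a>` tag is never considered because every match has to start at `<link`.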




Answer 3:


/(?<=<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+[^>]*>/

I'm a little shaky on lookbehind myself, so I left that in there. This regex, though:

/(<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+[^>]*>/

...works in my JavaScript test.




Answer 4:


(?<=<link\s+.*?)href\s*=\s*[\'\"][^\'\"]+

works with Expresso (I think Expresso runs on the .NET regex engine). You could even refine this a bit more so that the closing ' or " has to match the opening one:

(?<=<link\s+.*?)href\s*=\s*([\'\"])[^\'\"]+(\1)

Perhaps your regex engine doesn't support lookbehind assertions. A workaround would be

(?:<link\s+.*?)(href\s*=\s*([\'\"])[^\'\"]+(\2))

Your match will then be in capture group 1.
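
One way to check whether an engine accepts the lookbehind form is simply to try compiling it. The sketch below does this with Python's standard `re` module, which (as far as I know) rejects variable-length lookbehind, and then falls back to the capture-group workaround. The helper `compiles` and the one-line sample tag are hypothetical, added only for illustration.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if this engine accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

lookbehind = r'''(?<=<link\s+.*?)href\s*=\s*(['"])[^'"]+\1'''
workaround = r'''(?:<link\s+.*?)(href\s*=\s*(['"])[^'"]+\2)'''

print(compiles(lookbehind))   # Python's re refuses variable-length lookbehind
print(compiles(workaround))

# Single-line sample, so "." does not need to match newlines here.
html = '<link rel="stylesheet" href="idlecore-tidied.css" media="screen">'
match = re.search(workaround, html)
href = match.group(1) if match else None
print(href)
```

If the tag spans several lines, as in the question, `.*?` would also need a dot-matches-newline flag such as `re.S`.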




Answer 5:


What regex flavor are you using? Perl, for one, doesn't support variable-length lookbehind. Where that's an option, I'd choose (edited to implement the very good idea from MizardX):

(?<=<link\b[^<>]*?)href\s*=\s*(['"])(?:(?!\1).)+\1

as a first approximation. That way the closing quote has to be the same character (' or ") as the opening one. The equivalent for a flavor without support for (variable-length) lookbehind:

(?:<link\b[^<>]*?)(href\s*=\s*(['"])(?:(?!\2).)+\2)

Group 1 (\1) will contain your match.
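
The `(?:(?!\2).)+` construct matters when the value contains the other quote character. The following Python sketch compares it with the simpler `[^'"]+` class; the sample tag with an apostrophe in the file name is invented for the purpose.

```python
import re

# Hypothetical tag whose double-quoted value contains a single quote.
tag = '''<link rel="stylesheet" href="it's-a-theme.css">'''

# Excludes both quote characters, so it cannot get past the apostrophe.
simple = re.compile(r'''<link\b[^<>]*?(href\s*=\s*(['"])[^'"]+\2)''')

# Excludes only the quote character that opened the value.
tempered = re.compile(r'''<link\b[^<>]*?(href\s*=\s*(['"])(?:(?!\2).)+\2)''')

simple_match = simple.search(tag)
tempered_match = tempered.search(tag)

print(simple_match)              # None
print(tempered_match.group(1))   # the full href attribute
```

The negative lookahead checks each character against whichever quote was captured, so the value ends only at the matching closing quote.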



Source: https://stackoverflow.com/questions/268338/regex-to-return-href-attribute-of-link-tags-only
