VBScript regex replace

Submitted by 孤街醉人 on 2020-07-18 21:31:37

Question


I have no idea why this replacement is only applied to the last instance found, rather than to all of them as I would expect. Any help is appreciated.

Input string:

<a href="http://www.scirra.com" target="_blank" rel="nofollow">http://www.scirra.com</a><br /><br />
<a href="http://www.scirra.com" target="_blank" rel="nofollow">http://www.scirra.com</a><br /><hr>

Regex:

'SEO scirra links
Dim regEx
Set regEx = New RegExp

' BB code urls
With regEx
    .Pattern = "<a href=\""http://www.scirra.com([^\]]+)\"" target=\""_blank\"" rel=\""nofollow\"">"
    .IgnoreCase = True
    .Global = True
    .MultiLine = True
End With
strMessage = regEx.Replace(strMessage, "<a href=""http://www.scirra.com$1"" target=""_blank"" title=""Some value insert here"">")

set regEx = nothing

Output:

<a href="http://www.scirra.com" target="_blank" rel="nofollow">http://www.scirra.com</a><br /><br />
<a href="http://www.scirra.com" target="_blank" title="Some value insert here">http://www.scirra.com</a><br /><hr>

Can anyone shed light on why it's only adding the title to the last found instance? (I've tested with more, always just applies to last one)


Answer 1:


It is because of this in your regex:

...a.com-->([^\]]+)<--

This matches one or more characters that are not a ]. Since your input contains no ] at all, it swallows everything (yes, even newlines) and then has to backtrack in order to satisfy the rest of your regex, which means it backtracks to the last occurrence of " target="_blank" ... As a result, everything from the first link to the last one is a single match, and only the final rel="nofollow" is rewritten.
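To see the effect directly, here is a minimal sketch that counts matches for the greedy character class and for a quote-bounded one. CountMatches is a hypothetical helper, the input is rebuilt from the question's sample, the dots in the host name are escaped, and WScript.Echo assumes the script runs under cscript.exe (in classic ASP you would use Response.Write instead).

```vbscript
Option Explicit

' Hypothetical helper: returns how many separate matches a pattern finds.
Function CountMatches(ByVal pattern, ByVal text)
    Dim re
    Set re = New RegExp
    re.Pattern = pattern
    re.IgnoreCase = True
    re.Global = True
    CountMatches = re.Execute(text).Count
End Function

Dim link, strMessage
link = "<a href=""http://www.scirra.com"" target=""_blank"" rel=""nofollow"">http://www.scirra.com</a>"
strMessage = link & "<br /><br />" & vbCrLf & link & "<br /><hr>"

' Greedy [^\]]+ : the group runs from the first URL to the last set of
' closing attributes, so both links collapse into a single match.
WScript.Echo CountMatches("<a href=""http://www\.scirra\.com([^\]]+)"" target=""_blank"" rel=""nofollow"">", strMessage)

' [^""]* cannot cross the closing quote of href, so each link matches on its own.
WScript.Echo CountMatches("<a href=""http://www\.scirra\.com([^""]*)"" target=""_blank"" rel=""nofollow"">", strMessage)
```

With this input the first call should report 1 and the second 2. Note the * rather than + in the second pattern: in the sample input nothing follows .com before the closing quote, so a + there would never match.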

If you want to replace the rel="nofollow" and allow any path behind http://www.scirra.com, you can use this regex instead:

(<a href="http://www\.scirra\.com((/[^/"]+)*/?)" target="_blank" )rel="nofollow">

and replace that with:

$1title="Some value insert here">

Adapting your current code:

Dim regEx
Set regEx = New RegExp

' BB code urls
With regEx
    .Pattern = "(<a href=""http://www\.scirra\.com((/[^""/]+)*/?)"" target=""_blank"" )rel=""nofollow"">"
    .IgnoreCase = True
    .Global = True
    .MultiLine = True
End With
strMessage = regEx.Replace(strMessage, "$1title=""Some value insert here"">")

Note, however, that this pattern is quite restrictive about which links it rewrites: it only matches when target is exactly "_blank" and is followed directly by rel="nofollow". Is it possible that target has another value, or that the tag carries more attributes?
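If the tags can vary like that, a looser pattern is one option. The following is only a sketch under assumptions: href is the first attribute, rel="nofollow" appears somewhere after it, and the input string (a different target plus a class attribute) is made up for illustration. Regex-based HTML rewriting stays fragile either way.

```vbscript
Option Explicit

Dim regEx, strMessage

' Hypothetical input: a different target value and an extra class attribute.
strMessage = "<a href=""http://www.scirra.com/forum"" target=""_top"" class=""ext"" rel=""nofollow"">forum</a>"

Set regEx = New RegExp
With regEx
    ' $1 = everything from <a up to (not including) the whitespace before rel
    ' $2 = whatever follows rel="nofollow", up to and including the closing >
    ' [^>]*? is lazy and cannot cross a >, so each match stays inside one tag.
    .Pattern = "(<a\s+href=""http://www\.scirra\.com(?:/[^""]*)?""[^>]*?)\s+rel=""nofollow""([^>]*>)"
    .IgnoreCase = True
    .Global = True
End With

' Drop rel="nofollow" and put the title attribute in its place.
strMessage = regEx.Replace(strMessage, "$1 title=""Some value insert here""$2")
WScript.Echo strMessage

Set regEx = Nothing
```

Because neither [^""]* nor [^>]* can run past the end of the tag, the greedy-match problem from the question does not come back, while any attributes between href and rel are preserved as they are.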



Source: https://stackoverflow.com/questions/8859541/vbscript-regex-replace
