Get the non-matching part of the pattern through a RegEx

Posted by 不羁岁月 on 2020-01-01 17:08:13

Question


In this topic, the idea is to extract the numerics separated by an "x" through a RegEx: How to extract ad sizes from a string with excel regex

Thus from:

uni3uios3_300x250_ASDF.html

I want to achieve through RegEx:

300x250

I have managed to achieve the exact opposite, and I have been struggling for some time to get the result I need. This is what I have so far:

Public Function regExSampler(s As String) As String

    Dim regEx           As Object
    Dim inputMatches    As Object
    Dim regExString     As String

    Set regEx = CreateObject("VBScript.RegExp")    
    With regEx
        .Pattern = "(([0-9]+)x([0-9]+))"
        .IgnoreCase = True
        .Global = True    
        Set inputMatches = .Execute(s)    
        If regEx.test(s) Then
            regExSampler = .Replace(s, vbNullString)
        Else
            regExSampler = s
        End If    
    End With

End Function

Public Sub TestMe()    
    Debug.Print regExSampler("uni3uios3_300x250_ASDF.html")
    Debug.Print regExSampler("uni3uios3_34300x25_ASDF.html")
    Debug.Print regExSampler("uni3uios3_8x4_ASDF.html")    
End Sub

If you run TestMe, you get:

uni3uios3__ASDF.html 
uni3uios3__ASDF.html
uni3uios3__ASDF.html

And this is exactly the part I want stripped out through RegEx, so that only the size (e.g. 300x250) is left.


Answer 1:


Change the IF block to

    If regEx.test(s) Then
        regExSampler = InputMatches(0)
    Else
        regExSampler = s
    End If

And your results will return

300x250
34300x25
8x4

This works because inputMatches is the match collection returned by .Execute(s): it holds the substrings of s that matched the pattern, and inputMatches(0) is the first of them.
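Put together, the change above could look like the following sketch. The function name ExtractAdSize is hypothetical (it is not in the original post), and the sketch assumes that only the first size in the string is wanted, so .Global is set to False and the match count is checked instead of calling .Test a second time.

```vba
' Sketch only: ExtractAdSize is a hypothetical name, not from the original post.
Public Function ExtractAdSize(s As String) As String

    Dim regEx           As Object
    Dim inputMatches    As Object

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        .Pattern = "[0-9]+x[0-9]+"
        .IgnoreCase = True
        .Global = False                 ' assumption: only the first size matters
        Set inputMatches = .Execute(s)  ' collection of matched substrings
    End With

    If inputMatches.Count > 0 Then
        ExtractAdSize = inputMatches(0).Value   ' text of the first match
    Else
        ExtractAdSize = s                       ' no size found: return input unchanged
    End If

End Function
```

Calling ExtractAdSize on the three strings from TestMe should give the same three sizes listed above.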




Answer 2:


As requested by the OP, I'm posting this as an answer:

Solution:

^.*\D(?=\d+x\d+)|\D+$

Demonstration: regex101.com

Explanation:

  • ^.*\D - Starting at the beginning of the string, this matches every character up to and including a non-digit (\D) character.
  • (?=\d+x\d+) - This is a positive lookahead. It means that the previous pattern (^.*\D) only matches if it is immediately followed by the pattern inside the lookahead (\d+x\d+). The lookahead itself doesn't consume any characters, so the text matching \d+x\d+ is not part of the match.
  • \d+x\d+ - This is equivalent to [0-9]+x[0-9]+; \d is a token that represents any digit character.
  • \D+$ - This matches one or more non-digit characters running up to the end of the string.
  • Finally, both patterns are joined by an alternation (|), so the regex matches either one. Replacing every match with an empty string (with .Global = True) removes the text before and after the size and leaves only the size itself, e.g. 300x250.
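As a rough sketch of how this pattern could be plugged into the VBA code from the question: the function name StripAroundAdSize and the test sub are hypothetical, and the sketch relies on VBScript.RegExp supporting lookahead. Note that \D+$ only removes trailing non-digits, so a digit after the size (for example in "_ASDF2.html") would leave part of the suffix in place.

```vba
' Sketch only: StripAroundAdSize and TestStrip are hypothetical names.
Public Function StripAroundAdSize(s As String) As String

    Dim regEx As Object

    Set regEx = CreateObject("VBScript.RegExp")
    With regEx
        ' Matches the text before the size, or the non-digit tail after it
        .Pattern = "^.*\D(?=\d+x\d+)|\D+$"
        .IgnoreCase = True
        .Global = True      ' both the prefix and the suffix must be replaced
        StripAroundAdSize = .Replace(s, vbNullString)
    End With

End Function

Public Sub TestStrip()
    Debug.Print StripAroundAdSize("uni3uios3_300x250_ASDF.html")    ' 300x250
    Debug.Print StripAroundAdSize("uni3uios3_34300x25_ASDF.html")   ' 34300x25
    Debug.Print StripAroundAdSize("uni3uios3_8x4_ASDF.html")        ' 8x4
End Sub
```

Unlike the first answer, this keeps the .Replace call from the question and changes only the pattern.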


Source: https://stackoverflow.com/questions/48447265/get-the-non-matching-part-of-the-pattern-through-a-regex
