extract substring using regex in groovy

Submitted by 前提是你 on 2019-12-23 06:44:34

Question


If I have the following pattern in some text:

def articleContent =  "<![CDATA[ Hellow World ]]>"

I would like to extract the "Hellow World" part, so I use the following code to match it:

def contentRegex = "<![CDATA[ /(.)*/ ]]>"
def contentMatcher = ( articleContent =~ contentRegex )
println contentMatcher[0]

However, I keep getting a null pointer exception because the regex doesn't seem to be working. What would be the correct regex for "any piece of text", and how do I collect it from a string?


Answer 1:


Try:

def result = (articleContent =~ /<!\[CDATA\[(.+)]]>/)[0][1]

However, I worry that you are planning to parse XML with regular expressions. If this CDATA section is part of a larger valid XML document, it is better to use an XML parser.
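
As a rough illustration of the parser route, here is a minimal sketch that uses the JDK's built-in DOM parser, which is available from any Groovy version. The surrounding `article` and `content` elements are hypothetical, made up for this example; the question only shows the CDATA fragment itself.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Hypothetical document: the <article>/<content> wrapper is an assumption,
// only the CDATA section comes from the question.
def xml = '<article><content><![CDATA[ Hellow World ]]></content></article>'

def builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
def doc = builder.parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

// textContent returns the element's text, CDATA sections included
def content = doc.documentElement
                 .getElementsByTagName('content')
                 .item(0)
                 .textContent

println content.trim()
```

The parser handles the CDATA delimiters itself, so no escaping of `[` or `]` is needed, and the same code keeps working if the section spans several lines.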




Answer 2:


The code below shows substring extraction using a regex in Groovy:

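class StringHelper {
    // @NonCPS is a Jenkins Pipeline annotation; remove it when running outside Jenkins.
    @NonCPS
    static String stripSshPrefix(String gitUrl) {
        // Capture everything after the leading "ssh://"
        def match = (gitUrl =~ /ssh:\/\/(.+)/)
        if (match.find()) {
            return match.group(1)
        }
        // No ssh:// prefix: return the URL unchanged
        return gitUrl
    }

    static void main(String... args) {
        def gitUrl = "ssh://git@github.com:jiahut/boot.git"
        def gitUrl2 = "git@github.com:jiahut/boot.git"
        println(stripSshPrefix(gitUrl))   // git@github.com:jiahut/boot.git
        println(stripSshPrefix(gitUrl2))  // git@github.com:jiahut/boot.git
    }
}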



Answer 3:


A little bit late to the party, but try defining your pattern as a slashy string (delimited by forward slashes, so backslash escapes need no doubling). For example:

 def articleContent =  "real groovy"
 def matches = (articleContent =~ /gr\w{4}/) //grabs 'gr' and its following 4 chars
 def firstmatch = matches[0]  //firstmatch would be 'groovy'

You were on the right track; it was just the pattern definition that needed to be altered.
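
Applied to the input from the question, the same idea might look like the sketch below. In the original pattern the slashes sat inside a double-quoted string, so they were literal characters of the regex rather than delimiters, and the unescaped `[` opened a character class. The lazy group `(.*?)`, the `find()` guard and the `trim()` call are choices made for this illustration, not part of the original answer.

```groovy
def articleContent = "<![CDATA[ Hellow World ]]>"

// Slashy string: \[ reaches the regex engine as an escaped, literal '['.
// (.*?) lazily captures the CDATA body (it does not span line breaks
// unless the (?s) flag is added).
def contentRegex = /<!\[CDATA\[(.*?)\]\]>/

def contentMatcher = (articleContent =~ contentRegex)

// find() reports whether there is a match, so group(1) is only read when safe
def content = contentMatcher.find() ? contentMatcher.group(1).trim() : null

println content
```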

References:

https://www.regular-expressions.info/groovy.html

http://mrhaki.blogspot.com/2009/09/groovy-goodness-matchers-for-regular.html




Answer 4:


One more single-line solution, in addition to tim_yates's:

def result = articleContent.replaceAll(/<!\[CDATA\[(.+)]]>/,/$1/)

Note that if the regex doesn't match, result will simply be equal to the source string. By contrast, in the case of

def result = (articleContent =~ /<!\[CDATA\[(.+)]]>/)[0][1]

it will raise an exception.
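
The difference between the two one-liners can be checked with a short sketch like the one below. The `extractCdata` closure is a hypothetical helper added here to show a third option that returns null on a miss; the replacement is written as `'$1'` in a plain single-quoted string, which refers to the same capture group.

```groovy
def pattern = /<!\[CDATA\[(.+)]]>/
def matching = "<![CDATA[ Hellow World ]]>"
def nonMatching = "no CDATA section here"

// replaceAll: a non-matching input comes back unchanged
def replaced = matching.replaceAll(pattern, '$1')
def untouched = nonMatching.replaceAll(pattern, '$1')

// Indexing the matcher: throws when there is no match at index 0
def threw = false
try {
    def unused = (nonMatching =~ pattern)[0][1]
} catch (Exception e) {
    threw = true
}

// Hypothetical helper: returns null instead of throwing or echoing the input
def extractCdata = { String text ->
    def m = (text =~ pattern)
    m.find() ? m.group(1) : null
}

println "replaced=[${replaced}] untouched=[${untouched}] threw=${threw}"
println "helper: ${extractCdata(matching)} / ${extractCdata(nonMatching)}"
```

Which behavior is preferable depends on the caller: silently returning the input can hide a failed match, while an explicit null or exception makes it visible.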



Source: https://stackoverflow.com/questions/17536921/extract-substring-using-regex-in-groovy
