How to match multiple regexes on the same string in Scala

醉酒当歌 submitted on 2020-01-30 08:52:25

Question


I need to extract the matches of several regex patterns from a single String:

val str = "<a href=\"https://page1.google.com/ab-cd/ABCDEF\">Hello</a> hiiii <a href=\"https://page2.yahoo.com/gr\">page</a><img src=\"https://image01.google.com/gr/content/attachment/987654321\" alt=\"demo image\"></a><a href=\"https://page3.google.com/hr\">"

With the code below:

import java.util.regex.Pattern

val p = Pattern.compile("href=\"(.*?)\"")
val m = p.matcher(str)
// Print the URL captured by group 1 for every match
while (m.find()) {
  println(m.group(1))
}

I am getting output:

https://page1.google.com/ab-cd/ABCDEF
https://page2.yahoo.com/gr
https://page3.google.com/hr

After changing the pattern:

val p = Pattern.compile("img src=\"(.*?)\"")

I am getting output:

https://image01.google.com/gr/content/attachment/987654321

But with the combined pattern:

val p = Pattern.compile("href=\"(.*?)\"|img src=\"(.*?)\"")

I am getting output:

https://page1.google.com/ab-cd/ABCDEF
https://page2.yahoo.com/gr
null
https://page3.google.com/hr

How can I match multiple regex patterns in one pass, or is there an easier way to do this?

Thanks


Answer 1:


You may use

val rx = "(?:href|img src)=\"(.*?)\"".r
val results = rx.findAllMatchIn(s).map(_ group 1)
// println(results.mkString(", ")) prints:
//  https://page1.google.com/ab-cd/ABCDEF, 
//  https://page2.yahoo.com/gr, 
//  https://image01.google.com/gr/content/attachment/987654321, 
//  https://page3.google.com/hr


Details

  • (?:href|img src)=\"(.*?)\" matches either href or img src, followed by =", then captures zero or more characters other than line breaks, as few as possible, into Group 1, and finally matches the closing ".
  • .findAllMatchIn returns every match as a Match object, and .map(_ group 1) then extracts only the Group 1 value from each. The result is an Iterator[String], so it can be traversed only once.
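
The null entry in the question's output comes from group numbering: in href=\"(.*?)\"|img src=\"(.*?)\" the img src alternative captures into Group 2, so Group 1 is null whenever that alternative matches. The sketch below compares the two patterns side by side. The object name UrlExtractor and its method names are hypothetical, chosen only for illustration, and the sample string assumes the input uses plain ASCII quotes throughout.

```scala
import scala.util.matching.Regex

// Hypothetical helper object; the names are illustrative, not from the original post.
object UrlExtractor {
  // Sample input from the question, assuming plain ASCII quotes.
  val sample: String =
    "<a href=\"https://page1.google.com/ab-cd/ABCDEF\">Hello</a> hiiii " +
      "<a href=\"https://page2.yahoo.com/gr\">page</a>" +
      "<img src=\"https://image01.google.com/gr/content/attachment/987654321\" alt=\"demo image\"></a>" +
      "<a href=\"https://page3.google.com/hr\">"

  // The question's pattern: each alternative has its own capturing group.
  val twoGroups: Regex = "href=\"(.*?)\"|img src=\"(.*?)\"".r

  // The answer's pattern: a non-capturing alternation with one shared group.
  val oneGroup: Regex = "(?:href|img src)=\"(.*?)\"".r

  // Reads Group 1 only, so the img src match yields null.
  def groupOneOnly(s: String): List[String] =
    twoGroups.findAllMatchIn(s).map(_.group(1)).toList

  // Keeps the two-group pattern but falls back to Group 2 when Group 1 is null.
  def eitherGroup(s: String): List[String] =
    twoGroups.findAllMatchIn(s)
      .map(m => Option(m.group(1)).getOrElse(m.group(2)))
      .toList

  // The answer's approach: every URL lands in Group 1.
  def sharedGroup(s: String): List[String] =
    oneGroup.findAllMatchIn(s).map(_.group(1)).toList
}
```

Calling UrlExtractor.sharedGroup(UrlExtractor.sample) or UrlExtractor.eitherGroup(UrlExtractor.sample) should return all four URLs in document order, while groupOneOnly leaves a null in the third position. The shared-group form is usually the simpler choice because no null handling is needed.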


Source: https://stackoverflow.com/questions/56473995/how-to-get-multiple-regex-on-same-string-in-scala
