Scala Regex enable Multiline option

Submitted by 我与影子孤独终老i on 2019-11-28 05:54:58
Daniel C. Sobral

This is a very common problem when first using Scala Regex.

When you use pattern matching in Scala, the regex must match the whole string, as if the pattern were wrapped in "^" and "$" (with multi-line mode off, so "^" and "$" match only at the start and end of the input rather than at every line break).
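
A minimal sketch of that behaviour, assuming a hypothetical pattern and input (the names Node and input are illustrative and not taken from the question):

```scala
import scala.util.matching.Regex

object WholeStringMatchDemo {
  // Hypothetical pattern and input, chosen only for illustration.
  val Node: Regex = new Regex("<com:Node>")
  val input: String = "line one\n<com:Node>\nline three"

  // As an extractor, the regex has to match the entire input,
  // so the first case is not taken for this input.
  val matchesWhole: Boolean = input match {
    case Node() => true
    case _      => false
  }

  // findFirstIn looks for the pattern anywhere inside the input.
  val found: Option[String] = Node.findFirstIn(input)

  def main(args: Array[String]): Unit = {
    println(matchesWhole)
    println(found)
  }
}
```

The extractor fails here even though the text is present, while findFirstIn locates it.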

The way to do what you want would be one of the following:

def matchNode( value : String ) : Boolean = 
  (ScriptNode findFirstIn value) match {    
    case Some(v) => println( "found" + v ); true    
    case None => println("not found: " + value ) ; false
  }

This finds the first instance of ScriptNode inside value and returns that instance as v (if you want the whole string, just print value). Or else:

val ScriptNode =  new Regex("""(?s).*<com:Node>.*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode() => println( "found" + value ); true    
    case _ => println("not found: " + value ) ; false
  }

This prints the whole of value. In this example, (?s) activates dotall matching (i.e., "." also matches newlines), and the .* before and after the searched-for pattern lets the regex match the entire string whenever the string contains that pattern. If you wanted "v" as in the first example, you could do this:

val ScriptNode =  new Regex("""(?s).*(<com:Node>).*""")
def matchNode( value : String ) : Boolean = 
  value match {    
    case ScriptNode(v) => println( "found" + v ); true    
    case _ => println("not found: " + value ) ; false
  }
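
To see what (?s) changes in the patterns above, here is a small sketch that checks the same whole-string match with and without the flag. The input string is a made-up example, and the check goes through the underlying java.util.regex.Pattern rather than a match expression:

```scala
object DotAllDemo {
  // Hypothetical multi-line input, for illustration only.
  val input: String = "first line\n<com:Node>\nlast line"

  // Without (?s), "." does not match line terminators,
  // so .* cannot stretch across the line breaks.
  val withoutDotAll: Boolean =
    """.*<com:Node>.*""".r.pattern.matcher(input).matches()

  // With (?s), "." also matches line terminators,
  // so the pattern can cover the whole input.
  val withDotAll: Boolean =
    """(?s).*<com:Node>.*""".r.pattern.matcher(input).matches()

  def main(args: Array[String]): Unit = {
    println(withoutDotAll)
    println(withDotAll)
  }
}
```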

Just a quick and dirty addendum: the .r method, available on any String through an implicit conversion (RichString in Scala 2.7, StringOps in later versions), converts a string to a scala.util.matching.Regex, so you can do something like this:

"""(?s)a.*b""".r replaceAllIn ( "a\nb\nc\n", "A\nB" )

And that will return

A
B
c

I use this all the time for quick and dirty regex-scripting in the scala console.

Or in this case:

def matchNode( value : String ) : Boolean = {
    // Build the regex with .r instead of new Regex(...)
    val ScriptNode = """(?s).*(<com:Node>).*""".r
    value match {
       case ScriptNode(v) => println( "found" + v ); true
       case _ => println("not found: " + value ) ; false
    }
}

Just my attempt to reduce the use of the word new in code worldwide. ;)

Just a small addition: you tried to use the (?m) (multiline) flag. It might not be suitable here, but this is the right way to use it:

e.g. instead of

val ScriptNode =  new Regex("""<com:Node>?m""")

use

val ScriptNode =  new Regex("""(?m)<com:Node>""")

But again, the (?s) flag is more suitable for this question. (I am adding this answer only because the title is "Scala Regex enable Multiline option".)
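
For reference, a short sketch of what (?m) actually changes: it makes "^" and "$" match at line boundaries, not only at the ends of the input. The input below is a made-up example, and the anchored pattern is an assumption for illustration; the question's own pattern has no anchors, which is why (?m) alone has no effect on it.

```scala
object MultilineFlagDemo {
  // Hypothetical multi-line input, for illustration only.
  val input: String = "alpha\n<com:Node>\nomega"

  // Without (?m), ^ and $ match only at the start and end of the whole input.
  val plain: Option[String] =
    """^<com:Node>$""".r.findFirstIn(input)

  // With (?m), ^ and $ also match right after and right before line terminators.
  val multiline: Option[String] =
    """(?m)^<com:Node>$""".r.findFirstIn(input)

  def main(args: Array[String]): Unit = {
    println(plain)
    println(multiline)
  }
}
```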
