Scala regex pattern match of IP address

守給你的承諾、 submitted on 2021-02-16 15:09:51

Question


I can't understand why this code returns false:

      val reg = """.*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
      "ttt20.30.4.140ttt" match{
        case reg(one, two, three, four) =>
          if (host == one + "." + two + "." + three + "." + four) true else false
        case _ => false
      }

and only if I change it to:

  val reg = """.*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
  "20.30.4.140" match{
    case reg(one, two, three, four) =>
      if (host == one + "." + two + "." + three + "." + four) true else false
    case _ => false
  }

does it match.


Answer 1:


Your variant

def main(args: Array[String]): Unit = {
  val regex = """.*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
  val x = "ttt20.30.4.140ttt"

  x match {
    // Print the four captured groups as an explicit tuple.
    case regex(ip1, ip2, ip3, ip4) => println((ip1, ip2, ip3, ip4))
    case _ => println("No match.")
  }
}

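matches, but not the way you intend: the result is (0,30,4,140) instead of (20,30,4,140). The leading .* is greedy, so it consumes as much input as it can while still letting the rest of the pattern match. That leaves only the single digit 0 for the first group.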

For example, .*(\d{1,3}) could split ab12 in two ways:

  • ab and 12
  • ab1 and 2 (this is the split chosen, because .* consumes as much input as it can)
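
The ab12 example can be checked directly. The following is a minimal sketch, not part of the original answer: the object name `GreedyVsReluctant` and its helper methods are made up for illustration, and it relies on the fact that a Scala `Regex` used in a `match` must match the whole input.

```scala
object GreedyVsReluctant {
  // Both patterns are anchored when used in a pattern match:
  // the entire input string has to be consumed.
  private val greedy    = """.*(\d{1,3})""".r
  private val reluctant = """.*?(\d{1,3})""".r

  // Returns the digits captured after a greedy ".*" prefix.
  def greedyDigits(s: String): Option[String] = s match {
    case greedy(d) => Some(d)
    case _         => None
  }

  // Returns the digits captured after a reluctant ".*?" prefix.
  def reluctantDigits(s: String): Option[String] = s match {
    case reluctant(d) => Some(d)
    case _            => None
  }

  def main(args: Array[String]): Unit = {
    println(greedyDigits("ab12"))    // Some(2): ".*" took "ab1"
    println(reluctantDigits("ab12")) // Some(12): ".*?" took only "ab"
  }
}
```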

Solutions

  1. Make .* reluctant instead of greedy by writing .*?, which gives:

    """.*?(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
    
  2. Define precisely what may precede the first number. For example, if only letters can appear there, use:

    """[a-zA-Z]*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
    
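
As a rough sketch of how the two solutions behave on the input from the question, the snippet below wraps each pattern in a helper. The object name `IpExtract` and the helper names are hypothetical, chosen only for this illustration.

```scala
object IpExtract {
  // Solution 1: reluctant prefix, so the first group keeps all of its digits.
  private val reluctantPrefix =
    """.*?(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r

  // Solution 2: the prefix may contain letters only.
  private val letterPrefix =
    """[a-zA-Z]*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r

  // Rebuilds the dotted address from the four groups, if the input matches.
  def withReluctant(s: String): Option[String] = s match {
    case reluctantPrefix(a, b, c, d) => Some(s"$a.$b.$c.$d")
    case _                           => None
  }

  def withLetterPrefix(s: String): Option[String] = s match {
    case letterPrefix(a, b, c, d) => Some(s"$a.$b.$c.$d")
    case _                        => None
  }

  def main(args: Array[String]): Unit = {
    println(withReluctant("ttt20.30.4.140ttt"))    // Some(20.30.4.140)
    println(withLetterPrefix("ttt20.30.4.140ttt")) // Some(20.30.4.140)
  }
}
```

The second pattern is stricter: it stops matching as soon as anything other than letters precedes the first number, so it only fits when the shape of the prefix is known in advance.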



Answer 2:


You should use a reluctant quantifier rather than a greedy one:

val reg = """.*?(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r


Source: https://stackoverflow.com/questions/33008914/scala-regex-pattern-match-of-ip-address
