Apply several string transformations in Scala

温柔的废话 2020-12-09 21:18

I want to perform several ordered, successive replaceAll(...,...) calls on a string in a functional way in Scala.

What's the most elegant solution? Scalaz is welcome!

5 answers
  • 2020-12-09 22:06

    First, let's get a function out of the replaceAll method:

    scala> val replace = (from: String, to: String) => (_:String).replaceAll(from, to)
    replace: (String, String) => String => java.lang.String = <function2>
    

    Now you can use the Functor instance for functions, which is defined in Scalaz. It lets you compose functions with map (or, to make it look nicer, with the Unicode aliases).

    It will look like this:

    scala> replace("from", "to") ∘ replace("to", "from") ∘ replace("some", "none")
    res0: String => java.lang.String = <function1>
    

    If you prefer Haskell-style composition (right to left), use contramap:

    scala> replace("some", "none") ∙ replace("to", "from") ∙ replace ("from", "to")
    res2: String => java.lang.String = <function1>
    

    You can also have some fun with the Category instance:

    scala> replace("from", "to") ⋙ replace("to", "from") ⋙ replace("some", "none")
    res5: String => java.lang.String = <function1>
    
    scala> replace("some", "none") ⋘ replace("to", "from") ⋘ replace ("from", "to")
    res7: String => java.lang.String = <function1>
    

    And applying it:

    scala> "somestringfromto" |> res0
    res3: java.lang.String = nonestringfromfrom
    
    scala> res2("somestringfromto")
    res4: java.lang.String = nonestringfromfrom
    
    scala> "somestringfromto" |> res5
    res6: java.lang.String = nonestringfromfrom
    
    scala> res7("somestringfromto")
    res8: java.lang.String = nonestringfromfrom
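
For comparison, the same left-to-right and right-to-left pipelines can be sketched with the standard library alone, using Function1's andThen and compose instead of the Scalaz operators. This is a minimal sketch, not part of the answer above; the names AndThenSketch, leftToRight and rightToLeft are made up for illustration.

```scala
object AndThenSketch {
  // Same helper as in the answer: each call yields a String => String.
  val replace = (from: String, to: String) => (_: String).replaceAll(from, to)

  // Left to right: each function feeds its result to the next one.
  val leftToRight: String => String =
    replace("from", "to")
      .andThen(replace("to", "from"))
      .andThen(replace("some", "none"))

  // Right to left: the function listed last runs first.
  val rightToLeft: String => String =
    replace("some", "none")
      .compose(replace("to", "from"))
      .compose(replace("from", "to"))

  def main(args: Array[String]): Unit = {
    println(leftToRight("somestringfromto"))
    println(rightToLeft("somestringfromto"))
  }
}
```

Both functions apply the three replacements in the same order, so they should agree with the Scalaz results shown above.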
    
  • 2020-12-09 22:09

    Another Scalaz-based solution to this problem would be to use the Endo monoid. This monoid captures the identity function (as the monoid's identity element) and function composition (as the monoid's append operation). This solution would be particularly useful if you have an arbitrarily-sized (even possibly empty) list of functions to apply.

    val replace = (from: String, to: String) => (_:String).replaceAll(from, to)
    
    val f: Endo[String] = List(
      replace("some", "none"),
      replace("to", "from"),
      replace("from", "to")    
    ).foldMap(_.endo)
    

    For example (using one of folone's examples):

    scala> f.run("somestringfromto")
    res0: String = nonestringfromfrom
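
To make the monoid idea concrete without Scalaz, here is a rough standard-library sketch: the identity function plays the role of the empty element and compose plays the role of append. The names EndoSketch and composeAll are hypothetical and not part of Scalaz.

```scala
object EndoSketch {
  val replace = (from: String, to: String) => (_: String).replaceAll(from, to)

  // Hypothetical helper: fold a list of String => String into one function.
  // The identity function is the "empty" element and compose is the "append",
  // so the LAST function in the list is applied first, as in the Endo example.
  def composeAll(fs: List[String => String]): String => String =
    fs.foldLeft((s: String) => s)((acc, f) => acc.compose(f))

  val f: String => String = composeAll(List(
    replace("some", "none"),
    replace("to", "from"),
    replace("from", "to")
  ))

  def main(args: Array[String]): Unit = {
    println(f("somestringfromto"))
    // An empty list yields the identity function.
    println(composeAll(Nil)("unchanged"))
  }
}
```

Because the fold starts from the identity function, an empty list is handled without a special case, which is the property the answer highlights.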
    
  • 2020-12-09 22:09

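    Define a replace function with anonymous parameters; successive replacements can then be chained. Note that in the chained call below only the first replace is the locally defined function: the second is parsed as an infix call to String.replace on the result, which replaces literal text rather than a regex.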

    scala> val s = "hello world"
    s: java.lang.String = hello world
    
    scala> def replace = s.replaceAll(_, _)
    replace: (java.lang.String, java.lang.String) => java.lang.String
    
    scala> replace("h", "H")  replace("w", "W")
    res1: java.lang.String = Hello World
    
  • 2020-12-09 22:15
    // Replace or remove multiple substrings in a string column of a Spark DataFrame.
    // Assumes a SparkSession named `spark` and a DataFrame `dfXMLCheck`
    // with a string column `payload` already exist.
    
    import org.apache.spark.sql.functions.{lit, udf}
    import spark.implicits._ // enables the $"column" syntax
    
    // To find: true if the regex matches anywhere in the string
    def isContainingContent(str: String, regexStr: String): Boolean = {
      val regex = new scala.util.matching.Regex(regexStr)
      regex.findFirstIn(str).isDefined
    }
    val colContentPresent = udf((str: String, regex: String) => {
      isContainingContent(str, regex)
    })
    
    // To remove: delete every match of the regex
    val cleanPayloadOfRemovableContent = udf((str: String, regexStr: String) => {
      val regex = new scala.util.matching.Regex(regexStr)
      regex.replaceAllIn(str, "")
    })
    
    // To define: all removable patterns combined into one regex with |
    val removableContentRegex =
      "<log:Logs>[\\s\\S]*?</log:Logs>|\\\\n<![\\s\\S]*?-->|<\\?xml[\\s\\S]*?\\?>"
    
    // To call
    val dfPayloadLogPresent = dfXMLCheck.withColumn("logsPresentInit", colContentPresent($"payload", lit(removableContentRegex)))
    val dfCleanedXML = dfPayloadLogPresent.withColumn("payload", cleanPayloadOfRemovableContent($"payload", lit(removableContentRegex)))
    
  • 2020-12-09 22:19

    If it's just a few invocations, then just chain them. Otherwise I guess I'd try this:

    Seq("a" -> "b", "b" -> "a").foldLeft("abab"){case (z, (s,r)) => z.replaceAll(s, r)}
    

    Or if you like shorter code with confusing wildcards and extra closures:

    Seq("a" -> "b", "b" -> "a").foldLeft("abab"){_.replaceAll _ tupled(_)}
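
One caveat with the fold above: replaceAll treats its first argument as a regular expression and gives `$` and `\` a special meaning in the replacement. If the pairs are meant as plain text, a variant along these lines may be safer. It is only a sketch; LiteralReplaceSketch and replaceLiterals are hypothetical names introduced here.

```scala
import java.util.regex.{Matcher, Pattern}

object LiteralReplaceSketch {
  // Hypothetical helper: apply (search -> replacement) pairs in order,
  // treating both sides as plain text rather than regex syntax.
  def replaceLiterals(pairs: Seq[(String, String)])(input: String): String =
    pairs.foldLeft(input) { case (acc, (search, replacement)) =>
      acc.replaceAll(Pattern.quote(search), Matcher.quoteReplacement(replacement))
    }

  def main(args: Array[String]): Unit = {
    // Same pairs as in the answer, applied one after the other.
    println(replaceLiterals(Seq("a" -> "b", "b" -> "a"))("abab"))
    // "." and "$1" are taken literally here.
    println(replaceLiterals(Seq("a.b" -> "$1"))("a.b axb"))
  }
}
```

The pairs are still applied successively, so a later pair sees the output of the earlier ones, exactly as in the foldLeft version.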
    