Splitting string into groups

再見小時候 2020-12-15 09:52

I'm trying to 'group' a string into segments. I guess this example will explain it more succinctly:

scala> val str: String = "aaaabbcddeeeeeeffg"
...          


        
9 answers
  • 2020-12-15 10:30

    You could use some helper functions like this:

    val str = "aaaabbcffffddeeeeefff"
    
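    // Splits off every character equal to the head (wherever it occurs) from the rest.
    def zame(chars: List[Char]) = chars.partition(_ == chars.head)
    
    def q(chars: List[Char]): List[List[Char]] = chars match {
        case Nil => Nil
        case rest =>
            val (thesame, others) = zame(rest)
            thesame :: q(others)   // recurse on what is left
    }
    
    // Turn each group of characters back into a String.
    q(str.toList) map (_.mkString)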
    

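    This should do the trick. Note that partition collects every occurrence of the head character, so a character that reappears later in the string ends up in a single group; swap partition for span if only consecutive runs should be grouped. No doubt it can be condensed further into one-liners.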
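
    To make the difference between partition and span concrete on the sample input above, here is a small sketch. The names RunGrouping, byPartition and bySpan are made up for illustration and are not part of the answer.

```scala
object RunGrouping {
  // Same recursion as q above: partition pulls every occurrence of the head
  // character out of the whole list, not just the leading run.
  def byPartition(chars: List[Char]): List[String] = chars match {
    case Nil => Nil
    case rest =>
      val (same, others) = rest.partition(_ == rest.head)
      same.mkString :: byPartition(others)
  }

  // span stops at the first character that differs, so only the leading
  // run is split off and later runs of the same character stay separate.
  def bySpan(chars: List[Char]): List[String] = chars match {
    case Nil => Nil
    case rest =>
      val (same, others) = rest.span(_ == rest.head)
      same.mkString :: bySpan(others)
  }
}

// RunGrouping.byPartition("aaaabbcffffddeeeeefff".toList)
//   List(aaaa, bb, c, fffffff, dd, eeeee)
// RunGrouping.bySpan("aaaabbcffffddeeeeefff".toList)
//   List(aaaa, bb, c, ffff, dd, eeeee, fff)
```

    Which of the two is wanted depends on whether "group" means "all equal characters" or "consecutive equal characters"; for the string in the question both give the same result.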

  • 2020-12-15 10:31

    If you want to use the Scala API, you can use the built-in groupBy for that:

    str.groupBy(c => c).values
    

    Or, if you want the result sorted and in a list:

    str.groupBy(c => c).values.toList.sorted
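
    One thing worth checking before relying on groupBy here: it collects every occurrence of a character under one key, so it only matches run-by-run grouping when no character reappears later in the string. A minimal sketch, with an object name and a second sample input that are made up for illustration:

```scala
object GroupByCheck {
  // groupBy returns a Map[Char, String]; each value holds all occurrences
  // of its key, wherever they appeared in the input.
  def sortedGroups(s: String): List[String] =
    s.groupBy(c => c).values.toList.sorted
}

// GroupByCheck.sortedGroups("aaaabbcddeeeeeeffg")
//   List(aaaa, bb, c, dd, eeeeee, ff, g)
// GroupByCheck.sortedGroups("aabbaa")
//   List(aaaa, bb)   // the two runs of 'a' are merged
```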
    
  • 2020-12-15 10:33

    Starting with Scala 2.13, List provides the unfold builder, which can be combined with String::span:

    List.unfold("aaaabbaaacdeeffg") {
      case ""   => None
      case rest => Some(rest.span(_ == rest.head))
    }
    // List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
    

    or alternatively, coupled with Scala 2.13's Option#unless builder:

    List.unfold("aaaabbaaacdeeffg") {
      rest => Option.unless(rest.isEmpty)(rest.span(_ == rest.head))
    }
    // List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
    

    Details:

    • Unfold (def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): List[A]) is based on an internal state (init) which is initialized in our case with "aaaabbaaacdeeffg".
    • For each iteration, we span (def span(p: (Char) => Boolean): (String, String)) this internal state in order to find the prefix containing the same symbol and produce a (String, String) tuple which contains the prefix and the rest of the string. span is a very good fit in this context, as it produces exactly what unfold expects: a tuple containing the next element of the list and the new internal state.
    • The unfolding stops when the internal state is "" in which case we produce None as expected by unfold to exit.
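
    For Scala versions before 2.13, where List.unfold is not available, the state threading described above can be sketched by hand. The names UnfoldSketch, unfoldList and groupRuns are hypothetical, and this is not the standard library's implementation, only a loop with the same contract.

```scala
import scala.annotation.tailrec

object UnfoldSketch {
  // Repeatedly applies f to the state: Some((element, nextState)) emits an
  // element and continues with nextState, None stops.
  def unfoldList[A, S](init: S)(f: S => Option[(A, S)]): List[A] = {
    @tailrec
    def loop(state: S, acc: List[A]): List[A] = f(state) match {
      case None               => acc.reverse
      case Some((elem, next)) => loop(next, elem :: acc)
    }
    loop(init, Nil)
  }

  def groupRuns(s: String): List[String] =
    unfoldList(s) { rest =>
      if (rest.isEmpty) None
      else Some(rest.span(_ == rest.head)) // (leading run, remaining string)
    }
}

// UnfoldSketch.groupRuns("aaaabbaaacdeeffg")
//   List(aaaa, bb, aaa, c, d, ee, ff, g)
```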
  • 2020-12-15 10:39

    Edit: I should have read more carefully. The code below is not functional-style code.

    Sometimes, a little mutable state helps:

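    def group(s: String): Seq[String] = {
      var tmp = ""                      // the run currently being collected
      val b = Seq.newBuilder[String]
    
      s.foreach { c =>
        // A different character ends the current run.
        if (tmp != "" && tmp.head != c) {
          b += tmp
          tmp = ""
        }
    
        tmp += c
      }
      if (tmp != "") b += tmp           // flush the last run; nothing to flush for ""
    
      b.result()
    }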
    

    The runtime is O(n) if segment lengths are bounded by a constant. tmp += c copies the string on every append, which is probably the largest overhead. Use a StringBuilder instead for a strict O(n) runtime.

    group("aaaabbcddeeeeeeffg")
    > Seq[String] = List(aaaa, bb, c, dd, eeeeee, ff, g)
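
    A minimal sketch of the StringBuilder variant suggested above, so that appending a character no longer copies the whole run. The object name GroupWithBuilder is made up for illustration; the logic otherwise mirrors group.

```scala
object GroupWithBuilder {
  def group(s: String): List[String] = {
    val runs = List.newBuilder[String]
    val current = new StringBuilder // the run being collected

    s.foreach { c =>
      // A different character closes the current run.
      if (current.nonEmpty && current.charAt(0) != c) {
        runs += current.toString
        current.clear()
      }
      current.append(c)
    }
    if (current.nonEmpty) runs += current.toString // flush the last run

    runs.result()
  }
}

// GroupWithBuilder.group("aaaabbcddeeeeeeffg")
//   List(aaaa, bb, c, dd, eeeeee, ff, g)
```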
    
  • 2020-12-15 10:48
    def group(s: String): List[String] = s match {
      case "" => Nil
      case s  => s.takeWhile(_==s.head) :: group(s.dropWhile(_==s.head))
    }
    

    Edit: Tail recursive version:

    def group(s: String, result: List[String] = Nil): List[String] = s match {
      case "" => result.reverse   // runs were prepended, so restore input order
      case s  => group(s.dropWhile(_==s.head), s.takeWhile(_==s.head) :: result)
    }
    

    This can be used just like the first version, because the second parameter has a default value and thus doesn't have to be supplied.
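
    As a possible refinement, the accumulator version can be annotated with @tailrec so the compiler checks that the recursive call really is in tail position, and span can replace the takeWhile/dropWhile pair so the leading run is scanned only once. This is a sketch; the object name TailRecGroup is made up for illustration.

```scala
import scala.annotation.tailrec

object TailRecGroup {
  // @tailrec makes compilation fail if the recursive call is not a tail call.
  @tailrec
  def group(s: String, result: List[String] = Nil): List[String] =
    if (s.isEmpty) result.reverse
    else {
      // span walks the leading run once instead of takeWhile + dropWhile.
      val (run, rest) = s.span(_ == s.head)
      group(rest, run :: result)
    }
}

// TailRecGroup.group("aaaabbcddeeeeeeffg")
//   List(aaaa, bb, c, dd, eeeeee, ff, g)
```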

  • Make it a one-liner:

    scala>  val str = "aaaabbcffffddeeeeefff"
    str: java.lang.String = aaaabbcffffddeeeeefff
    
    scala> str.groupBy(identity).map(_._2)
    // yields one string per distinct character, in unspecified order:
    // aaaa, bb, c, dd, eeeee, fffffff
    

    UPDATE:

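    As @Paul mentioned regarding the order, here is an updated version, sorted by character. Note that groupBy collects all occurrences of a character, so the two runs of f end up in one group:

    scala> str.groupBy(identity).toList.sortBy(_._1).map(_._2)
    res: List[String] = List(aaaa, bb, c, dd, eeeee, fffffff)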
    