Scala: Merge maps by key

Anonymous (unverified), submitted 2019-12-03 08:57:35

Question:

Say I have two maps:

val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

I want to merge these maps by key, applying some function to collect the values. In this particular case I want to collect them into a Seq, giving:

val c = Map(1 -> Seq("one", "un"), 2 -> Seq("two", "deux"), 3 -> Seq("three", "trois"))

It feels like there should be a nice, idiomatic way of doing this. Any suggestions? I'm happy if the solution involves Scalaz.

Answer 1:

scala.collection.immutable.IntMap has an intersectionWith method that does precisely what you want (I believe):

import scala.collection.immutable.IntMap

val a = IntMap(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four")
val b = IntMap(1 -> "un", 2 -> "deux", 3 -> "trois")

val merged = a.intersectionWith(b, (_, av, bv: String) => Seq(av, bv))

This gives you IntMap(1 -> List(one, un), 2 -> List(two, deux), 3 -> List(three, trois)). Note that it correctly ignores the key that only occurs in a.

As a side note: I've often found myself wanting the unionWith, intersectionWith, etc. functions from Haskell's Data.Map in Scala. I don't think there's any principled reason that they should only be available on IntMap, instead of in the base collection.Map trait.
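To illustrate the side note, here is a minimal sketch of what an intersectionWith for ordinary maps could look like. The helper name and signature are assumptions modelled on the IntMap method above; nothing like it exists on the base Map trait.

```scala
// Hypothetical helper, not part of the standard library:
// an intersectionWith that works on any immutable Map.
def intersectionWith[K, A, B, C](m1: Map[K, A], m2: Map[K, B])(f: (K, A, B) => C): Map[K, C] =
  m1.flatMap { case (k, a) =>
    // Keep the key only when m2 contains it as well.
    m2.get(k).map(b => k -> f(k, a, b))
  }

val a = Map(1 -> "one", 2 -> "two", 3 -> "three", 4 -> "four")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

// Key 4 occurs only in a, so it is dropped from the result.
val merged = intersectionWith(a, b)((_, av, bv) => Seq(av, bv))
```

Putting the function in a second parameter list lets the compiler infer the lambda's parameter types from the two maps, so no type annotation is needed on bv.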



Answer 2:

val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

val c = a.toList ++ b.toList
val d = c.groupBy(_._1).map { case (k, v) => k -> v.map(_._2).toSeq }
// res0: scala.collection.immutable.Map[Int,Seq[java.lang.String]] =
//   Map((2,List(two, deux)), (1,List(one, un)), (3,List(three, trois)))


Answer 3:

Scalaz adds a method |+| for any type A for which a Semigroup[A] is available.

If you mapped your Maps so that each value was a single-element sequence, then you could use this quite simply:

scala> a.mapValues(Seq(_)) |+| b.mapValues(Seq(_))
res3: scala.collection.immutable.Map[Int,Seq[java.lang.String]] = Map(1 -> List(one, un), 2 -> List(two, deux), 3 -> List(three, trois))
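For readers without Scalaz on the classpath, the same idea can be sketched in plain Scala: keep every key from both maps and combine the values wherever a key occurs in both. The helper name unionWith is an assumption for illustration; this is not how Scalaz implements |+|, only a sketch of the behaviour described above.

```scala
// Hypothetical helper (not a Scalaz or standard-library method):
// union two maps, combining values when a key occurs in both.
def unionWith[K, V](m1: Map[K, V], m2: Map[K, V])(combine: (V, V) => V): Map[K, V] =
  m2.foldLeft(m1) { case (acc, (k, v2)) =>
    // If the key is already present, combine old and new; otherwise just add it.
    acc.updated(k, acc.get(k).map(v1 => combine(v1, v2)).getOrElse(v2))
  }

val a = Map(1 -> "one", 2 -> "two", 3 -> "three")
val b = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

// Wrap each value in a one-element Seq, then concatenate on key collisions.
val wrappedA = a.map { case (k, v) => k -> Seq(v) }
val wrappedB = b.map { case (k, v) => k -> Seq(v) }
val c = unionWith(wrappedA, wrappedB)(_ ++ _)
```

Unlike an intersection, this keeps keys that occur in only one of the maps, with a single-element Seq as the value.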


Answer 4:

So I wasn't quite happy with either solution (I want to build a new type, so semigroup doesn't really feel appropriate, and Infinity's solution seemed quite complex), so I've gone with this for the moment. I'd be happy to see it improved:

def merge[A,B,C](a : Map[A,B], b : Map[A,B])(c : (B,B) => C) = {   for (     key 

I wanted the behaviour of returning nothing when a key wasn't present in either map (which differs from other solutions), but a way of specifying this would be nice.
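One possible way of specifying what happens to missing keys is to hand the combining function an Option for each side and let it decide whether the key survives. The sketch below is not the code from this answer (which is truncated above); mergeWith and its signature are assumptions made for illustration.

```scala
// Hypothetical variant of merge: the combining function sees an Option for
// each side and returns None to drop the key from the result.
def mergeWith[A, B, C](a: Map[A, B], b: Map[A, B])(c: (Option[B], Option[B]) => Option[C]): Map[A, C] =
  (a.keySet ++ b.keySet).iterator.flatMap { key =>
    c(a.get(key), b.get(key)).map(value => key -> value)
  }.toMap

val a = Map(1 -> "one", 2 -> "two", 4 -> "four")
val b = Map(1 -> "un", 2 -> "deux", 5 -> "cinq")

// Keep only the keys that are present in both maps.
val both = mergeWith(a, b) {
  case (Some(x), Some(y)) => Some(Seq(x, y))
  case _                  => None
}

// Keep every key, collecting whichever values exist.
val anySide = mergeWith(a, b)((x, y) => Some(x.toList ++ y.toList))
```

The caller chooses between intersection-style and union-style behaviour without the helper itself changing.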



Answer 5:

Here is my first approach before looking for the other solutions:

for (x  Seq (a.get (x._1), b.get (x._1)).flatten 

To avoid elements which happen to exist only in a or b, a filter is handy:

(for (x  Seq (a.get (x._1), b.get (x._1)).flatten).filter (_._2.size == 2) 

Flatten is needed, because b.get (x._1) returns an Option. To make flatten work, the first element has to be an option too, so we can't just use x._2 here.
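A small sketch of the flatten step described above. The sample maps are made up for illustration, and the generator part of the for-expression (x <- a, yielding x._1 -> ...) is an assumed reconstruction, since that part of the snippets above did not survive; the original may have iterated differently.

```scala
val a = Map(1 -> "one", 2 -> "two", 4 -> "four")
val b = Map(1 -> "un", 2 -> "deux")

// get returns an Option, so Seq(a.get(k), b.get(k)) is a Seq[Option[String]];
// flatten drops the None entries and unwraps the Some entries.
val present = Seq(a.get(1), b.get(1)).flatten // key 1 is in both maps
val partial = Seq(a.get(4), b.get(4)).flatten // key 4 is only in a

// Assumed reconstruction: iterate over a, pair each key with the flattened
// lookups, then keep only keys for which both lookups succeeded.
val merged =
  (for (x <- a) yield x._1 -> Seq(a.get(x._1), b.get(x._1)).flatten)
    .filter(_._2.size == 2)
```

Using a.get(x._1) instead of x._2 keeps both elements of the Seq as Options, which is what allows flatten to treat them uniformly.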

For sequences, it works too:

scala> val b = Map (1 -> Seq(1, 11, 111), 2 -> Seq(2, 22), 3 -> Seq(33, 333), 5 -> Seq(55, 5, 5555))
b: scala.collection.immutable.Map[Int,Seq[Int]] = Map(1 -> List(1, 11, 111), 2 -> List(2, 22), 3 -> List(33, 333), 5 -> List(55, 5, 5555))

scala> val a = Map (1 -> Seq(1, 101), 2 -> Seq(2, 212, 222), 3 -> Seq (3, 3443), 4 -> (44, 4, 41214))
a: scala.collection.immutable.Map[Int,ScalaObject with Equals] = Map(1 -> List(1, 101), 2 -> List(2, 212, 222), 3 -> List(3, 3443), 4 -> (44,4,41214))

scala> (for (x  Seq (a.get (x._1), b.get (x._1)).flatten).filter (_._2.size == 2)
res85: scala.collection.immutable.Map[Int,Seq[ScalaObject with Equals]] = Map(1 -> List(List(1, 101), List(1, 11, 111)), 2 -> List(List(2, 212, 222), List(2, 22)), 3 -> List(List(3, 3443), List(33, 333)))


Answer 6:

val en = Map(1 -> "one", 2 -> "two", 3 -> "three")
val fr = Map(1 -> "un", 2 -> "deux", 3 -> "trois")

def innerJoin[K, A, B](m1: Map[K, A], m2: Map[K, B]): Map[K, (A, B)] = {
  m1.flatMap { case (k, a) =>
    // Emit a one-entry map when m2 also has the key, otherwise nothing.
    m2.get(k).map(b => Map((k, (a, b)))).getOrElse(Map.empty[K, (A, B)])
  }
}

innerJoin(en, fr) // Map(1 -> ("one", "un"), 2 -> ("two", "deux"), 3 -> ("three", "trois")): Map[Int, (String, String)]

