Tuples duplicate elimination from a list

人盡茶涼 posted on 2019-12-01 09:14:33

You could use a Set, but you need to wrap each tuple in your own class whose equals and hashCode ignore the order of the elements.

case class MyTuple[A](t: (A, A)) {
  override def hashCode = t._1.hashCode + t._2.hashCode
  override def equals(other: Any) = other match {
    case MyTuple((a, b)) => a.equals(t._1) && b.equals(t._2) || a.equals(t._2) && b.equals(t._1)
    case _ => false
  }
}

val input = List(("A","B"),
                 ("C","B"),
                 ("B","A"))

val output = input.map(MyTuple.apply).toSet.toList.map((mt: MyTuple[String]) => mt.t)
println(output)

Edit: Travis's answer made me realise that there is a nicer way to do this: write a distinctBy method that works analogously to sortBy.

implicit class extList[T](list: List[T]) {
  def distinctBy[U](f: T => U): List[T] = {
    var set = Set.empty[U]
    var result = List.empty[T]
    for(t <- list) {
      val u = f(t)
      if(!set(u)) {
        result ::= t
        set += u
      }
    }
    result.reverse
  }
}

println(input.distinctBy { case (a, b) => Set((a,b), (b,a)) })
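A simpler key should also work here: Set(a, b) already ignores element order, so it can stand in for Set((a,b), (b,a)). Below is a minimal sketch of the same idea written as a standalone function with an immutable fold. The name distinctByKey is made up for illustration and is not a library method.

```scala
// Hypothetical helper: keeps the first element seen for each key, in order.
def distinctByKey[T, U](xs: List[T])(key: T => U): List[T] = {
  val (kept, _) = xs.foldLeft((List.empty[T], Set.empty[U])) {
    case ((acc, seen), x) =>
      val k = key(x)
      // Prepend for O(1) insertion; the list is reversed once at the end.
      if (seen(k)) (acc, seen) else (x :: acc, seen + k)
  }
  kept.reverse
}

val pairs = List(("A", "B"), ("C", "B"), ("B", "A"))
// Set(a, b) is order-insensitive, so ("A","B") and ("B","A") share a key.
val unique = distinctByKey(pairs) { case (a, b) => Set(a, b) }
// unique == List((A,B), (C,B))
```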

We can use a Set to keep track of elements that we have seen already, while using filter to eliminate duplicates:

def removeDuplicates[T](l: List[(T, T)]) = {
  val set = scala.collection.mutable.Set[(T, T)]()
  l.filter { case t@(x, y) =>
    if (set(t)) false else {
      set += t
      set += ((y, x))
      true
    }
  }
}

When we find a tuple we haven't seen before, we put both it and its swapped counterpart into the set.
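As a quick check of that behaviour, the sketch below repeats the function (with an explicit return type added) so that it runs on its own, then applies it to an invented list containing both an exact duplicate and swapped pairs.

```scala
import scala.collection.mutable

def removeDuplicates[T](l: List[(T, T)]): List[(T, T)] = {
  val seen = mutable.Set.empty[(T, T)]
  l.filter { case t @ (x, y) =>
    if (seen(t)) false
    else {
      seen += t        // remember the tuple itself
      seen += ((y, x)) // and its swapped form
      true
    }
  }
}

// Sample data is invented for illustration.
val sample = List((1, 2), (2, 1), (3, 4), (1, 2), (4, 3))
val kept = removeDuplicates(sample)
// kept == List((1,2), (3,4)): first occurrences survive, in original order
```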

Along the same lines as SpiderPig's answer, here's a solution that makes no use of sets (since going through a set doesn't preserve the order of the original list, which could be an annoyance):

case class MyPimpedTuple(t: Tuple2[String, String]) {
  override def hashCode = t._1.hashCode + t._2.hashCode
  override def equals(other: Any) = other match {
      case MyPimpedTuple((a, b)) => a.equals(t._1) && b.equals(t._2) || a.equals(t._2) && b.equals(t._1)
      case _ => false
  }
}

val input = List(("A","B"), ("C","B"), ("B","A"))

input.map(MyPimpedTuple(_)).distinct.map(_.t)

Example

val input = List(("A","B"), ("C","B"),("B","A"))
//> input: List[(String, String)] = List((A,B), (C,B), (B,A))

val distinctTuples = input.map(MyPimpedTuple(_)).distinct.map(_.t)
//> distinctTuples: List[(String, String)] = List((A,B), (C,B))

For the sake of completeness, it's possible to do this very simply in a purely functional way with a fold (manually defining equality makes me nervous and I'm not sure mutability buys you much here):

def distinctPairs[A](xs: List[(A, A)]) = xs.foldLeft(List.empty[(A, A)]) {
  case (acc, (a, b)) if acc.contains((a, b)) || acc.contains((b, a)) => acc
  case (acc, p) => acc :+ p
}

This isn't very efficient, since it's searching the list twice for each item (and appending to the list), but that's not too hard to fix:

def distinctPairs[A](xs: List[(A, A)]) = xs.foldLeft(
  (List.empty[(A, A)], Set.empty[(A, A)])
) {
  case (current @ (_, seen), p) if seen(p) => current
  case ((acc, seen), p @ (a, b)) => (p :: acc, seen ++ Set((a, b), (b, a)))
}._1.reverse

Both of these implementations maintain order.

Consider also relying on the uniqueness of keys in a Map, where each key is the set of a tuple's elements:

def uniq[A](a: List[(A,A)]) = a.map( t => Set(t._1,t._2) -> t ).toMap.values

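Not the most efficient, yet simple enough; valid for small collections. Note that for swapped duplicates it keeps the last tuple seen rather than the first, and the result is an Iterable whose order is not guaranteed.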
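If the order of first appearance matters, one possible variation is to collect into an insertion-ordered map instead. This is only a sketch: uniqOrdered is a hypothetical name, and it assumes that keeping the first tuple for each unordered pair is what you want.

```scala
import scala.collection.mutable.LinkedHashMap

// Hypothetical variant: first occurrence wins, insertion order is kept.
def uniqOrdered[A](xs: List[(A, A)]): List[(A, A)] = {
  val firstSeen = LinkedHashMap.empty[Set[A], (A, A)]
  // getOrElseUpdate stores the tuple only if its key is not present yet.
  for (p <- xs) firstSeen.getOrElseUpdate(Set(p._1, p._2), p)
  firstSeen.values.toList
}

val edges = List(("A", "B"), ("C", "B"), ("B", "A"))
val result = uniqOrdered(edges)
// result == List((A,B), (C,B))
```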

Yes, I would also suggest a set as the target data structure, because a set lookup can be more efficient than two for loops. (Sorry, I am a Clojure guy, and surely this is not the shortest version in Clojure...)

(def data `(("A" "B") ("B" "C") ("B" "A")))
;;(def data `(("A" "B") ("B" "C") ("B" "A") ("C" "D") ("C" "B") ("D" "F")))

(defn eliminator [source]
  (println "Crunching: " source)
  (loop [s source t '#{}]
    (if (empty? s) (reverse t) ;; end
      (if (contains? t (list (last (first s)) (first (first s)))) ;reverse is in set !
        (recur (rest s) t) ; next iteration
        (recur (rest s) (conj t (first s))))))) ;; add it