How to remove elements from a vector in a fast way in Clojure?

Posted by 百般思念 on 2020-01-15 13:21:16

Question


I'm trying to remove elements from a Clojure vector. Note that I'm calling Clojure's operations from Kotlin:

val set = PersistentHashSet.create("foo")
val vec = PersistentVector.create("foo", "bar")
val seq = clojure.`core$remove`.invokeStatic(set, vec) as ISeq
val resultVec = clojure.`core$vec`.invokeStatic(seq) as PersistentVector

This is the equivalent of the following Clojure code:

(remove #{"foo"} ["foo" "bar"])

The code works fine, but I've noticed that creating a vector from the seq is extremely slow. I wrote a benchmark, and these were the results:

| Item count | Remove (ms) | Remove + convert back to vector (ms) |
|------------|-------------|--------------------------------------|
| 1000       | 51          | 1355                                 |
| 10000      | 71          | 5123                                 |
Do you know how I can convert the seq resulting from the remove operation back to a vector without the harsh performance penalty?

If it is not possible is there an alternative way to perform the remove operation?


Answer 1:


You could try the complementary operation to remove that returns a vector:

(filterv (complement #{"foo"}) 
         ["foo" "bar"])

Note the use of filterv. The v indicates that it builds its result as a vector from the start and returns a vector, so no conversion is required. It uses a transient vector behind the scenes, so it should be pretty fast.

I'm negating the predicate using complement so I can use filterv, since there is no removev. remove is just defined as the complement of filter anyway though, so it's basically what you were already doing, just strict.
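
As a minimal sketch of the same idea, the snippet below wraps the filterv call in a helper and also shows a transducer-based variant built with into. The names removev and removev-xf are hypothetical (clojure.core has no removev); both variants build the vector eagerly, without an intermediate lazy seq.

```clojure
;; Hypothetical helper: clojure.core has filterv but no removev.
(defn removev
  "Eagerly returns a vector of the items in coll for which (pred item) is falsey."
  [pred coll]
  (filterv (complement pred) coll))

;; Alternative: the transducer arity of remove, poured into a vector with into.
(defn removev-xf
  [pred coll]
  (into [] (remove pred) coll))

(removev #{"foo"} ["foo" "bar"])     ;; => ["bar"]
(removev-xf #{"foo"} ["foo" "bar"])  ;; => ["bar"]
```

Which of the two is faster for a given input is something to measure rather than assume.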




Answer 2:


What you are trying to do fundamentally performs badly. Vectors are for fast indexed read/write, and O(1) access to the right end. To do anything else you must tear the vector apart and rebuild it again, an O(N) operation. If you need an operation like this to be efficient, you must use a different data structure.




Answer 3:


Why not a PersistentHashSet? Fast removal, though not ordered. I do vaguely recall Clojure also having a sorted set in case that’s needed.
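
A small sketch of what that could look like, assuming the data has no duplicates and that insertion order does not need to be preserved (sets keep neither). The var names are made up for illustration.

```clojure
;; disj removes an element from a hash set without scanning the collection.
(def names #{"foo" "bar" "baz"})
(def without-foo (disj names "foo"))   ;; a set containing "bar" and "baz"

;; sorted-set keeps its elements in sorted order and also supports disj.
(def sorted-nums (sorted-set 3 1 2))
(def without-two (disj sorted-nums 2))
(seq without-two)                      ;; => (1 3)
```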




Answer 4:


You have made the error of treating the lazy result of remove as equivalent to the concrete result of converting back to a vector. remove returns a lazy sequence, so timing it on its own measures only the creation of that sequence, not the filtering. Compare the lazy result of (remove ...) with the realized result forced by (count (remove ...)). You will see that forcing the sequence with count is slightly slower than just doing (vec (remove ...)). Also, for real speed-critical applications, there is nothing like using a native Java ArrayList:

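(ns tst.demo.core
  (:require
    [criterium.core :as crit])
  (:import [java.util ArrayList]))

(def N 1000)
(def tgt-item (/ N 2))               ; the single item to remove

(def pred-set #{(long tgt-item)})    ; a set used as the predicate
(def data-vec (vec (range N)))

(def data-al (ArrayList. data-vec))  ; mutable copy of the data
(def tgt-items (ArrayList. [tgt-item]))

;; Only creates the lazy seq; no element is filtered yet.
(println :lazy)
(crit/quick-bench
  (remove pred-set data-vec))

;; count forces the whole lazy seq to be realized.
(println :lazy-count)
(crit/quick-bench
  (count (remove pred-set data-vec)))

(println :vec)
(crit/quick-bench
  (vec (remove pred-set data-vec)))

;; NOTE: .removeAll mutates data-al in place, so after the first evaluation
;; there is nothing left to remove and later evaluations only time the scan.
(println :ArrayList)
(crit/quick-bench
  (let [changed? (.removeAll data-al tgt-items)]
    data-al))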

with results:

:lazy           Evaluation count : 35819946     time mean :    10.856 ns 
:lazy-count     Evaluation count :     8496     time mean : 69941.171 ns 
:vec            Evaluation count :     9492     time mean : 62965.632 ns 
:ArrayList      Evaluation count :   167490     time mean :  3594.586 ns
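
The gap between the :lazy and :lazy-count timings comes from laziness. As a rough illustration (not part of the benchmark above, and with var names chosen only for this sketch), realized? shows that remove does no filtering until something consumes the sequence:

```clojure
(def lazy-result (remove #{2} [1 2 3]))

(realized? lazy-result)          ;; => false, nothing has been filtered yet
(def n (count lazy-result))      ;; count forces the whole sequence
(realized? lazy-result)          ;; => true
(def as-vec (vec lazy-result))   ;; => [1 3]
```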


Source: https://stackoverflow.com/questions/48608796/how-to-remove-elements-from-a-vector-in-a-fast-way-in-clojure
