Question
I'm trying to remove elements from a Clojure vector. Note that I'm calling Clojure's operations from Kotlin:
val set = PersistentHashSet.create("foo")
val vec = PersistentVector.create("foo", "bar")
val seq = clojure.`core$remove`.invokeStatic(set, vec) as ISeq
val resultVec = clojure.`core$vec`.invokeStatic(seq) as PersistentVector
This is the equivalent of the following Clojure code:
(remove #{"foo"} ["foo" "bar"])
The code works fine, but I've noticed that creating a vector from the seq is extremely slow. I've written a benchmark and these were the results:
| Item count | Remove (ms) | Remove with conversion back to vector (ms) |
|------------|-------------|--------------------------------------------|
| 1000       | 51          | 1355                                       |
| 10000      | 71          | 5123                                       |
Do you know how I can convert the seq resulting from the remove operation back to a vector without the harsh performance penalty?
If that is not possible, is there an alternative way to perform the remove operation?
Answer 1:
You could try the complementary operation to remove that returns a vector:
(filterv (complement #{"foo"}) ["foo" "bar"])
Note the use of filterv. The v indicates that it uses a vector from the start, and returns a vector, so no conversion is required. It uses a transient vector behind the scenes, so it should be pretty fast.
I'm negating the predicate using complement so I can use filterv, since there is no removev. remove is just defined as the complement of filter anyway though, so it's basically what you were already doing, just strict.
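If a removev-style helper reads better at the call site, one way to sketch it is with the transducer arity of remove and into, which also builds the result through a transient vector. The name removev below is hypothetical and not part of clojure.core; this is a minimal sketch, not a benchmarked replacement.

```clojure
;; Hypothetical helper (not in clojure.core): eager counterpart of remove
;; that returns a vector directly instead of a lazy seq.
(defn removev
  [pred coll]
  ;; (remove pred) with no collection returns a transducer;
  ;; into pours the surviving elements straight into a vector.
  (into [] (remove pred) coll))

(removev #{"foo"} ["foo" "bar"]) ;; => ["bar"]
```

Like filterv, this never creates an intermediate lazy seq, so there is no separate conversion step afterwards.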
Answer 2:
What you are trying to do fundamentally performs badly. Vectors are for fast indexed read/write, and O(1) access to the right end. To do anything else you must tear the vector apart and rebuild it again, an O(N) operation. If you need an operation like this to be efficient, you must use a different data structure.
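To make the "tear apart and rebuild" point concrete, here is a minimal sketch of removing a single element by index. The helper name remove-at is hypothetical. subvec itself is cheap, but into still has to copy every element after the removed position, which is the O(N) part.

```clojure
;; Hypothetical helper: returns v without the element at index i.
(defn remove-at
  [v i]
  ;; Take the prefix before i, then append everything after i one by one.
  (into (subvec v 0 i) (subvec v (inc i))))

(remove-at ["a" "b" "c" "d"] 1) ;; => ["a" "c" "d"]
```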
Answer 3:
Why not a PersistentHashSet? Fast removal, though not ordered. I do vaguely recall Clojure also having a sorted set in case that’s needed.
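A short sketch of what that could look like, assuming the elements are unique and the original insertion order does not matter. The var names are illustrative only.

```clojure
(require '[clojure.set :as set])

(def items #{"foo" "bar" "baz"})

;; Remove one element.
(disj items "foo")                    ;; => #{"bar" "baz"}

;; Remove several elements at once.
(set/difference items #{"foo" "bar"}) ;; => #{"baz"}

;; sorted-set keeps elements in sorted order (not insertion order).
(def sorted-items (sorted-set 3 1 2))
(seq (disj sorted-items 2))           ;; => (1 3)
```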
Answer 4:
You have made the error of treating the lazy result of remove as equivalent to the concrete result of converting back to a vector. Compare the lazy result of (remove ...) with the concrete result implied by (count (remove ...)). You will see that the count version is slightly slower than just doing (vec (remove ...)). Also, for real speed-critical applications, there is nothing like using a native Java ArrayList:
(ns tst.demo.core
  (:require
    [criterium.core :as crit])
  (:import [java.util ArrayList]))

(def N 1000)
(def tgt-item (/ N 2))                    ; the single element to remove
(def pred-set #{(long tgt-item)})         ; set used as the remove predicate
(def data-vec (vec (range N)))
(def data-al (ArrayList. data-vec))
(def tgt-items (ArrayList. [tgt-item]))

(println :lazy)
(crit/quick-bench
  (remove pred-set data-vec))             ; only creates the lazy seq

(println :lazy-count)
(crit/quick-bench
  (count (remove pred-set data-vec)))     ; forces the whole seq

(println :vec)
(crit/quick-bench
  (vec (remove pred-set data-vec)))

(println :ArrayList)
; removeAll mutates data-al in place, so only the first evaluation
; actually removes an element; later evaluations just scan the list.
(crit/quick-bench
  (let [changed? (.removeAll data-al tgt-items)]
    data-al))
With results:
:lazy        Evaluation count : 35819946   time mean : 10.856 ns
:lazy-count  Evaluation count : 8496       time mean : 69941.171 ns
:vec         Evaluation count : 9492       time mean : 62965.632 ns
:ArrayList   Evaluation count : 167490     time mean : 3594.586 ns
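The tiny :lazy figure is consistent with remove doing no filtering until something consumes the seq. A minimal REPL sketch of that behavior, using realized? to inspect the lazy seq (the var names here are illustrative, not from the benchmark above):

```clojure
(def sample-vec (vec (range 1000)))
(def sample-pred #{500})

;; remove returns immediately with an unrealized lazy seq.
(def lazy-result (remove sample-pred sample-vec))
(realized? lazy-result) ;; => false

;; count (like vec) walks the whole seq, which is where the work happens.
(count lazy-result)     ;; => 999
(realized? lazy-result) ;; => true
```

So a benchmark that times only the call to remove measures the cost of creating the lazy seq, not the cost of removing elements.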
Source: https://stackoverflow.com/questions/48608796/how-to-remove-elements-from-a-vector-in-a-fast-way-in-clojure