Question
I want to know how to wrap a function (or function definition) such that it becomes seemingly agnostic to whether the parameters passed to it are mutable or immutable -- but if any parameter it is given is mutable, it should dereference that parameter every time it is called, to get the current value.
I could write a function that requires each parameter to be mutable storage that it then dereferences each time it is called. But there's a performance hit (very, very small, I know!) to dereferencing mutable storage in Clojure. In my specific use case these actually are bottleneck operations, small enough for dereferencing to make a difference, and repeated hundreds of thousands to millions of times (more on my use case below, but for now let's just assume this is significant). So I don't want to use mutable data in the cases where I don't need it to be mutable. It would be nice if, from the outside, the code appeared not to care whether the initial parameters were mutable or immutable. Let's say, for simplicity's sake, that the function is the following:
(defn foo [a b]
  (fn [x] (* a b x)))
(def instance (foo 3 4))
(instance 5) ; <- 60
(instance 8) ; <- 96
I would like a foo that is smart enough to do this:
(def a (agent 3))
(def b (agent 4))
(foo a b) ; <- (fn [x] (* (deref a) (deref b) x))
(foo a 4) ; <- (fn [x] (* (deref a) 4 x))
(foo 3 4) ; <- (fn [x] (* 3 4 x))
However, my first attempt used quoting and unquoting (natural, right? It's what macros use!), and it failed with a nasty error about embedding objects in code (a very similar issue, with a different use case, is discussed here). My next attempt caused a strange and massive slowdown at run time.
Does anyone know of a good way to do this?
Background
I am working on some machine learning algorithms. In a typical scenario, the user would initialize an algorithm with a certain set of parameters, and then run it on a set of data. But sometimes a user or user-defined code might want to modify the parameters while the algorithm is running, either based on time (e.g., simulated annealing) or based on some other criteria determined while watching the algorithm's ongoing performance. My algorithms are parallelized, and each thread would need to see the change. Restarting the algorithm whenever I change the parameters would defeat the purpose.
Answer 1:
It seems I answered this question in the related question with the last maybe-deref-expr example there. That code is repeated in Timothy Dean's own answer here, along with some nice macro sugar he wrote for it, so definitely check out his answer too. Here's a slightly modified version of maybe-deref-expr, perhaps a bit easier to read.
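(require 'clojure.walk)

(defn maybe-deref-expr
  [values params body]
  (let [valmap (zipmap params values)
        deref? #(instance? clojure.lang.IDeref %)
        ;; wrap every parameter symbol bound to a reference in (deref ...)
        body*  (clojure.walk/postwalk
                 #(if (deref? (valmap %)) `(deref ~%) %)
                 body)
        ;; compile a function of the same parameters around the rewritten body
        gen    (eval `(fn ~params ~body*))]
    (apply gen values)))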
With Timothy Dean's macro sugar
(defmacro defn-ref-agnostic
  [name params body]
  `(defn ~name
     ~params
     (maybe-deref-expr ~params '~params '~body)))
If we do
(defn-ref-agnostic add
  [a b]
  (+ a b))
then we get a slow add (it takes the eval hit on every call) that automatically dereferences its arguments when needed:
(add 40 2) ;=> 42
(add (ref 40) (atom 2)) ;=> 42
But, the use case is not to define functions themselves, but function generators that close over other parameters.
(defn-ref-agnostic add-to
  [a b]
  (fn [x] (+ a b x)))
Now if we do
(def baz1 (add-to 40 2))
(def my-ref (ref 40))
(def my-atom (atom 2))
(def baz2 (add-to my-ref my-atom))
Then we take the eval hit when baz1 and baz2 are defined, and not when they are subsequently used. The code produced for the definition of baz1 and baz2, and thus their performance when used, is exactly as if we had done
(def baz1 (fn [x] (+ 40 2 x)))
(def baz2 (fn [x] (+ @my-ref @my-atom x)))
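One way to check what the rewriting step produces, before paying for eval, is to return the rewritten form as data instead of compiling it. The following is only a sketch: rewritten-form is a hypothetical helper, not part of the original answer, and it simply repeats the same postwalk logic.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical helper: same rewriting as maybe-deref-expr, but it returns
;; the generated form instead of passing it to eval.
(defn rewritten-form
  [values params body]
  (let [valmap (zipmap params values)
        deref? #(instance? clojure.lang.IDeref %)]
    (list 'fn params
          (walk/postwalk
            #(if (deref? (valmap %)) (list 'deref %) %)
            body))))

(rewritten-form [(atom 3) 4] '[a b] '(fn [x] (* a b x)))
;=> (fn [a b] (fn [x] (* (deref a) b x)))
```

Note that the walk is purely syntactic: if the body rebinds one of the parameter names (for example in an inner fn argument vector or a let), that binding symbol would be wrapped in deref as well, so parameter names should not be shadowed inside the body.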
That having been said...
The original "Without Eval" solution, if it fits your use case, is what I would prefer:
(defn foo [a b]
  (let [[fa fb] (map #(if (instance? clojure.lang.IDeref %)
                        deref
                        identity)
                     [a b])]
    (fn [x] (+ (fa a) (fb b) x))))
This introduces an extra level of indirection only at the low, low cost of at most two extra identity function calls. It is a lot simpler than the above and can be very flexible. The main difference between this and the answer to the other related question is that the test/branching has been moved outside the returned function, which now closes over the results.
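The same idea can be sketched for any number of parameters, which also shows that a change to a reference is picked up by a function that has already been created. The names accessor-for and scale-by are hypothetical, chosen only for this illustration, and the multiplication mirrors the foo from the question.

```clojure
;; Hypothetical helper: decide once, up front, how each parameter is read.
(defn accessor-for [v]
  (if (instance? clojure.lang.IDeref v) deref identity))

;; Returns a function that multiplies x by the current value of every parameter.
(defn scale-by [& params]
  (let [accessors (mapv accessor-for params)]
    (fn [x]
      (reduce * x (map (fn [f p] (f p)) accessors params)))))

(def rate (atom 3))
(def scaled (scale-by rate 4))

(scaled 5)        ;=> 60
(reset! rate 10)  ; change the parameter while scaled is still in use
(scaled 5)        ;=> 200
```

The variadic version does more work per call (a map and a reduce) than the fixed two-argument foo above, so for a hot inner loop the fixed-arity form is probably the better fit.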
Answer 2:
With Eval
To get a foo smart enough to do what you want, you can use run-time expression modification:
(require 'clojure.walk)

(defn maybe-deref-expr
  [vals params body]
  (let [;; map each parameter symbol to itself, or to (deref sym) for references
        smap  (zipmap params
                      (map (fn [val sym]
                             (if (instance? clojure.lang.IDeref val)
                               (list 'deref sym)
                               sym))
                           vals
                           params))
        body* (clojure.walk/postwalk-replace smap body)
        gen   (eval (list 'fn params body*))]
    (apply gen vals)))
(defmacro defn-ref-agnostic
  [name params body]
  `(defn ~name
     ~params
     (maybe-deref-expr ~params '~params '~body)))
(defn-ref-agnostic foo
  [a b]
  (fn [x] (* a b x)))
(defn foo-baseline
  [a b]
  (fn [x] (* a b x)))
(def f (foo 3 4))
(def g (foo-baseline 3 4))
As far as I can tell on my machine, f and g have identical performance characteristics.
Without Eval
This appears to be working fairly efficiently:
(defn aref? [x] (instance? clojure.lang.ARef x))
(defn foo-wraps [& args]
  (map (fn [i] (if (aref? i)
                 #(deref i)
                 #(identity i)))
       args))
(defn foo [a b]
  (let [[a b] (foo-wraps a b)]
    (fn [x] (* (a) (b) x))))
I guess this might be an example of HotSpot coming to the rescue? If I don't pass any ARefs, then the performance is very close to the original formulation after only a handful of runs:
(def a (ref 3))
(def b (ref 4))
(def f (foo 3 4))
(def g (foo a b))
(defn h [x] (* 3 4 x))
user=> (time (dotimes [n 10000] (f n)))
"Elapsed time: 7.38648 msecs"
"Elapsed time: 3.45071 msecs"
"Elapsed time: 3.087424 msecs"
"Elapsed time: 2.836596 msecs"
user=> (time (dotimes [n 10000] (g n)))
"Elapsed time: 13.076024 msecs"
"Elapsed time: 4.235882 msecs"
"Elapsed time: 4.517663 msecs"
"Elapsed time: 3.940946 msecs"
user=> (time (dotimes [n 10000] (h n)))
"Elapsed time: 4.056389 msecs"
"Elapsed time: 2.499129 msecs"
"Elapsed time: 3.064487 msecs"
"Elapsed time: 2.631167 msecs"
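The first timing in each group is noticeably slower than the later ones, which is consistent with JIT warm-up. A minimal sketch of a helper that runs a warm-up pass before measuring is shown below; mean-ns is a hypothetical name, and for serious measurements a dedicated benchmarking library would likely be more reliable than this.

```clojure
;; Hypothetical helper: average nanoseconds per call of f over n calls,
;; measured after an untimed warm-up pass of the same length.
(defn mean-ns
  [f n]
  (dotimes [i n] (f i))                       ; warm-up, not timed
  (let [start (System/nanoTime)]
    (dotimes [i n] (f i))                     ; timed pass
    (/ (double (- (System/nanoTime) start)) n)))

;; Example call with a plain multiplying function like h above.
(mean-ns (fn [x] (* 3 4 x)) 10000)
```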
Source: https://stackoverflow.com/questions/23032332/using-maybe-arefs-in-higher-order-functions