Better alternative to pmap in Clojure for parallelizing moderately inexpensive functions over big data?

离开以前 2020-12-24 12:17

Using clojure I have a very large amount of data in a sequence and I want to process it in parallel, with a relatively small number of cores (4 to 8).

The easy approach is to use pmap, but its per-element coordination overhead outweighs the work when the function being mapped is only moderately inexpensive.

4 answers
  • 2020-12-24 12:33

    Sadly not a valid answer yet, but something to watch for in the future is Rich's work with the fork/join library coming in Java 7. If you look at his Par branch on github he's done some work with it, and last I had seen, the early returns were amazing.

    Example of Rich trying it out.

    http://paste.lisp.org/display/84027

  • 2020-12-24 12:40

    This question: how-to-efficiently-apply-a-medium-weight-function-in-parallel also addresses this problem in a very similar context.

    The current best answer there is to use partition to break the input into chunks, pmap a plain map of the function over each chunk, and then recombine the results, map-reduce style. That way each parallel task carries enough work to amortize pmap's per-task overhead.
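    A minimal sketch of that chunked approach. The names `chunked-pmap`, `f`, `n`, and `xs` are illustrative, not from the linked answer:

    ```clojure
    (defn chunked-pmap
      "Applies f to xs in parallel, giving each pmap task a chunk of n
       elements so the per-task overhead is amortized over n calls."
      [f n xs]
      (->> xs
           (partition-all n)            ; break input into chunks of size n
           (pmap #(doall (map f %)))    ; one parallel task per chunk; doall
                                        ; forces the work inside the task
           (apply concat)))             ; recombine chunk results in order

    ;; usage: tune n so each chunk represents a few milliseconds of work
    ;; (chunked-pmap expensive-fn 512 huge-seq)
    ```

    Note the `doall`: without it, `map` would return a lazy seq and the actual work would happen later on the consuming thread, defeating the parallelism.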

  • 2020-12-24 12:53

    The fork/join work mentioned in earlier answers on this and similar threads eventually bore fruit as the reducers library, which is probably worth a look.
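    For illustration, a small example of the reducers API (clojure.core.reducers, available since Clojure 1.5); `r/fold` partitions a vector (default chunk size 512) and reduces the chunks in parallel on the fork/join pool:

    ```clojure
    (require '[clojure.core.reducers :as r])

    ;; Parallel sum of squares: r/map builds a foldable transformation,
    ;; and r/fold reduces chunks in parallel, combining partial sums with +.
    ;; fold needs a foldable collection, hence the vector.
    (r/fold + (r/map #(* % %) (vec (range 1000))))
    ;; => 332833500
    ```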

  • 2020-12-24 12:59

    You can implement some sort of map/reduce by hand. Also take a look at the swarmiji framework.

    "A distributed computing system that helps writing and running Clojure code in parallel - across cores and processors"
