Haskell, Scala, Clojure, what to choose for high performance pattern matching and concurrency [closed]

心已入冬 submitted on 2019-12-04 07:35:27

Question


I recently started working with FP after reading a lot of blogs and posts about its advantages for concurrent execution and performance. My interest in FP is driven largely by the application I am developing: a state-based data injector into another subsystem, where timing is crucial (close to 2 million transactions per second). I have a couple of such subsystems which need to be tested. I am seriously considering using FP for its parallelism and want to take the correct approach. Many posts on SO discuss the advantages and disadvantages of Scala, Haskell and Clojure with respect to language constructs, libraries and JVM support. From a language point of view, I am OK with learning any language as long as it helps me achieve the result.

Certain posts favor Haskell for its pattern matching and simplicity of language, while JVM-based FP languages have a big advantage in being able to use existing Java libraries. Jane Street is a big OCaml supporter, but I am really not sure about developer support and help forums for OCaml.

If anybody has worked with handling such large data, please share your experience.


Answer 1:


Do you want fast or do you want easy?

If you want fast, you should use C++, even if you're using FP principles to aid in correctness. Since timing is crucial, the support for soft (and hard, if need be) real-time programming will be important. You can decide exactly how and when you have time to recover memory, and spend only as much time as you have on that task.

The three languages you've listed all tend to be roughly 2-3x slower than near-optimally hand-tuned C++, and even that holds only when they are used in a rather traditional imperative way. They all use garbage collection, which will introduce delays in your transactions at times you do not control.

Now, that said, it's a lot of work to get this running in bulletproof fashion with C++. Applying FP principles requires considerably more boilerplate (even in C++11), and most libraries are mutable by default. (Edit: Rust is becoming a good alternative, but it is beyond the scope of this answer to describe Rust in sufficient detail.)

Maybe you don't have the time and can afford to scale back on other specifications. If it is not timing but throughput that is crucial, for example, then you probably want Scala over Clojure (see the Computer Language Benchmarks Game, where Scala wins every benchmark as of this writing and has lower code size in almost every case). (Edit: CLBG is no longer helpful in this regard, though you may find archives supporting these statements on the Web Archive.) OCaml and Haskell should be chosen for other reasons: their benchmark scores are similar, but they differ in syntax, interoperability and so on.

As far as which system has the best concurrency support, Haskell, Clojure and Scala are all just fine while OCaml is a bit lacking.

This pretty much narrows it down to Haskell and Scala. Do you need to use Java libraries? Scala. Do you need to use C libraries? Probably Haskell. Do you need neither? Then you can choose either on the basis of which one you prefer stylistically without having to worry overly much that you've made your life vastly harder by choosing the wrong one.




Answer 2:


I've done this with Clojure, which proved pretty effective for the following reasons:

  • Being on the JVM is a huge advantage in terms of libraries. This effectively ruled out Haskell and OCaml for my purposes, as we needed easy access to the Java ecosystem and integration with JVM-based tools (Maven builds etc.)
  • You can drop into pure Java if you need to tightly optimise inner loops. We did this for some custom code processing large double[] arrays, but 99% of the time Clojure can get you the performance you need. See http://www.infoq.com/presentations/Why-Prismatic-Goes-Faster-With-Clojure for some examples of how to make Clojure go really fast (quite technical video, assumes some prior knowledge!). Once you start counting the ease of exploiting multiple cores, Clojure is very competitive on performance.
  • Clojure has very nice multi-core concurrency support. This proved extremely useful for managing concurrent tasks. See http://www.infoq.com/presentations/Value-Identity-State-Rich-Hickey
  • The REPL makes a very good environment for testing and exploratory work on data.
  • Clojure's sequences are lazy, which makes it suitable for handling larger-than-memory data sets (assuming you are careful not to force the whole data set into memory at once). There are also some nice libraries for this kind of work, most notably Storm and Aleph. Storm may be particularly interesting for you, as it's designed for distributed realtime processing of large numbers of events.
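
As a rough illustration of the "drop into pure Java" point above, here is a minimal sketch of the kind of inner-loop helper one might write over primitive double[] arrays. The class and method names (DoubleKernels, dot, scaleInPlace) are hypothetical and do not come from the answer; the actual code the author used is not shown in the post.

```java
// Hypothetical helper class; names are illustrative, not taken from the answer.
final class DoubleKernels {
    private DoubleKernels() {}

    /** Dot product over primitive arrays: no boxing and no allocation inside the loop. */
    static double dot(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Scales every element in place, so no new array is allocated. */
    static void scaleInPlace(double[] a, double factor) {
        for (int i = 0; i < a.length; i++) {
            a[i] *= factor;
        }
    }
}
```

Assuming such a class were made public, placed in a package on the classpath and imported, Clojure could call it through ordinary Java interop, for example (DoubleKernels/dot xs ys) with xs and ys created by double-array. Keeping the data in primitive arrays is what avoids boxing overhead in the hot loop.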

I can't speak with quite so much experience of the other languages, but my impression from some practical experience of Haskell and Scala is:

  • Haskell is great if you care about purity and strict functional programming with static types. The static typing can be a strong guarantee of correctness, so it might be suitable for highly algorithmic work. Personally, I find pure FP a little too rigid: there are many times when mutable state is useful, and I think Clojure strikes a slightly better balance here (by allowing controlled mutability through managed references).
  • Scala is a great language and shares with Clojure the advantages of being on the JVM. To me Scala is more like a "better Java" with functional features and a very impressive type system. It's less of a paradigm shift than Clojure. The downside is that the type system can get quite complex and confusing.

Overall, I think you could be happy with any of these. It will probably come down to how much you care about the JVM and your view on type systems.



Source: https://stackoverflow.com/questions/11607020/haskell-scala-clojure-what-to-choose-for-high-performance-pattern-matching-an
