How does concurrency using immutable/persistent types and data structures work?


I've read a lot about functional languages recently. Since those use only immutable structures, they claim that concurrency issues are greatly improved/solved. I'm having s


1 answer

不思量自难忘°

2020-12-23 18:26
Immutable values are well suited to several applications. Concurrent/parallel processing is just one of them, though one that has become more important recently. What follows is only the most basic digest of experience and of many books and talks on the subject. You may eventually need to dig into some of them.

The main example you show here is about managing global state, so it cannot be done in a purely immutable way. Even so, there are very good reasons to use immutable data structures here. Some of them, off the top of my head:
- try/catch behaves much better: you never modify a shared object in place, so a failure cannot leave it half-modified, and the last consistent state is kept automatically
- changing state is reduced to multicore-safe compare-and-swap (CAS) operations on a very limited set of global variables (ideally one); since no locks are held, lock-based deadlocks cannot occur
- data structures can be passed around freely without defensive copying, which is quite often the cause of mysterious bugs when forgotten (frequently defensive copies are made in both the calling and the called function, because developers lean towards "better safe than sorry" after a couple of debugging sessions)
- much easier unit testing, because many functions operating on immutable values are side-effect free
- usually easier serialization and more transparent comparison semantics
- much easier debugging, and taking (logging) a snapshot of the current system state is possible even asynchronously
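
To make the try/catch point concrete, here is a minimal Python sketch. The names `reserve` and `stock` are made up for illustration, and `MappingProxyType` is only a read-only view, so this assumes nobody else keeps a handle on the underlying dict. The update fails on the second item, yet the value everyone else sees is untouched:

```python
from types import MappingProxyType

def reserve(stock, order):
    """Return a NEW stock mapping with every item in `order` deducted."""
    updated = dict(stock)  # private working copy, invisible to everyone else
    for item, qty in order.items():
        if updated.get(item, 0) < qty:
            raise ValueError(f"not enough {item}")
        updated[item] -= qty
    return MappingProxyType(updated)

stock = MappingProxyType({"apple": 5, "pear": 1})
try:
    # "apple" is deducted in the working copy, then "pear" fails.
    stock = reserve(stock, {"apple": 2, "pear": 3})
except ValueError:
    pass  # `stock` still refers to the last consistent value

print(dict(stock))
```

The half-finished working copy is simply dropped; there is nothing to roll back.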
Back to your question though.

In the most trivial case, the global state is modelled as one mutable reference at the top that holds an immutable data structure.

The reference is updated only by an atomic CAS operation.

The immutable data structure is transformed by side-effect-free functions, and when all transformations are done the reference is swapped atomically.

If two threads/cores simultaneously try to swap in new values derived from the same old one, the one that gets there first wins; the other fails (CAS semantics) and has to repeat the operation. Depending on the transformation, that means either applying its update to the now-current value or redoing the whole transformation from the beginning. This may seem wasteful, but the assumption here is that redoing some work is often cheaper than the permanent overhead of locking/synchronization.
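
The retry loop described above can be sketched roughly like this in Python. `Atom`, `compare_and_set`, `swap` and `add_visit` are hypothetical names, not a real library. Python's standard library has no hardware CAS, so this sketch emulates it with a tiny internal lock held only for the pointer comparison; on the JVM the same role would be played by `AtomicReference.compareAndSet`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Atom:
    """Hypothetical helper: one mutable reference to an immutable value."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value  # readers never wait for writers

    def compare_and_set(self, expected, new):
        # Emulated CAS: succeed only if nobody replaced the value meanwhile.
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

    def swap(self, fn, *args):
        while True:
            old = self.get()
            new = fn(old, *args)  # pure transformation, safe to redo
            if self.compare_and_set(old, new):
                return new
            # Lost the race: loop and recompute from the current value.

def add_visit(visits, user):
    return visits + (user,)  # builds a new tuple, the old one is untouched

log = Atom(())
with ThreadPoolExecutor(max_workers=8) as pool:
    for user_id in range(1000):
        pool.submit(log.swap, add_visit, user_id)

print(len(log.get()))
```

No update is lost even though the order of entries depends on which thread won each race.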

Of course, this can be optimized, e.g. by partitioning independent parts of the immutable data structure across several references that are updated independently, which further reduces potential collisions.
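
A rough sketch of that partitioning idea, again in Python with a lock-emulated CAS. All names (`Atom`, `SHARDS`, `shard_for`, `bump`, `record_hit`) are invented for the example, and four shards is an arbitrary choice. Updates to keys in different shards never collide with each other:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType

class Atom:
    """Hypothetical helper: one mutable reference to an immutable value."""

    def __init__(self, value):
        self._value, self._lock = value, threading.Lock()

    def get(self):
        return self._value

    def swap(self, fn, *args):
        while True:
            old = self._value
            new = fn(old, *args)
            with self._lock:  # emulated CAS, held only for the check
                if self._value is old:
                    self._value = new
                    return new

# Several independent references instead of one global one.
SHARDS = tuple(Atom(MappingProxyType({})) for _ in range(4))

def shard_for(key):
    return SHARDS[hash(key) % len(SHARDS)]

def bump(counts, key):
    return MappingProxyType({**counts, key: counts.get(key, 0) + 1})

def record_hit(page):
    shard_for(page).swap(bump, page)

pages = [f"/page/{i % 10}" for i in range(1000)]
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(record_hit, pages))

totals = {}
for shard in SHARDS:
    totals.update(shard.get())
print(sum(totals.values()))
```

The price is that a read across all shards is no longer one atomic snapshot, so this only fits parts of the state that really are independent.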

Read access to the data structure is lock-free, very fast, and always returns a consistent snapshot. Edge cases, such as another client receiving older data after you have sent an update, are to be expected in any system, because network requests can arrive out of order too.

STM (software transactional memory) is useful quite rarely; usually you are better off atomically swapping one data structure that contains all the values from the references you would otherwise use in the STM transaction.
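
As an illustration of that last point, here is a hedged Python sketch of a money transfer that would classically need a transaction over two references. Instead, both balances live in one immutable mapping behind one reference. `Atom`, `transfer` and the account names are hypothetical, and the CAS is emulated with a small lock as Python has no built-in one:

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType

class Atom:
    """Hypothetical helper: one mutable reference to an immutable value."""

    def __init__(self, value):
        self._value, self._lock = value, threading.Lock()

    def get(self):
        return self._value

    def swap(self, fn, *args):
        while True:
            old = self._value
            new = fn(old, *args)
            with self._lock:  # emulated CAS, held only for the check
                if self._value is old:
                    self._value = new
                    return new

def transfer(state, src, dst, amount):
    """Pure function: returns a new mapping with both balances changed."""
    if state[src] < amount:
        return state  # insufficient funds, keep the state as it is
    return MappingProxyType(
        {**state, src: state[src] - amount, dst: state[dst] + amount}
    )

accounts = Atom(MappingProxyType({"alice": 100, "bob": 50}))
with ThreadPoolExecutor(max_workers=8) as pool:
    for _ in range(200):
        pool.submit(accounts.swap, transfer, "alice", "bob", 1)
        pool.submit(accounts.swap, transfer, "bob", "alice", 1)

snapshot = accounts.get()
print(sum(snapshot.values()))
```

Because debit and credit become visible in the same swap, no reader can ever observe money that has left one account but not yet arrived in the other.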