Haskell: converting a list of (a, b) key-value pairs (with possibly repeated keys) to a list of (a, [b]) grouped by key


遥遥无期 2021-01-01 17:13

I'm a Haskell beginner. Let's suppose I want to write a function convertKVList that takes a flat list of key-value pairs, where some of the keys might be rep

6 answers

长情又很酷 (original poster)

2021-01-01 18:01

I suspect that without dipping into mutation and the ST monad, you are unlikely to improve on the Map.fromListWith solution (or substantially equivalent alternatives like using HashMap.fromListWith). I'd just go with that.
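For reference, here is a minimal sketch of the Map.fromListWith approach. The name convertKVList comes from the question; the type signature, the Ord constraint, and the decision to keep each key's values in input order are my assumptions. Note that fromListWith passes the newer value as the first argument of the combining function, so the input is reversed first to keep the values in their original order while every (++) only prepends a one-element list.

```haskell
import qualified Data.Map.Strict as Map

-- Group values by key in O(n * log n).
-- Keys come out in ascending order, because Map.toList is ordered by key.
convertKVList :: Ord a => [(a, b)] -> [(a, [b])]
convertKVList kvs =
  Map.toList (Map.fromListWith (++) [(k, [v]) | (k, v) <- reverse kvs])
```

For example, convertKVList [("a", 1), ("b", 2), ("a", 3)] evaluates to [("a", [1, 3]), ("b", [2])].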

Basically, with mutation you can do this grouping in near-linear time by using a mutable hash table with a as the keys and mutable lists of b as the values. Without mutation, however, it's going to be worse, because each insert into a balanced search tree is O(log n): finding the right position takes O(log n) steps, and "inserting" means constructing a new copy of each tree node on the path that leads to the spot where your inserted element goes. And you need to do n inserts, which gives you exactly the O(n * log n) bound that the Map.fromListWith function has. Sorting the association list ahead of time doesn't fundamentally improve this, because sorting is also O(n * log n).
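To make the sorting alternative concrete, here is a rough sketch that uses only base. The name groupSorted is hypothetical, not something from the question. It relies on sortBy being a stable sort, so the values of each key stay in input order, and the cost is dominated by the O(n * log n) sort, i.e. no better than the Map version.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortBy)
import Data.Ord (comparing)

-- Sort by key (stable), then collapse each run of equal keys into one entry.
groupSorted :: Ord a => [(a, b)] -> [(a, [b])]
groupSorted kvs =
  [ (k, v : map snd rest)
  | (k, v) : rest <- groupBy ((==) `on` fst) (sortBy (comparing fst) kvs)
  ]
```

groupBy never yields an empty group, so the pattern (k, v) : rest matches every group.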

So to improve on O(n * log n), you need data structures with mutation. I just did a quick Google and the best bet would be to implement the standard imperative algorithm using something like the hashtables library (which I've never tried, so I can't vouch for it). To use this you're going to need to understand Control.Monad.ST and Data.STRef. The ST monad is a technique that GHC provides for using mutation "internally" in a pure function: it uses a type system extension (rank-2 polymorphism in the type of runST) to guarantee that the side effects cannot be observed outside of the functions in question. HaskellWiki has some examples, but it might take some studying and practice to feel comfortable with this one.
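To give a feel for the ST pattern, here is a small sketch that only uses Control.Monad.ST and Data.STRef. This is not the hash table algorithm: the mutable reference still holds an ordinary Data.Map, so the cost stays O(n * log n). It is only meant to show how mutation can live inside a function with a pure type. The name convertKVListST is hypothetical, and a hashtables-based version would presumably swap the STRef and Map for a mutable table, which I have not written or tested.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)
import qualified Data.Map.Strict as Map

-- Pure from the outside; the STRef is created, mutated and read
-- entirely inside runST, so no side effect can escape.
convertKVListST :: Ord a => [(a, b)] -> [(a, [b])]
convertKVListST kvs = runST $ do
  ref <- newSTRef Map.empty
  -- Walk the input back to front so that prepending keeps input order.
  mapM_ (\(k, v) -> modifySTRef' ref (Map.insertWith (++) k [v])) (reverse kvs)
  Map.toList <$> readSTRef ref
```

Callers use it like any other pure function, e.g. convertKVListST [("a", 1), ("b", 2), ("a", 3)] gives [("a", [1, 3]), ("b", [2])].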

The other thing I would recommend, if you feel like you want to understand Data.Map or similar libraries better, is to look at Chris Okasaki's Purely Functional Data Structures book (or his dissertation (PDF) that the book is based on). It's based on Standard ML instead of Haskell, the data structures are not the same, and it can be a bit of a difficult read, but it's a foundational book.
