Recursive set union: how does it work really?

梦谈多话 2020-12-13 14:29

I am currently taking the Scala course on Coursera in my free time after work, in an attempt to finally give functional programming a try. I am currently working on an as

6 answers
  •  慢半拍i (original poster)
    2020-12-13 15:22

    I gather that incl inserts an element into an existing set? If so, that's where all the real work is happening.

    The union of two sets is, by definition, the set containing everything in either input. Given two sets stored as binary trees, if you take the union of the first set with both branches of the second, the only element of either input that could still be missing from the result is the one at the root node of the second tree. Insert that element and you have the union of both input sets.

    It's just a very inefficient way of inserting each element from both sets into a new set which starts out empty. Presumably duplicates are discarded by incl, so the result is the union of the two inputs.
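
    Since the question's code is cut off above, here is a minimal sketch of what such a set might look like, assuming a binary search tree of Int values. The names IntSet, Empty, NonEmpty, contains and toList are assumptions made for illustration; only incl and union are mentioned in this thread. In this sketch the receiver is the tree being decomposed: its two branches are unioned with the other set, and its root element is inserted last.

```scala
// Hypothetical reconstruction: all names except incl/union are assumptions.
sealed trait IntSet {
  def contains(x: Int): Boolean
  def incl(x: Int): IntSet          // returns a new set that also holds x
  def union(other: IntSet): IntSet
  def toList: List[Int]             // in-order traversal, sorted ascending
}

case object Empty extends IntSet {
  def contains(x: Int): Boolean = false
  def incl(x: Int): IntSet = NonEmpty(x, Empty, Empty)
  // Base case: the union of the empty set with anything is the other set.
  def union(other: IntSet): IntSet = other
  def toList: List[Int] = Nil
}

final case class NonEmpty(elem: Int, left: IntSet, right: IntSet) extends IntSet {
  def contains(x: Int): Boolean =
    if (x < elem) left.contains(x)
    else if (x > elem) right.contains(x)
    else true

  // Duplicates are discarded here: inserting an existing element returns this.
  def incl(x: Int): IntSet =
    if (x < elem) NonEmpty(elem, left.incl(x), right)
    else if (x > elem) NonEmpty(elem, left, right.incl(x))
    else this

  // Union both branches with the other set, then insert the root, which is
  // the only element that could still be missing.
  def union(other: IntSet): IntSet =
    left.union(right).union(other).incl(elem)

  def toList: List[Int] = left.toList ::: (elem :: right.toList)
}
```

    The recursion terminates because every recursive call is made on a receiver with fewer elements than the current one: left is smaller than this, and left.union(right) holds every element of this except elem.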


    Maybe it would help to ignore the tree structure for the moment; it's not really important to the essential algorithm. Say we have abstract mathematical sets. Given an input set with unknown elements, we can do two things:

    • Add an element to it (which does nothing if the element was already present)
    • Check whether the set is non-empty and, if so, decompose it into a single element and two disjoint subsets.

    To take the union of two sets {1,2} and {2,3}, we start by decomposing the first set into the element 1 and subsets {} and {2}. We recursively take the union of {}, {2}, and {2,3} using the same process, then insert 1 into the result.
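
    The two operations and the recursive process above can be sketched on top of Scala's built-in immutable Set, with no tree in sight. This is only an illustration: the names AbstractUnion, add and decompose are hypothetical, and the way decompose picks its element and splits the remainder is an arbitrary choice.

```scala
object AbstractUnion {
  // Operation 1: add an element (does nothing if it is already present).
  def add(s: Set[Int], x: Int): Set[Int] = s + x

  // Operation 2: if the set is non-empty, split it into one element and
  // two disjoint subsets. Any split works; this one is an assumption.
  def decompose(s: Set[Int]): Option[(Int, Set[Int], Set[Int])] =
    if (s.isEmpty) None
    else {
      val x = s.head                          // some arbitrary element
      val (smaller, larger) = (s - x).partition(_ < x)
      Some((x, smaller, larger))
    }

  // Union expressed using only the two operations above.
  def union(s: Set[Int], t: Set[Int]): Set[Int] =
    decompose(s) match {
      case None            => t               // nothing left to contribute
      case Some((x, a, b)) => add(union(union(a, b), t), x)
    }
}
```

    Calling AbstractUnion.union(Set(1, 2), Set(2, 3)) follows the same steps as the worked example: the first set is taken apart, the pieces are unioned with the second set, and the extracted element is added back at the end.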

    At each step, the problem is reduced from one union operation to two union operations on smaller inputs, which is a standard divide-and-conquer algorithm. When the recursion reaches the union of a singleton set {x} and the empty set {}, the union is trivially {x}, which is then returned back up the chain.

    The tree structure serves two purposes: it allows the case analysis and decomposition into smaller sets, and it makes insertion more efficient. The same could be done with other data structures, such as lists that are split in half for decomposition, with insertion done by an exhaustive check for uniqueness. Taking the union efficiently requires a somewhat cleverer algorithm, one that takes advantage of the structure used to store the elements.
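
    As a rough sketch of the list-based variant described above, assuming both inputs are duplicate-free lists of Int: decomposition takes the head and splits the tail in half, and insertion scans the whole list to check for uniqueness. The names ListUnion and insert are hypothetical.

```scala
object ListUnion {
  // Insertion by exhaustive uniqueness check: scans the whole list.
  def insert(xs: List[Int], x: Int): List[Int] =
    if (xs.contains(x)) xs else x :: xs

  // Same recursive shape as the tree version; only the decomposition
  // and insertion steps differ. Assumes xs and ys contain no duplicates.
  def union(xs: List[Int], ys: List[Int]): List[Int] = xs match {
    case Nil => ys
    case x :: rest =>
      val (a, b) = rest.splitAt(rest.length / 2)  // two disjoint halves
      insert(union(union(a, b), ys), x)
  }
}
```

    The result order is unspecified here, so compare results as sets or sort them first. Each insert costs time linear in the size of the result, which is one reason the tree representation is the more practical of the two.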
