Computing unique intersections of subsets

被刻印的时光 ゝ 提交于 2019-12-23 18:20:16

问题


Given a set S = {si : {zj : z ∈ N} }, what is a time-efficient algorithm for computing the unique sets of intersections of the subsets of S?

For background, I am dealing with several versions of this problem, some larger than others. In the smallest one |S| ≈ 1,000, |si| ≈ 10,000 and the values are zip codes.

Tiny example for clarity:

Input: S = {{},{1},{2,3},{3,4,5,6,7,8,9,10}}
Output: {{},{1},{2,3},{3,4,5,6,7,8,9,10},{3}}

|S| = 4 and there are 24 = 16 subsets of S. However, there are only five unique sets of subset intersections. The first four are the members of S themselves. The fifth is {3}. The empty set is already a member of S. All other 10 subset intersections produce empty sets also.


回答1:


Here's a fast preprocessing step that might be worthwhile.

Call elements x and y equivalent if, for every set si, either both or neither of x and y belong to si. Simplify the problem by deleting all elements except one representative of each equivalence class. At the end, expand each representative to its class.

To simplify, scan sets one by one while maintaining a map from each element to a label for its equivalence class, where equivalence is determined with respect to the sets processed so far. Initially, all elements are in the same class, so this map sends each element to the same label. To process a set, initialize an empty map of new labels. For each element x in the set, let k be x's current label. If the key k exists in the new label map, then the value corresponding to k becomes x's new label. Otherwise, we allocate a new label and assign it to x and add a mapping from k to the new label.

Here's the execution on your input.

  1. Initially label = {1: a, 2: a, 3: a, 4: a, 5: a, 6: a, 7: a, 8: a, 9: a, 10: a}.
  2. Scan {}. Nothing happens.
  3. Scan {1}. Change label[1] to b.
  4. Scan {2, 3}. Change label[2] and label[3] to c.
  5. Scan {3, 4, 5, 6, 7, 8, 9, 10}. Change label[3] to d and label[4..10] to e.
  6. At the end, label = {1: b, 2: c, 3: d, 4: e, 5: e, 6: e, 7: e, 8: e, 9: e, 10: e}. Select 1 (b) and 2 (c) and 3 (d) and 4 (e) to represent their classes. All others are deleted.



回答2:


Asymptotic Time Complexity:

n: number of sets, which will change during execution

m: average set size

Time: T(Search-Matching-Sets) x T(Intersection) x SUM { k } for k: 1..n

Time: 2m x 2m x 1/2 n(n+1)

Time: O(4 m^2 x 1/2 n x (n+1)) ~ O(n^2 m^2)

Proposed Algorithm:

FOR i: 1..n
{

    FOR j: i..n
    {

        IF i = j THEN SKIP

        INTERSECTION := FIND-INTERSECTION ( SETS[i], SETS[j] )

        IF INTERSECTION ⊄ SETS[] THEN ADD INTERSECTION TO SETS[]

    }

}


来源:https://stackoverflow.com/questions/15803808/computing-unique-intersections-of-subsets

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!