Creating sets of similar elements in a 2D array

渐次进展 2020-12-28 23:51

I am trying to solve a problem that is based on a 2D array. This array contains different kinds of elements (from a total of 3 possible kinds). Let's assume the kinds are X, Y,

4 answers

不思量自难忘° (original poster)

2020-12-29 00:39
[EDIT 5/8/2013: Fixed time complexity. (O(a(n)) is essentially constant time!)]

In the following, by "connected component" I mean the set of all positions that are reachable from each other by a path that allows only horizontal, vertical or diagonal moves between neighbouring positions having the same kind of element. E.g. your example {(0,1), (1,1), (2,2), (2,3), (1,4)} is a connected component in your example input. Each position belongs to exactly one connected component.

We will build a union/find data structure that gives every position (x, y) a numeric "label", with the property that two positions (x, y) and (x', y') have the same label if and only if they belong to the same component. In particular, this data structure supports three operations:
- set(x, y, i) will set the label for position (x, y) to i.
- find(x, y) will return the label assigned to the position (x, y).
- union(Z), for some set of labels Z, will combine all labels in Z into a single label k, in the sense that future calls to find(x, y) on any position (x, y) that previously had a label in Z will now return k. (In general k will be one of the labels already in Z, though this is not actually important.) union(Z) also returns the new "master" label, k.

If there are n = width * height positions in total, the whole sequence of set(), find() and union() calls made below takes O(n*a(n)) time, where a() is the extremely slow-growing inverse Ackermann function. For all practical input sizes, this is effectively the same as O(n).
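
A minimal Python sketch of such a structure is shown here. The class name `PositionLabels` and its internals (`parent`, `size`, `label_at`) are illustrative assumptions, not something from the question. It uses union by size with path halving, which is one standard way to get the near-constant amortized cost mentioned above.

```python
class PositionLabels:
    """Union/find over integer labels, plus a position -> label map.

    Hypothetical helper: the names here are illustrative only.
    """

    def __init__(self):
        self.parent = []    # parent[i] is the parent of label i; a root is its own parent
        self.size = []      # size[i] = number of labels in the tree rooted at i (roots only)
        self.label_at = {}  # (x, y) -> label passed to set()

    def _ensure(self, i):
        # Create labels 0..i on demand, each as its own singleton tree.
        while len(self.parent) <= i:
            self.parent.append(len(self.parent))
            self.size.append(1)

    def _root(self, i):
        # Follow parent links to the root, halving the path on the way up.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def set(self, x, y, i):
        """Set the label for position (x, y) to i."""
        self._ensure(i)
        self.label_at[(x, y)] = i

    def find(self, x, y):
        """Return the current master label of position (x, y)."""
        return self._root(self.label_at[(x, y)])

    def union(self, labels):
        """Merge all given labels (assumed non-empty) and return the master label k."""
        roots = {self._root(i) for i in labels}
        k = max(roots, key=lambda r: self.size[r])  # keep the largest tree as the root
        for r in roots:
            if r != k:
                self.parent[r] = k
                self.size[k] += self.size[r]
        return k
```

Note that `union()` here always returns one of the labels that was already in use, matching the remark above that k is normally a member of Z.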

Notice that whenever two positions are adjacent to each other, exactly one of four cases applies:
1. One is above the other (connected by a vertical edge)
2. One is to the left of the other (connected by a horizontal edge)
3. One is above and to the left of the other (connected by a \ diagonal edge)
4. One is above and to the right of the other (connected by a / diagonal edge)
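
In every case one of the two positions comes earlier in row-major order, so each position only needs to look back at its W, NW, N and NE neighbours. We can use the following pass to determine labels for each position (x, y):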
- Set nextLabel to 0.
- For each row y in increasing order:
  - For each column x in increasing order:
    - Examine the W, NW, N and NE neighbours of (x, y). Let Z be the subset of these 4 neighbours that are of the same kind as (x, y).
    - If Z is the empty set, then we tentatively suppose that (x, y) starts a brand new component, so call set(x, y, nextLabel) and increment nextLabel.
    - Otherwise, call find(Z[i]) on each element of Z to find their labels, and call union() on this set of labels to combine them together. Assign the new label (the result of this union() call) to k, and then also call set(x, y, k) to add (x, y) to this component.
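
The pass above could be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function name `label_components` is made up, the grid is taken to be a list of equal-length rows indexed as `grid[y][x]`, and the union/find forest is inlined as a plain `parent` list with path halving but without union by size (for brevity, so it gives up the strict inverse-Ackermann bound).

```python
def label_components(grid):
    """Return labels[y][x]; two cells get the same label iff they are linked by
    horizontal, vertical or diagonal steps through cells of the same kind."""
    height = len(grid)
    width = len(grid[0]) if height else 0
    parent = []  # union/find forest over labels; a root is its own parent
    labels = [[-1] * width for _ in range(height)]

    def root(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # (dx, dy) offsets of the W, NW, N and NE neighbours, all visited earlier.
    earlier = [(-1, 0), (-1, -1), (0, -1), (1, -1)]

    for y in range(height):
        for x in range(width):
            # Z: master labels of the earlier neighbours of the same kind.
            z = set()
            for dx, dy in earlier:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height and grid[ny][nx] == grid[y][x]:
                    z.add(root(labels[ny][nx]))
            if not z:
                # Tentatively start a brand new component.
                parent.append(len(parent))
                labels[y][x] = len(parent) - 1
            else:
                # Merge every label in Z into one master label k.
                k = min(z)
                for r in z:
                    parent[r] = k
                labels[y][x] = k

    # Resolve every cell to its final master label.
    return [[root(labels[y][x]) for x in range(width)] for y in range(height)]


grid = ["XYX",
        "YXY",
        "ZZY"]
labels = label_components(grid)
```

In this small grid the two X cells in the top row only become one component when the X at (1, 1) is reached, which is exactly the situation the union() step is there to handle.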

After this, calling find(x, y) on any position (x, y) effectively tells you which component it belongs to.

If you want to be able to quickly answer queries of the form "Which positions belong to the connected component containing position (x, y)?" then create a hashtable of lists posInComp and make a second pass over the input array, appending each (x, y) to the list posInComp[find(x, y)]. This can all be done in linear time and space. Now to answer a query for some given position (x, y), simply call lab = find(x, y) to find that position's label, and then list the positions in posInComp[lab].
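
As a rough sketch of that second pass, assume the result of find(x, y) has already been stored for every position in a 2D list `labels[y][x]` (a hypothetical layout chosen for this example). The function names below are likewise made up for illustration.

```python
from collections import defaultdict


def positions_by_component(labels):
    """Build posInComp: master label -> list of (x, y) positions carrying it."""
    pos_in_comp = defaultdict(list)
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            pos_in_comp[lab].append((x, y))
    return pos_in_comp


def component_of(labels, pos_in_comp, x, y):
    """Answer: which positions share a component with (x, y)?"""
    lab = labels[y][x]  # plays the role of find(x, y)
    return pos_in_comp[lab]


# Example label grid (assumed to be the output of a labelling pass).
labels = [[0, 1, 0],
          [1, 0, 1],
          [3, 3, 1]]
pos_in_comp = positions_by_component(labels)
```

Building the table touches each position once, and each later query is a single dictionary lookup.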

To deal with "too-small" components, just look at the size of posInComp[lab]. If it's 1 or 2, then (x, y) does not belong to any "large-enough" component.
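
That size check might look like the following in Python. The threshold of 3 is an assumption inferred from the "1 or 2" wording above, and `is_large_enough` is a hypothetical helper name; `pos_in_comp` is assumed to map each label to its list of positions.

```python
def is_large_enough(pos_in_comp, lab, min_size=3):
    """True if the component with master label `lab` has at least min_size positions.

    min_size=3 is an assumed threshold, based on sizes 1 and 2 being "too small".
    """
    return len(pos_in_comp[lab]) >= min_size


# Hypothetical table: label -> positions in that component.
pos_in_comp = {
    0: [(0, 0), (2, 0), (1, 1)],
    3: [(0, 2), (1, 2)],
}
large_labels = [lab for lab in pos_in_comp if is_large_enough(pos_in_comp, lab)]
```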

Finally, all this work effectively takes linear time, so it will be lightning fast unless your input array is huge. So it's perfectly reasonable to recompute it from scratch after modifying the input array.