Detecting rectangles (sub-arrays of same element value) in 2-d list

Question

A rectangle is defined as any rectangular-shaped section of zeros within a 2-d array of 1s and 0s. Typical example:

[
  [1, 1, 1, 1, 1, 1, 1, 1, 1],
  [1, 1, 1, 1, 1, 1, 1, 1, 0],
  [1, 1, 1, 0, 0, 0, 1, 0, 0],
  [1, 0, 1, 0, 0, 0, 1, 0, 0],
  [1, 0, 1, 1, 1, 1, 1, 1, 1],
  [1, 0, 1, 0, 0, 1, 1, 1, 1],
  [1, 1, 1, 0, 0, 1, 1, 1, 1],
  [1, 1, 1, 1, 1, 1, 1, 1, 1],
]

In this example, there are three such arrays.

My goal is to determine the coordinates (outer 3 extremities) of each array.

I start by converting the 2-d list into a numpy array:

image_as_np_array = np.array(two_d_list)

I can then get the coordinates of all the zeros thus:

np.argwhere(image_as_np_array == 0)

But this merely provides a shortcut to getting the indices by iterating over each row and calling .index(), then combining with the index of that row within the 2-d list.
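As a small illustration of that equivalence, the sketch below compares np.argwhere() with a plain nested iteration. The tiny grid and the names coords_numpy and coords_plain are made up for this example and are not part of the original data.

```python
import numpy as np

# A tiny made-up grid (an assumption for illustration only)
two_d_list = [
    [1, 1, 0],
    [1, 0, 0],
]
image_as_np_array = np.array(two_d_list)

# Coordinates of every 0 as [row, col] pairs, in row-major order
coords_numpy = np.argwhere(image_as_np_array == 0).tolist()

# The same coordinates gathered by iterating over the nested list
coords_plain = [
    [r, c]
    for r, row in enumerate(two_d_list)
    for c, value in enumerate(row)
    if value == 0
]

print(coords_numpy)
print(coords_plain)
```

Both lists hold the same coordinates, so np.argwhere() alone does not say which zeros belong to which rectangle.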

I envisage now doing something like removing any element of np.argwhere() (or np.where()) where there is only a single occurrence of a 0 (effectively disregarding any row that cannot form part of a rectangle), and then trying to align contiguous coordinates. However, I'm stuck on how to handle cases where a row contains parts of more than one rectangle (as is the case in the 3rd and 4th rows above). Is there a numpy function or functions I can leverage?

Answer 1:

I don't know numpy, so here's a plain Python solution:

from collections import namedtuple

Rectangle = namedtuple("Rectangle", "top bottom left right")

def find_rectangles(arr):

    # Deeply copy the array so that it can be modified safely
    arr = [row[:] for row in arr]

    rectangles = []

    for top, row in enumerate(arr):
        start = 0

        # Look for rectangles whose top row is here
        while True:
            try:
                left = row.index(0, start)
            except ValueError:
                break

            # Set start to one past the last 0 in the contiguous line of 0s
            try:
                start = row.index(1, left)
            except ValueError:
                start = len(row)

            right = start - 1

            if (  # Width == 1
                  left == right or
                  # There are 0s above
                  top > 0 and not all(arr[top-1][left:right + 1])):
                continue

            bottom = top + 1
            while (bottom < len(arr) and
                   # No extra zeroes on the sides
                   (left == 0 or arr[bottom][left-1]) and
                   (right == len(row) - 1 or arr[bottom][right + 1]) and
                   # All zeroes in the row
                   not any(arr[bottom][left:right + 1])):
                bottom += 1

            # The loop ends when bottom has gone too far, so backtrack
            bottom -= 1

            if (  # Height == 1
                  bottom == top or
                  # There are 0s beneath
                  (bottom < len(arr) - 1 and
                   not all(arr[bottom + 1][left:right+1]))):
                continue

            rectangles.append(Rectangle(top, bottom, left, right))

            # Remove the rectangle so that it doesn't affect future searches
            for i in range(top, bottom+1):
                arr[i][left:right+1] = [1] * (right + 1 - left)

    return rectangles

For the given input, the output is:

[Rectangle(top=2, bottom=3, left=3, right=5),
 Rectangle(top=5, bottom=6, left=3, right=4)]

This is correct because the comments indicate that the 'rectangle' on the right is not to be counted since there is an extra 0 sticking out. I suggest you add more test cases though.

I expect it to be reasonably fast since much of the low-level iteration is done with calls to index and any, so there's decent usage of C code even without the help of numpy.
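To act on the suggestion to add more test cases, one option is an independent checker that validates a single candidate against the same rules the code above applies: at least two rows and two columns, all zeros inside, and no zero touching any edge from outside. This is only a minimal sketch. The function name is_isolated_rectangle and the variable grid are introduced here for illustration and are not part of the answer's code.

```python
grid = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]

def is_isolated_rectangle(grid, top, bottom, left, right):
    """Hypothetical helper: check one candidate (inclusive bounds)."""
    # Rule 1: at least 2 rows and 2 columns
    if bottom <= top or right <= left:
        return False
    # Rule 2: every cell inside the candidate is 0
    for r in range(top, bottom + 1):
        if any(grid[r][left:right + 1]):
            return False
    n_rows, n_cols = len(grid), len(grid[0])
    # Rule 3a: the rows directly above and below contain no 0 along the edge
    for r in (top - 1, bottom + 1):
        if 0 <= r < n_rows and not all(grid[r][left:right + 1]):
            return False
    # Rule 3b: the columns directly left and right contain no 0 along the edge
    for c in (left - 1, right + 1):
        if 0 <= c < n_cols and not all(grid[r][c] for r in range(top, bottom + 1)):
            return False
    return True

print(is_isolated_rectangle(grid, 2, 3, 3, 5))  # True
print(is_isolated_rectangle(grid, 5, 6, 3, 4))  # True
print(is_isolated_rectangle(grid, 2, 3, 7, 8))  # False: a 0 sticks out above
```

A checker like this can be run over the output of find_rectangles on additional grids to confirm that every reported rectangle satisfies the rules.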

Answer 2:

I have written a simple algorithm using the sweep line method. The idea is that you go through the array column by column and treat each run of zeros as a potential new rectangle. In each column you have to check whether the rectangles detected earlier have ended and, if so, add them to the results.

import numpy as np
from collections import namedtuple

example = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
])

Rectangle = namedtuple("Rectangle", "left top bottom right")

def sweep(A):
    height = A.shape[0]
    length = A.shape[1]
    rectangles = dict()  # detected rectangles {(rowstart, rowend): col}
    result = []

    # sweep the matrix column by column
    for i in range(length):
        column = A[:, i]

        # for currently detected rectangles check if we should extend them or end
        # iterate over a copy of the keys, since entries are deleted inside the loop
        for r in list(rectangles.keys()):
            # detect non-rectangular shapes, as requested in the question edit, and delete those rectangles
            if all([x == 0 for x in column[r[0]:r[1]+1]]) and ((r[0]-1>=0 and column[r[0]-1]==0) or (r[1]+1<height and column[r[1]+1]==0)):
                del rectangles[r]
            elif any([x == 0 for x in column[r[0]:r[1]+1]]) and not all([x == 0 for x in column[r[0]:r[1]+1]]):
                del rectangles[r]
            # special case in the last column - add detected rectangles
            elif i == length - 1 and all([x == 0 for x in column[r[0]:r[1]+1]]):
                result.append(Rectangle(rectangles[r], r[0], r[1], i))
            # if detected rectangle is not extended - add to result and del from list
            elif all([x == 1 for x in column[r[0]:r[1]+1]]):
                result.append(Rectangle(rectangles[r], r[0], r[1], i-1))
                del rectangles[r]

        newRectangle = False
        start = 0
        # go through the column and check if any new rectangles appear
        for j in range(height):
            # new rectangle in column detected
            if column[j] == 0 and not newRectangle and j+1 < height and column[j+1] == 0:
                start = j
                newRectangle = True
            # new rectangle in column ends
            elif column[j] == 1 and newRectangle:
                # check if new detected rectangle is already on the list
                if not (start, j-1) in rectangles:
                    rectangles[(start, j-1)] = i
                newRectangle = False

    # delete single column rectangles
    resultWithout1ColumnRectangles = []
    for r in result:
        if r[0] != r[3]:
            resultWithout1ColumnRectangles.append(r)
    return resultWithout1ColumnRectangles

print(example)
print(sweep(example))

returns:

[[1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 0]
 [1 1 1 0 0 0 1 0 0]
 [1 0 1 0 0 0 1 0 0]
 [1 0 1 1 1 1 1 1 1]
 [1 0 1 0 0 1 1 1 1]
 [1 1 1 0 0 1 1 1 1]
 [1 1 1 1 1 1 1 1 1]]
[Rectangle(left=3, top=5, bottom=6, right=4), 
 Rectangle(left=3, top=2, bottom=3, right=5)]
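
Both answers depend on the same basic step: splitting a single row or column into its contiguous runs of zeros, which is also how a line holding parts of more than one rectangle gets separated. That step can be sketched in isolation with itertools.groupby. The helper name zero_runs is made up for this illustration and does not appear in either answer.

```python
from itertools import groupby

def zero_runs(line):
    """Return inclusive (start, end) index pairs for each run of 0s in line."""
    runs = []
    pos = 0
    for value, group in groupby(line):
        length = len(list(group))
        if value == 0:
            runs.append((pos, pos + length - 1))
        pos += length
    return runs

# The 3rd row of the example contains parts of two separate shapes
row = [1, 1, 1, 0, 0, 0, 1, 0, 0]
print(zero_runs(row))  # [(3, 5), (7, 8)]
```

Each run can then be tracked from one line to the next, which is roughly what the sweep above does with its (rowstart, rowend) keys.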

Source: https://stackoverflow.com/questions/36668750/detecting-rectangles-sub-arrays-of-same-element-value-in-2-d-list

Tags

python

list

numpy

multidimensional-array

scipy