Question
A rectangle is defined as any rectangular-shaped section of zeros within a 2-d array of 1s and 0s. Typical example:
[
[1, 1, 1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 0, 0, 0, 1, 0, 0],
[1, 0, 1, 1, 1, 1, 1, 1, 1],
[1, 0, 1, 0, 0, 1, 1, 1, 1],
[1, 1, 1, 0, 0, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1, 1, 1],
]
In this example, there are three such arrays.
My goal is to determine the coordinates (outer 3 extremities) of each array.
I start by converting the 2-d list into a numpy array:
image_as_np_array = np.array(two_d_list)
I can then get the coordinates of all the zeros thus:
np.argwhere(image_as_np_array == 0)
But this merely provides a shortcut to getting the indices by iterating over each row and calling .index(), then combining with the index of that row within the 2-d list.
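To make that equivalence concrete, here is a minimal sketch on a small made-up grid (the 3x4 `two_d_list` below is an assumption for illustration, not the array from the question). It compares `np.argwhere` with a plain nested loop over rows and columns:

```python
import numpy as np

# Small illustrative grid (hypothetical, not the question's data)
two_d_list = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]

image_as_np_array = np.array(two_d_list)

# One (row, column) pair per zero, in row-major order
zeros_np = np.argwhere(image_as_np_array == 0)

# The same coordinates collected by iterating over the nested list
zeros_py = [
    [r, c]
    for r, row in enumerate(two_d_list)
    for c, value in enumerate(row)
    if value == 0
]

print(zeros_np.tolist())
print(zeros_py)
```

Both prints show the same list of coordinates, so `np.argwhere` on its own does not yet group the zeros into rectangles.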
I envisage now doing something like removing any element of np.argwhere() (or np.where()) where there is only a single occurrence of a 0 (effectively disregarding any row that cannot form part of a rectangle), and then trying to align contiguous coordinates, but I'm stuck with figuring out how to handle cases where any row may contain part of more than just one single rectangle (as is the case in the 3rd and 4th rows above). Is there a numpy function or functions I can leverage?
Answer 1:
I don't know numpy, so here's a plain Python solution:
from collections import namedtuple

Rectangle = namedtuple("Rectangle", "top bottom left right")

def find_rectangles(arr):
    # Deeply copy the array so that it can be modified safely
    arr = [row[:] for row in arr]

    rectangles = []

    for top, row in enumerate(arr):
        start = 0

        # Look for rectangles whose top row is here
        while True:
            try:
                left = row.index(0, start)
            except ValueError:
                break

            # Set start to one past the last 0 in the contiguous line of 0s
            try:
                start = row.index(1, left)
            except ValueError:
                start = len(row)

            right = start - 1

            if (  # Width == 1
                  left == right or
                  # There are 0s above
                  top > 0 and not all(arr[top-1][left:right + 1])):
                continue

            bottom = top + 1
            while (bottom < len(arr) and
                   # No extra zeroes on the sides
                   (left == 0 or arr[bottom][left-1]) and
                   (right == len(row) - 1 or arr[bottom][right + 1]) and
                   # All zeroes in the row
                   not any(arr[bottom][left:right + 1])):
                bottom += 1

            # The loop ends when bottom has gone too far, so backtrack
            bottom -= 1

            if (  # Height == 1
                  bottom == top or
                  # There are 0s beneath
                  (bottom < len(arr) - 1 and
                   not all(arr[bottom + 1][left:right+1]))):
                continue

            rectangles.append(Rectangle(top, bottom, left, right))

            # Remove the rectangle so that it doesn't affect future searches
            for i in range(top, bottom+1):
                arr[i][left:right+1] = [1] * (right + 1 - left)

    return rectangles
For the given input, the output is:
[Rectangle(top=2, bottom=3, left=3, right=5),
Rectangle(top=5, bottom=6, left=3, right=4)]
This is correct because the comments indicate that the 'rectangle' on the right is not to be counted since there is an extra 0 sticking out. I suggest you add more test cases though.
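One way to build such test cases is to check each reported rectangle independently of the search code. The sketch below uses a hypothetical helper, `is_isolated_rectangle`, which is not part of this answer's code. It assumes the same inclusive top/bottom/left/right convention, and it assumes that only the four sides (not the diagonal corners) must be free of zeros:

```python
def is_isolated_rectangle(grid, top, bottom, left, right):
    """Return True if the inclusive block is all 0s with no 0 touching its sides."""
    height, width = len(grid), len(grid[0])

    # Every cell inside the block must be 0
    for r in range(top, bottom + 1):
        if any(grid[r][left:right + 1]):
            return False

    # The row segments directly above and below must be all 1s (if they exist)
    for r in (top - 1, bottom + 1):
        if 0 <= r < height and not all(grid[r][left:right + 1]):
            return False

    # The column segments directly left and right must be all 1s (if they exist)
    for c in (left - 1, right + 1):
        if 0 <= c < width and not all(grid[r][c] for r in range(top, bottom + 1)):
            return False

    return True


grid = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]

print(is_isolated_rectangle(grid, 2, 3, 3, 5))  # block reported in the output above
print(is_isolated_rectangle(grid, 2, 3, 7, 8))  # block on the right with the extra 0 above it
```

A check like this can be run over the result of any rectangle finder on hand-made grids, for example ones with rectangles touching the edges of the array.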
I expect it to be reasonably fast since much of the low-level iteration is done with calls to index and any, so there's decent usage of C code even without the help of numpy.
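The `index`-based scanning can be seen in isolation in the sketch below. The helper name `zero_runs` is hypothetical. It reuses the same two `index` calls as the function above to split one row into contiguous runs of 0s, which is also how a row containing parts of several rectangles gets handled:

```python
def zero_runs(row):
    """Return inclusive (left, right) index pairs, one per contiguous run of 0s."""
    runs = []
    start = 0
    while True:
        # Find the next 0 at or after start; stop when there is none
        try:
            left = row.index(0, start)
        except ValueError:
            break
        # Find the first 1 after that 0; the run ends just before it
        try:
            start = row.index(1, left)
        except ValueError:
            start = len(row)
        runs.append((left, start - 1))
    return runs


# Third and fourth rows of the example array
print(zero_runs([1, 1, 1, 0, 0, 0, 1, 0, 0]))
print(zero_runs([1, 0, 1, 0, 0, 0, 1, 0, 0]))
```

Each row is scanned left to right, so every run is reported once, regardless of how many runs the row contains.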
Answer 2:
I have written a simple algorithm using the sweep line method. The idea is to go through the array column by column and treat each series of zeros as a potential new rectangle. At each column you check whether the rectangles detected earlier have ended and, if so, add them to the results.
import numpy as np
from collections import namedtuple

example = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
])

Rectangle = namedtuple("Rectangle", "left top bottom right")

def sweep(A):
    height = A.shape[0]
    length = A.shape[1]
    rectangles = dict()  # detected rectangles {(rowstart, rowend): col}
    result = []
    # sweep the matrix column by column
    for i in range(length):
        column = A[:, i]
        # for currently detected rectangles check if we should extend them or end
        # (iterate over a copy of the keys, since entries are deleted in the loop)
        for r in list(rectangles.keys()):
            # detect non-rectangular shapes, as requested in the question edit, and delete those rectangles
            if (all([x == 0 for x in column[r[0]:r[1]+1]]) and
                    ((r[0]-1 >= 0 and column[r[0]-1] == 0) or
                     (r[1]+1 < height and column[r[1]+1] == 0))):
                del rectangles[r]
            elif any([x == 0 for x in column[r[0]:r[1]+1]]) and not all([x == 0 for x in column[r[0]:r[1]+1]]):
                del rectangles[r]
            # special case in the last column - add detected rectangles
            elif i == length - 1 and all([x == 0 for x in column[r[0]:r[1]+1]]):
                result.append(Rectangle(rectangles[r], r[0], r[1], i))
            # if detected rectangle is not extended - add to result and del from list
            elif all([x == 1 for x in column[r[0]:r[1]+1]]):
                result.append(Rectangle(rectangles[r], r[0], r[1], i-1))
                del rectangles[r]
        newRectangle = False
        start = 0
        # go through the column and check if any new rectangles appear
        for j in range(height):
            # new rectangle in column detected
            if column[j] == 0 and not newRectangle and j+1 < height and column[j+1] == 0:
                start = j
                newRectangle = True
            # new rectangle in column ends
            elif column[j] == 1 and newRectangle:
                # check if new detected rectangle is already on the list
                if (start, j-1) not in rectangles:
                    rectangles[(start, j-1)] = i
                newRectangle = False
    # delete single column rectangles
    resultWithout1ColumnRectangles = []
    for r in result:
        if r[0] != r[3]:
            resultWithout1ColumnRectangles.append(r)
    return resultWithout1ColumnRectangles

print(example)
print(sweep(example))
returns:
[[1 1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1 0]
[1 1 1 0 0 0 1 0 0]
[1 0 1 0 0 0 1 0 0]
[1 0 1 1 1 1 1 1 1]
[1 0 1 0 0 1 1 1 1]
[1 1 1 0 0 1 1 1 1]
[1 1 1 1 1 1 1 1 1]]
[Rectangle(left=3, top=5, bottom=6, right=4),
Rectangle(left=3, top=2, bottom=3, right=5)]
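The per-column detection step of the sweep can be sketched on its own with plain lists. The helper name `vertical_zero_runs` and its `min_height` parameter are hypothetical. Unlike the loop in `sweep`, this sketch also closes a run that reaches the bottom edge of the grid, which is an assumption about the desired behaviour rather than something the code above does:

```python
def vertical_zero_runs(grid, col, min_height=2):
    """Return inclusive (start_row, end_row) pairs of consecutive 0s in one column."""
    runs = []
    start = None
    for row_index, row in enumerate(grid):
        if row[col] == 0:
            # Open a new run if none is in progress
            if start is None:
                start = row_index
        elif start is not None:
            # A 1 closes the current run; keep it only if it is tall enough
            if row_index - start >= min_height:
                runs.append((start, row_index - 1))
            start = None
    # A run may extend to the last row of the grid
    if start is not None and len(grid) - start >= min_height:
        runs.append((start, len(grid) - 1))
    return runs


grid = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]

for col in (1, 3, 8):
    print(col, vertical_zero_runs(grid, col))
```

Each pair corresponds to a `(rowstart, rowend)` key that the sweep would track from that column onward until the run stops being extended.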
Source: https://stackoverflow.com/questions/36668750/detecting-rectangles-sub-arrays-of-same-element-value-in-2-d-list