Let's say I have 2 images, A and B, as below. Notice that the bottom of A overlaps with the top of B, and I want to find n, the number of rows by which they overlap.
If I understood correctly, you probably want to look at normalized cross correlation of greyscale versions of the two images. Where you have large images, or large overlapping regions, this is done most efficiently in the frequency domain using the FFTs of the images (or overlap areas) and is called phase correlation.
The basic steps I would take in your situation are as follows:

1. Crop the bottom region of image A and a same-sized region from the top of image B (the candidate overlap areas).
2. Convert both regions to greyscale.
3. Compute the 2D FFT of each region.
4. Multiply the FFT of one region by the complex conjugate of the FFT of the other, and normalise the product (this gives the cross-power spectrum).
5. Take the inverse FFT of the cross-power spectrum.
6. Find the location of the peak in the result - its coordinates give the relative offset between the two regions.
Having found the relative offset between the top and bottom image patches, you can easily calculate n as you required.
If you want to experiment without having to code the above from scratch, OpenCV has a number of functions for template matching which you can easily try - see the OpenCV documentation on template matching for details.
If part of either image has been changed - e.g. by a banner ad - the above procedure still gives the best match, and the magnitude of the peak you find in step 6 gives an indication of the match "confidence" - so you can get a rough idea of how similar the two areas are.
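To see roughly what this looks like in code, OpenCV's C++ API also exposes phase correlation directly as cv::phaseCorrelate. Here is a minimal sketch along those lines; the file names and the 256-row probe height are assumptions for illustration, and the sign convention of the returned shift should be checked against your own data:

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Assumed file names, as in the ImageMagick answer below.
    cv::Mat a = cv::imread("a.png", cv::IMREAD_GRAYSCALE);
    cv::Mat b = cv::imread("b.png", cv::IMREAD_GRAYSCALE);

    // Compare the bottom strip of A with the top strip of B.
    int probe = 256; // assumed probe height
    cv::Mat bottomOfA = a.rowRange(a.rows - probe, a.rows);
    cv::Mat topOfB    = b.rowRange(0, probe);

    // phaseCorrelate requires floating-point single-channel input.
    cv::Mat fa, fb;
    bottomOfA.convertTo(fa, CV_32F);
    topOfB.convertTo(fb, CV_32F);

    // response is the peak magnitude - the match "confidence" mentioned above.
    double response = 0.0;
    cv::Point2d shift = cv::phaseCorrelate(fa, fb, cv::noArray(), &response);

    // If the strips align at vertical shift dy, the overlap n follows from
    // n = probe - dy (check the sign against your images).
    std::cout << "shift: " << shift << "  response: " << response << std::endl;
    return 0;
}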
I had a little play at doing this with ImageMagick. Here is the animation of what I did, and the explanation and code follow.
First I grabbed a couple of StackOverflow pages using webkit2png, calling them a.png and b.png.
Then I cropped a rectangle out of the top-left of b.png, and a column the same width, but the full height, out of a.png, giving me the two crops to compare.
I now overlay the smaller rectangle from the second page onto the bottom of the strip from the first page and slide it upwards, calculating the difference between the two images at each position by subtracting one from the other. When the difference is zero, the pictures must be the same and the output image will be black, so that is the point at which they overlap.
Here is the code:
#!/bin/bash
# Grab page 2 as "A" and page 3 as "B"
# webkit2png -F -o A "http://stackoverflow.com/questions?page=2&sort=newest"
# webkit2png -F -o B "http://stackoverflow.com/questions?page=3&sort=newest"

BLOBH=256   # blob height
BLOBW=256   # blob width

# Crop a column 256 pixels wide, full height, out of a.png that doesn't
# contain adverts or junk, into x.png
convert a.png -crop ${BLOBW}x+0+0 x.png

# Crop a rectangle 256x256 pixels out of the top-left corner of b.png, into y.png
convert b.png -crop ${BLOBW}x${BLOBH}+0+0 y.png

# Get height of x.png - this must happen after x.png has been created
XHEIGHT=$(identify -format "%h" x.png)

# Now slide y.png up across x.png, starting at the bottom of x.png,
# differencing the two images as we go, and stop when the difference is
# nothing, i.e. they are the same and the difference image is black
lines=0
while :; do
   OFFSET=$((XHEIGHT-BLOBH-lines))   # start flush with the bottom of x.png
   if [ $OFFSET -lt 0 ]; then
      echo "No overlap found" >&2
      exit 1
   fi
   FN=$(printf "out-%04d.png" $lines)
   diff=$(convert x.png -crop ${BLOBW}x${BLOBH}+0+${OFFSET} +repage \
       y.png \
       -fuzz 5% -compose difference -composite +write $FN \
       \( +clone -evaluate set 0 \) -metric AE -compare -format "%[distortion]" info:)
   echo $diff:$lines
   if [ "$diff" = "0" ]; then break; fi   # black difference image => match found
   ((lines++))
done
n=$((BLOBH+lines))
echo "Overlap is $n rows"
If rows match exactly, then sort the rows in both images and merge; your duplicates are right there. Then go back to the original images and find the longest contiguous streak of duplicates in A such that the corresponding rows in B are also contiguous. Or just look near the bottom of A and the top of B, where the overlap must be.
If there are banner ads, the first thing that comes to mind is breaking the images into several vertical strips and doing that with each pair of strips separately.
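As a concrete sketch of the look-near-the-ends variant, here is a minimal C++ version under the assumption that every row has already been reduced to a 64-bit hash of its raw bytes (hashing stands in for the literal sort-and-merge, and all the names are mine). It returns the largest n such that the last n rows of A equal the first n rows of B:

#include <algorithm>
#include <cstdint>
#include <vector>

int overlapRows(const std::vector<uint64_t>& rowsA,
                const std::vector<uint64_t>& rowsB)
{
    int maxN = static_cast<int>(std::min(rowsA.size(), rowsB.size()));
    for (int n = maxN; n > 0; --n)   // try the largest overlap first
    {
        bool match = true;
        for (int k = 0; k < n && match; ++k)
            match = rowsA[rowsA.size() - n + k] == rowsB[k];
        if (match)
            return n;                // suffix of A == prefix of B
    }
    return 0;
}

For the banner-ad case, you would run the same check per vertical strip and accept a row when enough strips agree.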
The FFT solution might be more complex than you were hoping for. For a general problem, that might be the only robust way.
For a simple solution, you need to start making assumptions. For example, can you guarantee that the columns of the images line up (barring the noted changes)? This allows you to go down the path suggested by @n.m.
Can you cut the image into vertical strips, and consider a row matches if a sufficient proportion of the strips match?
[This could be redone to use a few passes with different column offsets if we need to be robust to misaligned columns.]
This gives something like:
#include <algorithm>
#include <map>
#include <unordered_map>

class Image
{
public:
    virtual ~Image() {}
    typedef int Pixel;
    virtual Pixel* getRow(int rowId) const = 0;
    virtual int getWidth() const = 0;
    virtual int getHeight() const = 0;
};

class Analyser
{
public:
    Analyser(const Image& a, const Image& b)
    : a_(a), b_(b) {}

    int numberOfOverlappingRows()
    {
        int commonWidth = std::min(a_.getWidth(), b_.getWidth());
        int stripWidth = commonWidth / numStrips;

        StripTable aHash;
        createStripTable(aHash, a_, stripWidth);
        StripTable bHash;
        createStripTable(bHash, b_, stripWidth);

        // The row in B at which the bottom row of A appears.
        int bottomOfA = 0;
        bool foundBottomOfAInB =
            canFindLine(a_.getRow(a_.getHeight() - 1), bHash, stripWidth, bottomOfA);

        // The row in A at which the top row of B appears.
        int topOfB = 0;
        bool foundTopOfBInA =
            canFindLine(b_.getRow(0), aHash, stripWidth, topOfB);

        if (!foundBottomOfAInB || !foundTopOfBInA)
            return 0;

        // Sanity check: both estimates of the overlap should agree,
        // i.e. expect a_.getHeight() - topOfB == bottomOfA + 1.
        return bottomOfA + 1;   // +1 because row indices are zero-based
    }

private:
    static const int numStrips = 16;

    struct StripId
    {
        StripId(int r = 0, int c = 0)
        : row_(r), strip_(c)
        {}
        int row_;
        int strip_;
    };

    typedef std::unordered_map<unsigned, StripId> StripTable;

    bool canFindLine(Image::Pixel* source, StripTable& target,
                     int stripWidth, int& matchingRow)
    {
        Image::Pixel* strip = source;
        std::map<int, int> matchedRows; // row -> number of strips matching it
        for (int index = 0; index < numStrips; ++index)
        {
            unsigned hashValue = getHashOfStrip(strip, stripWidth);
            if (target.count(hashValue) > 0)
            {
                ++matchedRows[target[hashValue].row_];
            }
            strip += stripWidth;
        }
        // Could set a threshold here requiring more than one matching strip.
        if (matchedRows.empty())
            return false;

        // Return the row that the most strips agreed on.
        matchingRow = std::max_element(matchedRows.begin(), matchedRows.end(),
            [](const std::pair<const int, int>& x, const std::pair<const int, int>& y)
            { return x.second < y.second; })->first;
        return true;
    }

    Image::Pixel* getStrip(const Image& im, int row, int stripId, int stripWidth)
    {
        return im.getRow(row) + stripId * stripWidth;
    }

    static unsigned getHashOfStrip(Image::Pixel* strip, unsigned width)
    {
        // Placeholder hash - OR-ing the pixels together is cheap but weak;
        // a real implementation would want a proper hash function.
        unsigned hashValue = 0;
        for (unsigned col = 0; col < width; ++col)
        {
            hashValue |= *(strip + col);
        }
        return hashValue;
    }

    void createStripTable(StripTable& hash, const Image& image, int stripWidth)
    {
        for (int row = 0; row < image.getHeight(); ++row)
        {
            for (int index = 0; index < numStrips; ++index)
            {
                // Warning: not this simple!
                // If the images came through a lossy intermediate, the pixels
                // will not be _exactly_ the same, so some kind of fuzzy
                // equality is needed here. The details depend on the image
                // format etc., but this is the gist.
                Image::Pixel* strip = getStrip(image, row, index, stripWidth);
                unsigned hashValue = getHashOfStrip(strip, stripWidth);
                hash[hashValue] = StripId(row, index);
            }
        }
    }

    const Image& a_;
    const Image& b_;
};
Something like this will probably help:
First, traverse image A from the bottom upwards, searching for rows with significant information in them. "Information" can be measured, for example, by summing the total colour shift across the row: say two adjacent pixels have colours #ffffff and #ff0000 - add 2.0 to the total. Have a series of thresholds ready, and lock onto the first row that reaches each threshold; the series can be something like "10.0, 0.1 * row length, 0.15 * row length, ..." up to a reasonable limit.

Then traverse that array from the topmost discovered row downwards, take the corresponding row, and search for its match in B from the top down. If one is found and the threshold is big enough, take the next row in the array, calculate the position of its match, and compare. If that succeeds, you have locked onto the correct offset of B over A, and it equals height_of_A - first_row_index + first_row_match_index. If it fails, continue with the next row.

If all matches fail, search for the very last row of A starting from the very first row of B, up to the offset of the first row found from the bottom of A. If that also fails, the answer is 0. Of course, if you are using JPEG images, use a threshold match rather than exact equality, as pixels might not be identical in A and B - perhaps with a tolerance for unmatched pixels as well.
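To make the "information" measure concrete, here is a minimal sketch assuming 8-bit RGB rows, with per-channel changes normalised so that a full-scale change counts as 1.0 (so #ffffff next to #ff0000 adds 2.0, as in the example). The struct and function names are hypothetical:

#include <cstdint>
#include <cstdlib>
#include <vector>

struct Rgb { uint8_t r, g, b; };   // assumed 8-bit RGB pixel

// Total colour shift across one row: the "information" measure above.
double rowInformation(const std::vector<Rgb>& row)
{
    double total = 0.0;
    for (size_t i = 1; i < row.size(); ++i)
    {
        total += std::abs(int(row[i].r) - int(row[i - 1].r)) / 255.0;
        total += std::abs(int(row[i].g) - int(row[i - 1].g)) / 255.0;
        total += std::abs(int(row[i].b) - int(row[i - 1].b)) / 255.0;
    }
    return total;
}

A row would then count as "significant" once rowInformation crosses one of the thresholds in the series.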