Let's say I have 2 images, A and B, as below. Notice that the bottom of A overlaps with the top of B, and I want to find n, the number of rows by which they overlap.
If I understood correctly, you probably want to look at normalized cross correlation of greyscale versions of the two images. Where you have large images, or large overlapping regions, this is done most efficiently in the frequency domain using the FFTs of the images (or overlap areas) and is called phase correlation.
The basic steps I would take in your situation are as follows:

1. Crop the bottom region of image A and a same-sized region from the top of image B (the candidate overlap areas).
2. Convert both regions to greyscale.
3. Compute the 2D FFT of each region.
4. Multiply the FFT of one region by the complex conjugate of the FFT of the other, and normalise the product (this gives the cross-power spectrum).
5. Take the inverse FFT of the cross-power spectrum.
6. Find the location of the peak in the result - its coordinates give the relative offset between the two regions.
Having found the relative offset between the top and bottom image patches, you can easily calculate n as you required.
If you want to experiment without having to code the above from scratch, OpenCV has a number of functions for template matching which you can easily try - see the OpenCV documentation on template matching for details.
If part of either image has been changed - e.g. by a banner ad - the above procedure still gives the best match, and the magnitude of the peak you find in step 6 gives an indication of the match "confidence" - so you can get a rough idea of how similar the two areas are.
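To see roughly what this looks like in code, OpenCV's C++ API also exposes phase correlation directly as cv::phaseCorrelate. Here is a minimal sketch along those lines; the file names and the 256-row probe height are assumptions for illustration, and the sign convention of the returned shift should be checked against your own data:

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Assumed file names, as in the ImageMagick answer below.
    cv::Mat a = cv::imread("a.png", cv::IMREAD_GRAYSCALE);
    cv::Mat b = cv::imread("b.png", cv::IMREAD_GRAYSCALE);

    // Compare the bottom strip of A with the top strip of B.
    int probe = 256; // assumed probe height
    cv::Mat bottomOfA = a.rowRange(a.rows - probe, a.rows);
    cv::Mat topOfB    = b.rowRange(0, probe);

    // phaseCorrelate requires floating-point single-channel input.
    cv::Mat fa, fb;
    bottomOfA.convertTo(fa, CV_32F);
    topOfB.convertTo(fb, CV_32F);

    // response is the peak magnitude - the match "confidence" mentioned above.
    double response = 0.0;
    cv::Point2d shift = cv::phaseCorrelate(fa, fb, cv::noArray(), &response);

    // If the strips align at vertical shift dy, the overlap n follows from
    // n = probe - dy (check the sign against your images).
    std::cout << "shift: " << shift << "  response: " << response << std::endl;
    return 0;
}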
I had a little play at doing this with ImageMagick. Here is the animation of what I did, and the explanation and code follow.
First I grabbed a couple of StackOverflow pages using webkit2png, calling them a.png and b.png.
Then I cropped a rectangle out of the top-left of b.png, and a column the same width, but the full height, out of a.png, giving me the two crops to compare.
I now overlay the smaller rectangle from the second page onto the bottom of the strip from the first page and slide it upwards, calculating the difference between the two images at each position by subtracting one from the other. When the difference is zero, the pictures must be the same and the output image will be black, so that is the point at which they overlap.
Here is the code:
#!/bin/bash
# Grab page 2 as "A" and page 3 as "B"
# webkit2png -F -o A "http://stackoverflow.com/questions?page=2&sort=newest"
# webkit2png -F -o B "http://stackoverflow.com/questions?page=3&sort=newest"

BLOBH=256   # blob height
BLOBW=256   # blob width

# Crop a column 256 pixels wide, full height, out of a.png that doesn't
# contain adverts or junk, into x.png
convert a.png -crop ${BLOBW}x+0+0 x.png

# Crop a rectangle 256x256 pixels out of the top-left corner of b.png, into y.png
convert b.png -crop ${BLOBW}x${BLOBH}+0+0 y.png

# Get height of x.png - this must happen after x.png has been created
XHEIGHT=$(identify -format "%h" x.png)

# Now slide y.png up across x.png, starting at the bottom of x.png,
# differencing the two images as we go, and stop when the difference is
# nothing, i.e. they are the same and the difference image is black
lines=0
while :; do
   OFFSET=$((XHEIGHT-BLOBH-lines))   # start flush with the bottom of x.png
   if [ $OFFSET -lt 0 ]; then
      echo "No overlap found" >&2
      exit 1
   fi
   FN=$(printf "out-%04d.png" $lines)
   diff=$(convert x.png -crop ${BLOBW}x${BLOBH}+0+${OFFSET} +repage \
       y.png \
       -fuzz 5% -compose difference -composite +write $FN \
       \( +clone -evaluate set 0 \) -metric AE -compare -format "%[distortion]" info:)
   echo $diff:$lines
   if [ "$diff" = "0" ]; then break; fi   # black difference image => match found
   ((lines++))
done
n=$((BLOBH+lines))
echo "Overlap is $n rows"
If rows match exactly, then sort the rows in both images and merge; your duplicates are right there. Then go back to the original images and find the longest contiguous streak of duplicates in A such that the corresponding rows in B are also contiguous. Or just look near the bottom of A and the top of B, where the overlap must be.
If there are banner ads, the first thing that comes to mind is breaking the images into several vertical strips and doing that with each pair of strips separately.
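As a concrete sketch of the look-near-the-ends variant, here is a minimal C++ version under the assumption that every row has already been reduced to a 64-bit hash of its raw bytes (hashing stands in for the literal sort-and-merge, and all the names are mine). It returns the largest n such that the last n rows of A equal the first n rows of B:

#include <algorithm>
#include <cstdint>
#include <vector>

int overlapRows(const std::vector<uint64_t>& rowsA,
                const std::vector<uint64_t>& rowsB)
{
    int maxN = static_cast<int>(std::min(rowsA.size(), rowsB.size()));
    for (int n = maxN; n > 0; --n)   // try the largest overlap first
    {
        bool match = true;
        for (int k = 0; k < n && match; ++k)
            match = rowsA[rowsA.size() - n + k] == rowsB[k];
        if (match)
            return n;                // suffix of A == prefix of B
    }
    return 0;
}

For the banner-ad case, you would run the same check per vertical strip and accept a row when enough strips agree.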
The FFT solution might be more complex than you were hoping for. For a general problem, that might be the only robust way.
For a simple solution, you need to start making assumptions. For example, can you guarantee that the columns of the images line up (barring the noted changes)? This allows you to go down the path suggested by @n.m.
Can you cut the image into vertical strips, and consider a row matches if a sufficient proportion of the strips match?
[This could be redone to use a few passes with different column offsets if we need to be robust to misaligned columns.]
This gives something like:
#include <algorithm>
#include <map>
#include <unordered_map>

class Image
{
public:
    virtual ~Image() {}
    typedef int Pixel;
    virtual Pixel* getRow(int rowId) const = 0;
    virtual int getWidth() const = 0;
    virtual int getHeight() const = 0;
};

class Analyser
{
public:
    Analyser(const Image& a, const Image& b)
    : a_(a), b_(b) {}

    int numberOfOverlappingRows()
    {
        int commonWidth = std::min(a_.getWidth(), b_.getWidth());
        int stripWidth = commonWidth / numStrips;

        StripTable aHash;
        createStripTable(aHash, a_, stripWidth);
        StripTable bHash;
        createStripTable(bHash, b_, stripWidth);

        // The row in B at which the bottom row of A appears.
        int bottomOfA = 0;
        bool foundBottomOfAInB =
            canFindLine(a_.getRow(a_.getHeight() - 1), bHash, stripWidth, bottomOfA);

        // The row in A at which the top row of B appears.
        int topOfB = 0;
        bool foundTopOfBInA =
            canFindLine(b_.getRow(0), aHash, stripWidth, topOfB);

        if (!foundBottomOfAInB || !foundTopOfBInA)
            return 0;

        // Sanity check: both estimates of the overlap should agree,
        // i.e. expect a_.getHeight() - topOfB == bottomOfA + 1.
        return bottomOfA + 1;   // +1 because row indices are zero-based
    }

private:
    static const int numStrips = 16;

    struct StripId
    {
        StripId(int r = 0, int c = 0)
        : row_(r), strip_(c)
        {}
        int row_;
        int strip_;
    };

    typedef std::unordered_map<unsigned, StripId> StripTable;

    bool canFindLine(Image::Pixel* source, StripTable& target,
                     int stripWidth, int& matchingRow)
    {
        Image::Pixel* strip = source;
        std::map<int, int> matchedRows; // row -> number of strips matching it
        for (int index = 0; index < numStrips; ++index)
        {
            unsigned hashValue = getHashOfStrip(strip, stripWidth);
            if (target.count(hashValue) > 0)
            {
                ++matchedRows[target[hashValue].row_];
            }
            strip += stripWidth;
        }
        // Could set a threshold here requiring more than one matching strip.
        if (matchedRows.empty())
            return false;

        // Return the row that the most strips agreed on.
        matchingRow = std::max_element(matchedRows.begin(), matchedRows.end(),
            [](const std::pair<const int, int>& x, const std::pair<const int, int>& y)
            { return x.second < y.second; })->first;
        return true;
    }

    Image::Pixel* getStrip(const Image& im, int row, int stripId, int stripWidth)
    {
        return im.getRow(row) + stripId * stripWidth;
    }

    static unsigned getHashOfStrip(Image::Pixel* strip, unsigned width)
    {
        // Placeholder hash - OR-ing the pixels together is cheap but weak;
        // a real implementation would want a proper hash function.
        unsigned hashValue = 0;
        for (unsigned col = 0; col < width; ++col)
        {
            hashValue |= *(strip + col);
        }
        return hashValue;
    }

    void createStripTable(StripTable& hash, const Image& image, int stripWidth)
    {
        for (int row = 0; row < image.getHeight(); ++row)
        {
            for (int index = 0; index < numStrips; ++index)
            {
                // Warning: not this simple!
                // If the images came through a lossy intermediate, the pixels
                // will not be _exactly_ the same, so some kind of fuzzy
                // equality is needed here. The details depend on the image
                // format etc., but this is the gist.
                Image::Pixel* strip = getStrip(image, row, index, stripWidth);
                unsigned hashValue = getHashOfStrip(strip, stripWidth);
                hash[hashValue] = StripId(row, index);
            }
        }
    }

    const Image& a_;
    const Image& b_;
};
Something like this will probably help:
First, traverse image A from the bottom upwards, searching for rows with significant information in them. "Information" can be measured, for example, by summing the total colour shift across the row: say two adjacent pixels have colours #ffffff and #ff0000 - add 2.0 to the total. Have a series of thresholds ready, and lock onto the first row that reaches each threshold; the series can be something like "10.0, 0.1 * row length, 0.15 * row length, ..." up to a reasonable limit.

Then traverse that array from the topmost discovered row downwards, take the corresponding row, and search for its match in B from the top down. If one is found and the threshold is big enough, take the next row in the array, calculate the position of its match, and compare. If that succeeds, you have locked onto the correct offset of B over A, and it equals height_of_A - first_row_index + first_row_match_index. If it fails, continue with the next row.

If all matches fail, search for the very last row of A starting from the very first row of B, up to the offset of the first row found from the bottom of A. If that also fails, the answer is 0. Of course, if you are using JPEG images, use a threshold match rather than exact equality, as pixels might not be identical in A and B - perhaps with a tolerance for unmatched pixels as well.
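To make the "information" measure concrete, here is a minimal sketch assuming 8-bit RGB rows, with per-channel changes normalised so that a full-scale change counts as 1.0 (so #ffffff next to #ff0000 adds 2.0, as in the example). The struct and function names are hypothetical:

#include <cstdint>
#include <cstdlib>
#include <vector>

struct Rgb { uint8_t r, g, b; };   // assumed 8-bit RGB pixel

// Total colour shift across one row: the "information" measure above.
double rowInformation(const std::vector<Rgb>& row)
{
    double total = 0.0;
    for (size_t i = 1; i < row.size(); ++i)
    {
        total += std::abs(int(row[i].r) - int(row[i - 1].r)) / 255.0;
        total += std::abs(int(row[i].g) - int(row[i - 1].g)) / 255.0;
        total += std::abs(int(row[i].b) - int(row[i - 1].b)) / 255.0;
    }
    return total;
}

A row would then count as "significant" once rowInformation crosses one of the thresholds in the series.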