Matching image to images collection

鱼传尺愫 2020-11-30 15:27

I have a large collection of card images, and one photo of a particular card. What tools can I use to find which image in the collection is most similar to mine?

Here's coll…

5 Answers
  • 2020-11-30 15:53

    If I understand you correctly, you need to compare them as pictures. There is one very simple but effective solution here: it's called Sikuli.

    What tools can I use to find which image in the collection is most similar to mine?

    This tool handles image processing very well: it can not only tell whether your card (image) is similar to a pattern you have already defined, but it can also search for partial image content (so-called rectangles).

    Its functionality can be extended via Python. Any image pattern can be given a similarity threshold (expressed as a percentage), so you can tune how precisely it has to match what you are looking for. A minimal sketch is shown below.
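
    As a rough illustration only (my own sketch, not from the original answer), a SikuliX (Jython) script can search for one image inside another using a Finder and a Pattern similarity threshold; the file names and the 0.7 threshold are placeholders:

    # SikuliX (Jython) sketch: does pattern.png appear inside base.png at >= 70% similarity?
    from sikuli import *                      # needed when run outside the SikuliX IDE

    f = Finder("base.png")                    # the image to search in (placeholder name)
    f.find(Pattern("pattern.png").similar(0.7))
    if f.hasNext():
        print("match, score = %s" % f.next().getScore())
    else:
        print("no match at this similarity level")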

    Another big advantage of this tool is that you can learn the basics in a day.

    Hope this helps.

  • 2020-11-30 16:00

    I tried this by arranging the image data as a vector and taking the inner product between each collection image vector and the searched image vector. The most similar vectors give the highest inner product. I resize all the images to the same size to get equal-length vectors, so I can take inner products. Resizing also reduces the computational cost of the inner product and gives a coarse approximation of the actual image.

    You can quickly check this with MATLAB or Octave. Below is the MATLAB/Octave script; I've added comments there. I tried varying the variable mult from 1 to 8 (you can try any integer value), and in all those cases the image Demystify gave the highest inner product with the card image. For mult = 8, I get the following ip vector in MATLAB:

    ip =

       683007892
       558305537
       604013365

    As you can see, it gives the highest inner product, 683007892, for the image Demystify.

    % load images
    imCardPhoto = imread('0.png');
    imDemystify = imread('1.jpg');
    imAggressiveUrge = imread('2.jpg');
    imAbundance = imread('3.jpg');
    
    % you can experiment with the size by varying mult
    % (named sz so we do not shadow the built-in size function)
    mult = 8;
    sz = [17 12]*mult;
    
    % resize all images to the same small size
    smallCardPhoto = imresize(imCardPhoto, sz);
    smallDemystify = imresize(imDemystify, sz);
    smallAggressiveUrge = imresize(imAggressiveUrge, sz);
    smallAbundance = imresize(imAbundance, sz);
    
    % image collection: each image is vectorized. if we have n images, this
    % will be a (size_rows*size_columns*channels) x n matrix
    collection = [double(smallDemystify(:)) ...
        double(smallAggressiveUrge(:)) ...
        double(smallAbundance(:))];
    
    % vectorize searched image. this will be a (size_rows*size_columns*channels) x 1
    % vector
    x = double(smallCardPhoto(:));
    
    % take the inner product of x and each image vector in collection. this
    % will result in a n x 1 vector. the higher the inner product is, more similar the
    % image and searched image(that is x)
    ip = collection' * x;
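
    For readers without MATLAB or Octave, here is a roughly equivalent sketch in Python using NumPy and Pillow (my own illustration; the file names are the same placeholders as above, and both libraries are assumed to be installed):

    # inner-product similarity sketch (Python / NumPy / Pillow)
    import numpy as np
    from PIL import Image

    def load_vec(path, size=(96, 136)):
        """Load an image, resize it to a common size and return it as a float vector."""
        img = Image.open(path).convert("RGB").resize(size)
        return np.asarray(img, dtype=np.float64).ravel()

    x = load_vec("0.png")                      # the card photo
    refs = ["1.jpg", "2.jpg", "3.jpg"]         # Demystify, Aggressive Urge, Abundance
    ip = [float(np.dot(load_vec(r), x)) for r in refs]
    print(ip)                                  # the largest value marks the closest image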
    

    EDIT

    I tried another approach: taking the Euclidean distance (L2 norm) between each reference image and the card image. Using a large collection of reference images (383 images) that I found at this link, it gave me very good results for your test card image.

    Here, instead of using the whole card, I extracted the upper part that contains the artwork and used it for the comparison.

    In the following steps, all training images and the test image are resized to a predefined size before doing any processing.

    • extract the artwork regions from the training images
    • perform morphological closing on these images to get a coarse approximation (this step may not be necessary)
    • vectorize these images and store them in a training set (I call it a training set even though there's no training in this approach)
    • load the test card image, extract its image region-of-interest (ROI), apply closing, then vectorize
    • calculate the Euclidean distance between each reference image vector and the test image vector
    • choose the minimum-distance item (or the first k items)

    I did this in C++ using OpenCV. I'm also including some test results using different scales.

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <algorithm>
    #include <string>
    #include <windows.h>
    
    using namespace cv;
    using namespace std;
    
    #define INPUT_FOLDER_PATH       string("Your test image folder path")
    #define TRAIN_IMG_FOLDER_PATH   string("Your training image folder path")
    
    void search()
    {
        WIN32_FIND_DATA ffd;
        HANDLE hFind = INVALID_HANDLE_VALUE;
    
        vector<Mat> images;
        vector<string> labelNames;
        int label = 0;
        double scale = .2;  // you can experiment with scale
        Size imgSize(200*scale, 285*scale); // training sample images are all 200 x 285 (width x height)
        Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
    
        // get all training samples in the directory
        hFind = FindFirstFile((TRAIN_IMG_FOLDER_PATH + string("*")).c_str(), &ffd);
        if (INVALID_HANDLE_VALUE == hFind) 
        {
            cout << "INVALID_HANDLE_VALUE: " << GetLastError() << endl;
            return;
        } 
        do
        {
            if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
            {
                Mat im = imread(TRAIN_IMG_FOLDER_PATH+string(ffd.cFileName));
                Mat re;
                resize(im, re, imgSize, 0, 0);  // resize the image
    
                // extract only the upper part that contains the image
                Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
                // get a coarse approximation
                morphologyEx(roi, roi, MORPH_CLOSE, kernel);
    
                images.push_back(roi.reshape(1)); // vectorize the roi
                labelNames.push_back(string(ffd.cFileName));
            }
    
        }
    while (FindNextFile(hFind, &ffd) != 0);
    FindClose(hFind);   // release the directory search handle

        // load the test image, apply the same preprocessing done for training images
        Mat test = imread(INPUT_FOLDER_PATH+string("0.png"));
        Mat re;
        resize(test, re, imgSize, 0, 0);
        Mat roi = re(Rect(re.cols*.1, re.rows*35/285.0, re.cols*.8, re.rows*125/285.0));
        morphologyEx(roi, roi, MORPH_CLOSE, kernel);
        Mat testre = roi.reshape(1);
    
        struct imgnorm2_t
        {
            string name;
            double norm2;
        };
        vector<imgnorm2_t> imgnorm;
        for (size_t i = 0; i < images.size(); i++)
        {
            imgnorm2_t data = {labelNames[i], 
                norm(images[i], testre) /* take the l2-norm (euclidean distance) */};
            imgnorm.push_back(data); // store data
        }
    
        // sort stored data based on euclidean-distance in the ascending order
        sort(imgnorm.begin(), imgnorm.end(), 
            [] (imgnorm2_t& first, imgnorm2_t& second) { return (first.norm2 < second.norm2); });
        for (size_t i = 0; i < imgnorm.size(); i++)
        {
            cout << imgnorm[i].name << " : " << imgnorm[i].norm2 << endl;
        }
    }
    

    Results:

    scale = 1.0;

    demystify.jpg : 10989.6, sylvan_basilisk.jpg : 11990.7, scathe_zombies.jpg : 12307.6

    scale = .8;

    demystify.jpg : 8572.84, sylvan_basilisk.jpg : 9440.18, steel_golem.jpg : 9445.36

    scale = .6;

    demystify.jpg : 6226.6, steel_golem.jpg : 6887.96, sylvan_basilisk.jpg : 7013.05

    scale = .4;

    demystify.jpg : 4185.68, steel_golem.jpg : 4544.64, sylvan_basilisk.jpg : 4699.67

    scale = .2;

    demystify.jpg : 1903.05, steel_golem.jpg : 2154.64, sylvan_basilisk.jpg : 2277.42
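
    If you prefer something cross-platform, here is a rough Python/OpenCV sketch of the same pipeline (my own illustration, not the code above; the training folder path and file names are placeholders, and the opencv-python package is assumed):

    # L2-distance card matching sketch (Python / OpenCV)
    import glob, os
    import cv2

    SCALE = 0.2
    SIZE = (int(200 * SCALE), int(285 * SCALE))   # (width, height) used for every card
    KERNEL = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))

    def preprocess(path):
        """Resize, crop the artwork region and apply morphological closing."""
        im = cv2.resize(cv2.imread(path), SIZE)
        h, w = im.shape[:2]
        roi = im[int(h * 35 / 285.0):int(h * 160 / 285.0), int(w * 0.1):int(w * 0.9)]
        return cv2.morphologyEx(roi, cv2.MORPH_CLOSE, KERNEL)

    test = preprocess("0.png")                    # the card photo
    dists = sorted((cv2.norm(preprocess(p), test, cv2.NORM_L2), os.path.basename(p))
                   for p in glob.glob("train/*.jpg"))
    for d, name in dists[:3]:                     # three closest reference images
        print(name, ":", d)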

  • 2020-11-30 16:01

    Thank you for posting some photos.

    I have coded an algorithm called Perceptual Hashing, which I found described by Dr Neal Krawetz. Comparing your images with the card, I get the following percentage measures of similarity:

    Card vs. Abundance 79%
    Card vs. Aggressive 83%
    Card vs. Demystify 85%
    

    So it is not an ideal discriminator for your image type, but it works somewhat. You may wish to play around with it to tailor it to your use case.

    I would calculate a hash for each of the images in your collection, one at a time, storing each hash just once. Then, when you get a new card, calculate its hash and compare it to the stored ones.

    #!/bin/bash
    ################################################################################
    # Similarity
    # Mark Setchell
    #
    # Calculate percentage similarity of two images using Perceptual Hashing
    # See article by Dr Neal Krawetz entitled "Looks Like It" - www.hackerfactor.com
    #
    # Method:
    # 1) Resize image to an 8x8 pixel greyscale square, regardless of original size or shape
    # 2) Calculate mean brightness of those 64 pixels
    # 3) For each pixel, store "1" if pixel > mean, else store "0"
    # 4) Convert resulting 64-bit string of 1's and 0's into a 16 hex digit "Perceptual Hash"
    #
    # To find the difference between two Perceptual Hashes, simply total up the number of bits
    # that differ between the two strings - this is the Hamming distance.
    #
    # Requires ImageMagick - www.imagemagick.org
    #
    # Usage:
    #
    # Similarity image|imageHash [image|imageHash]
    # If you pass one image filename, it will tell you the Perceptual hash as a 16
    # character hex string that you may want to store in an alternate stream or as
    # an attribute or tag in filesystems that support such things. Do this in order
    # to just calculate the hash once for each image.
    #
    # If you pass in two images, or two hashes, or an image and a hash, it will try
    # to compare them and give a percentage similarity between them.
    ################################################################################
    function PerceptualHash(){
    
       TEMP="tmp$$.png"
    
       # Force image to 8x8 pixels and greyscale
       convert "$1" -colorspace gray -quality 80 -resize 8x8! PNG8:"$TEMP"
    
       # Calculate mean brightness and correct to range 0..255
       MEAN=$(convert "$TEMP" -format "%[fx:int(mean*255)]" info:)
    
       # Now extract all 64 pixels and build string containing "1" where pixel > mean else "0"
       hash=""
       for i in {0..7}; do
          for j in {0..7}; do
             pixel=$(convert "${TEMP}"[1x1+${i}+${j}] -colorspace gray text: | grep -Eo "\([0-9]+," | tr -d '(,' )
             bit="0"
             [ $pixel -gt $MEAN ] && bit="1"
             hash="$hash$bit"
          done
       done
       hex=$(echo "obase=16;ibase=2;$hash" | bc)
       # left-pad to 16 hex digits - printf pads strings with spaces, so convert them to zeros
       printf "%016s\n" $hex | tr ' ' '0'
       #rm "$TEMP" > /dev/null 2>&1
    }
    
    function HammingDistance(){
       # Convert input hex strings to upper case like bc requires
       STR1=$(tr '[a-z]' '[A-Z]' <<< $1)
       STR2=$(tr '[a-z]' '[A-Z]' <<< $2)
    
       # Convert hex to binary and zero left pad to 64 binary digits
       STR1=$(printf "%064s" $(echo "obase=2;ibase=16;$STR1" | bc) | tr ' ' '0')
       STR2=$(printf "%064s" $(echo "obase=2;ibase=16;$STR2" | bc) | tr ' ' '0')
    
       # Calculate Hamming distance between two strings, each differing bit adds 1
       hamming=0
       for i in {0..63};do
          a=${STR1:i:1}
          b=${STR2:i:1}
          [ "$a" != "$b" ] && ((hamming++))
       done
    
       # Hamming distance is in range 0..64 and small means more similar
       # We want percentage similarity, so we do a little maths
       similarity=$((100-(hamming*100/64)))
       echo $similarity
    }
    
    function Usage(){
       echo "Usage: Similarity image|imageHash [image|imageHash]" >&2
       exit 1
    }
    
    ################################################################################
    # Main
    ################################################################################
    if [ $# -eq 1 ]; then
       # Expecting a single image file for which to generate hash
       if [ ! -f "$1" ]; then
          echo "ERROR: File $1 does not exist" >&2
          exit 1
       fi
       PerceptualHash "$1" 
       exit 0
    fi
    
    if [ $# -eq 2 ]; then
       # Expecting 2 things, i.e. 2 image files, 2 hashes or one of each
       if [ -f "$1" ]; then
          hash1=$(PerceptualHash "$1")
       else
          hash1=$1
       fi
       if [ -f "$2" ]; then
          hash2=$(PerceptualHash "$2")
       else
          hash2=$2
       fi
       HammingDistance $hash1 $hash2
       exit 0
    fi
    
    Usage
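
    If you would rather do this in Python, here is a compact sketch of the same average-hash idea using Pillow (my own illustration, not part of the script above; the file names are placeholders, and the ImageHash package on PyPI offers a ready-made implementation if you prefer):

    # average-hash and percentage similarity sketch (Python / Pillow)
    from PIL import Image

    def average_hash(path):
        """8x8 greyscale thumbnail -> 64-bit hash with a bit set where pixel > mean."""
        pixels = list(Image.open(path).convert("L").resize((8, 8)).getdata())
        mean = sum(pixels) / 64.0
        return sum(1 << i for i, p in enumerate(pixels) if p > mean)

    def similarity(h1, h2):
        """Percentage similarity derived from the Hamming distance of two 64-bit hashes."""
        hamming = bin(h1 ^ h2).count("1")
        return 100 - hamming * 100 // 64

    card = average_hash("card.png")
    for name in ["abundance.jpg", "aggressive.jpg", "demystify.jpg"]:
        print(name, similarity(card, average_hash(name)), "%")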
    
  • 2020-11-30 16:11

    New method!

    It seems that the following ImageMagick command, or maybe a variation of it (depending on what a greater selection of your images looks like), will extract the wording at the top of your cards:

    convert aggressiveurge.jpg -crop 80%x10%+10%+10% crop.png
    

    which takes the top 10% of your image and 80% of the width (starting 10% in from the top-left corner) and stores it in crop.png, as follows:

    [cropped title bar of the Aggressive Urge card]

    And if you run that through Tesseract OCR as follows:

    tesseract crop.png agg
    

    you get a file called agg.txt containing:

    E‘ Aggressive Urge \L® E
    

    which you can run through grep to clean up, looking only for upper and lower case letters adjacent to each other:

    grep -Eo "\<[A-Za-z]+\>" agg.txt
    

    to get

    Aggressive Urge
    

    :-)
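
    For completeness, here is a rough Python sketch of the same crop-and-OCR idea (my own illustration; it assumes Pillow and the pytesseract wrapper are installed along with a Tesseract binary, and the file name is a placeholder):

    # crop the title bar and OCR it (Python / Pillow / pytesseract)
    import re
    from PIL import Image
    import pytesseract

    img = Image.open("aggressiveurge.jpg")
    w, h = img.size
    # keep 80% of the width and 10% of the height, starting 10% in from the top-left corner
    title_bar = img.crop((int(0.1 * w), int(0.1 * h), int(0.9 * w), int(0.2 * h)))
    raw = pytesseract.image_to_string(title_bar)
    # keep only runs of letters, mirroring the grep clean-up step
    print(" ".join(re.findall(r"[A-Za-z]+", raw)))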

  • 2020-11-30 16:15

    I also tried a normalised cross-correlation of each of your images with the card, like this:

    #!/bin/bash
    size="300x400!"
    convert card.png -colorspace RGB -normalize -resize $size card.jpg
    for i in *.jpg
    do 
       cc=$(convert $i -colorspace RGB -normalize -resize $size JPG:- | \
       compare - card.jpg -metric NCC null: 2>&1)
       echo "$cc:$i"
    done | sort -n
    

    and I got this output (sorted by match quality):

    0.453999:abundance.jpg
    0.550696:aggressive.jpg
    0.629794:demystify.jpg
    

    which shows that the card correlates best with demystify.jpg.

    Note that I resized all images to the same size and normalised their contrast so that they could be readily compared and so that effects resulting from differences in contrast were minimised. Making them smaller also reduces the time needed for the correlation.
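
    As a rough cross-check (my own sketch, not the ImageMagick pipeline above), a similar normalised cross-correlation can be computed directly with NumPy and Pillow; the file names are placeholders:

    # normalised cross-correlation between the card and each reference image
    import numpy as np
    from PIL import Image

    def load(path, size=(300, 400)):
        """Resize to a common size and return a zero-mean, unit-norm vector."""
        v = np.asarray(Image.open(path).convert("RGB").resize(size), dtype=np.float64).ravel()
        v -= v.mean()
        return v / np.linalg.norm(v)

    card = load("card.png")
    for name in ["abundance.jpg", "aggressive.jpg", "demystify.jpg"]:
        print("%.6f:%s" % (float(np.dot(load(name), card)), name))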
