How to recognize UI elements in an image?

Posted by 雨燕双飞 on 2019-12-21 23:49:46

Question


I am trying to build an automation tool and am experimenting with a type of recording that takes screenshots and records user input. The idea is for the user to take a snapshot and highlight a rectangle on it around, say, the "submit" button. During playback, the program would take a screenshot of the open window and find the coordinates of the button by searching for the highlighted snippet. So I need an algorithm that searches an image for an exact (or very close) copy of the button image. The algorithms I've found so far compare the likeness of two whole images but cannot locate one image inside another, and object-recognition algorithms seem over the top considering that the "object" I'm trying to find will be a near-perfect match. Any ideas?


Answer 1:


What you need is an efficient feature extraction method. This will depend on what you're looking for, but let's assume you're looking for the Send button in this image:

One of the characteristic features of this button is that it includes a pair of parallel line segments at the top and bottom. The same applies to the two text input fields, but for the button, the two segments are exactly 17 pixels apart.

This is what you get if you take the per-pixel maximum of the source image and a copy of itself shifted vertically by 17 pixels:

The Send button now appears as a solid horizontal line. You can detect this quite easily by thresholding the image and looking for an unbroken sequence of black pixels. Just for reference, here's what I obtained after applying a 10px horizontal motion blur and thresholding at a grey level of 128:
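
As a rough illustration of the shift-and-maximum step, here is a minimal Python sketch using NumPy. It assumes a grayscale screenshot with dark lines on a light background, held in a 2D array. The function names `shifted_max` and `find_dark_runs` and the `min_length` parameter are made up for this example, and the motion blur mentioned above is left out.

```python
import numpy as np

def shifted_max(gray, offset):
    """Per-pixel maximum of the image and a copy of itself shifted by `offset` rows."""
    # Row y of the result combines rows y and y + offset of the source,
    # so it stays dark only where both rows are dark.
    return np.maximum(gray[:-offset, :], gray[offset:, :])

def find_dark_runs(gray, offset, min_length, threshold=128):
    """Return (row, start_col, length) for each unbroken horizontal run of dark pixels."""
    dark = shifted_max(gray, offset) < threshold
    runs = []
    for y, row in enumerate(dark):
        # Pad with False so that every run has a clear start and end edge.
        padded = np.concatenate(([False], row, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        # Edges alternate: run start (inclusive), run end (exclusive).
        for start, end in zip(edges[::2], edges[1::2]):
            if end - start >= min_length:
                runs.append((y, int(start), int(end - start)))
    return runs
```

Each returned row is the top edge of a pair of lines that are `offset` pixels apart, which makes it a candidate position for the button rather than a confirmed match.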

This process will identify candidate positions quite quickly. You can then subject these locations to stronger techniques like 2D convolution and OCR without too much loss of performance.
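
One possible way to run the stronger check on each candidate is a normalized cross-correlation against the stored button image. The following is only a sketch under assumptions: grayscale NumPy arrays, candidates given as the (row, column) of the button's top-left corner, and a hypothetical `min_score` cutoff of 0.9 that would need tuning. It is a simple stand-in for the 2D convolution step and does not cover OCR.

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized grayscale arrays."""
    a = patch.astype(np.float64) - patch.mean()
    b = template.astype(np.float64) - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    # A perfectly flat patch or template has no structure to correlate with.
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def verify_candidates(screen, template, candidates, min_score=0.9):
    """Keep the (y, x) candidates whose patch correlates strongly with the template."""
    h, w = template.shape
    hits = []
    for y, x in candidates:
        if y < 0 or x < 0:
            continue  # negative indices would wrap around in NumPy slicing
        patch = screen[y:y + h, x:x + w]
        if patch.shape != template.shape:
            continue  # candidate too close to the image border
        score = ncc(patch, template)
        if score >= min_score:
            hits.append((y, x, score))
    return hits
```

Because only a handful of candidate positions are checked, the cost of the correlation stays small compared with scanning the whole screenshot.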




Answer 2:


The following tools can help you with that:

  • Prefab: http://github.com/prefab
  • Sikuli: http://www.sikuli.org



Answer 3:


  1. Find a distinctive feature in the button image.

    For example, you can use the edge color neighboring the button face color, or its derivative, the shape, or the average color of a square sub-image (8x8 pixels ...).

  2. Search the snapshot for this feature.

    I would use the average color to start with: divide the image into N x N pixel areas and compute the average color of each. If you find a square whose average color is similar to your button's average colors, you have a probable location.

  3. After this, you can brute-force search the nearby area to check whether it contains your button.

    At this stage, do not compare colors directly (they can be distorted by anti-aliasing, filters ...). A better way is to compare derivatives within some tolerance. You can compute a coefficient of probable button presence:

    p(x,y)=count(matching pixels) / (button pixels)
    

    and if it is close enough to 1.0, then you have found your button.

PS. In stage 3 you can use grayscale images to simplify things.
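
The three steps above could be sketched in Python with NumPy roughly as follows. This is an illustration under assumptions, not a reference implementation: images are grayscale 2D arrays, the coarse feature is the button's overall average grey level, the "derivative" is a plain horizontal pixel difference, and all names and defaults (`block_means`, `match_ratio`, `find_button`, `n`, `mean_tol`, `tol`, `min_ratio`) are hypothetical.

```python
import numpy as np

def block_means(gray, n):
    """Average grey level of each non-overlapping n x n block."""
    h, w = gray.shape
    h, w = h - h % n, w - w % n  # drop incomplete blocks at the edges
    blocks = gray[:h, :w].astype(np.float64).reshape(h // n, n, w // n, n)
    return blocks.mean(axis=(1, 3))

def match_ratio(screen, button, y, x, tol=16):
    """p(x, y): fraction of horizontal differences that agree within +/- tol."""
    h, w = button.shape
    patch = screen[y:y + h, x:x + w].astype(np.int32)
    d_patch = np.diff(patch, axis=1)
    d_button = np.diff(button.astype(np.int32), axis=1)
    return float((np.abs(d_patch - d_button) <= tol).mean())

def find_button(screen, button, n=8, mean_tol=20.0, min_ratio=0.95):
    """Return (y, x, p) for the best match, or None if nothing reaches min_ratio."""
    h, w = button.shape
    max_y, max_x = screen.shape[0] - h, screen.shape[1] - w
    # Stage 2: blocks whose average is close to the button's average.
    close = np.abs(block_means(screen, n) - button.mean()) <= mean_tol
    best = None
    for by, bx in zip(*np.nonzero(close)):
        top, left = int(by) * n, int(bx) * n
        # Stage 3: brute-force every offset at which the button overlaps the block.
        for y in range(max(0, top - h + 1), min(max_y, top + n - 1) + 1):
            for x in range(max(0, left - w + 1), min(max_x, left + n - 1) + 1):
                p = match_ratio(screen, button, y, x)
                if p >= min_ratio and (best is None or p > best[2]):
                    best = (y, x, p)
    return best
```

Note that a mostly flat button also has mostly zero differences, so it will score fairly high against any flat background; that is why the cutoff should stay close to 1.0. Neighbouring candidate blocks also re-check overlapping offsets, which a real tool would want to cache.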



Source: https://stackoverflow.com/questions/19948702/how-to-recognize-ui-elements-in-image
