How do I get the coordinates of an image shown in OpenCV?

感情败类 2020-12-18 12:56

Sorry, but the title doesn't really make sense.

I am trying to make an AI that clicks on the ball to make it bounce. For context, here's a picture of the application.

2 Answers
  • 2020-12-18 13:14

    I was working on your other, related question when you deleted it, and I see you are having performance issues locating the ball. As your ball appears to be on a nice, simple white background (apart from the score and the close button at top right), there are easier and faster ways of finding it.

    First, work in greyscale so that you only have 1 channel, instead of 3 channels of RGB to process - that is generally faster.

    Then, overwrite the score and menu at top-right with white pixels so that the only thing left in the image is the ball. Now invert the image so that all the whites become black, then you can use findNonZero() to find anything that is not the background, i.e. the ball.

    Now find the lowest and highest coordinates in the y-direction and average them to get the y-coordinate of the ball's centre, then do likewise in the x-direction for the x-coordinate.

    #!/usr/bin/env python3

    import cv2

    # Load image - work in greyscale so there is 1 channel to process instead of 3
    im = cv2.imread('ball.png',cv2.IMREAD_GRAYSCALE)

    # Overwrite "Current Best" with white - these numbers will vary depending on what you capture
    im[134:400,447:714] = 255

    # Overwrite menu and "Close" button at top-right with white - these numbers will vary depending on what you capture
    im[3:107,1494:1726] = 255

    # Negate image so whites become black
    im=255-im

    # Find anything not black, i.e. the ball
    nz = cv2.findNonZero(im)

    # Find left, right, top and bottom edges of ball
    # (findNonZero returns points as x,y so index 0 is x and index 1 is y)
    a = nz[:,0,0].min()   # left
    b = nz[:,0,0].max()   # right
    c = nz[:,0,1].min()   # top
    d = nz[:,0,1].max()   # bottom
    print('a:{}, b:{}, c:{}, d:{}'.format(a,b,c,d))

    # Average left and right edges, top and bottom edges, to give centre as x,y
    c0 = (a+b)/2
    c1 = (c+d)/2
    print('Ball centre: {},{}'.format(c0,c1))
    

    That gives:

    a:442, b:688, c:1063, d:1304
    Ball centre: 565.0,1183.5
    

    which, drawn as a red box on the image, outlines the ball.

    The processing takes 845 microseconds on my Mac, i.e. less than a millisecond, which corresponds to 1,183 frames per second. Obviously you also need time to grab the screen, but I can't control that.

    Note that you could also resize the image down by a factor of say 4 (or maybe 8 or 16) in each direction and still be sure of finding the ball and that may make it even faster.

    Keywords: Ball, track, tracking, locating, finding, position of, image, image processing, python, OpenCV, numpy, bounding box, bbox.

  • 2020-12-18 13:30

    You can do it like this:

    1. Crop an image of the ball from a screenshot, something like:

    import cv2
    img = cv2.imread("screenshot.jpg")
    x, y, w, h = 0, 0, 100, 100  # placeholder values - you will have to find the real ones by trial and error
    crop_img = img[y:y+h, x:x+w]
    

    2. Use template matching to find where the ball is in your image.

    3. Take the point in the middle of the resulting rectangle and move your mouse there.

    I hope this helps. If you need more help with any of this, feel free to ask.
