Converting a CGImage to a Python image (PIL/OpenCV)

谁说胖子不能爱 submitted on 2019-12-25 02:51:28

Question


I want to do some pattern recognition on my screen and will use the Quartz/PyObjC libraries to take the screenshots.

I get the screenshot as a CGImage. I want to search for a pattern in it using the OpenCV library, but I can't find out how to convert the data into something OpenCV can read.

So what I want to do is this:

# get the screenshot and the reference pattern
img = getScreenshot()  # custom function using Quartz; returns a CGImage instance
reference = cv2.imread('ref/reference_start.png')  # the reference pattern

# search for the pattern using the OpenCV library
# (img is still a CGImage here; this is the step that needs the conversion)
result = cv2.matchTemplate(img, reference, cv2.TM_CCOEFF_NORMED)

# this is what I need
minVal, maxVal, minLoc, maxLoc = cv2.minMaxLoc(result)

I have no idea how to do this and can't find any information through Google.


Answer 1:


To add to Arqu's answer: if your ultimate goal is to work with OpenCV or numpy, it may be faster to use np.frombuffer directly instead of creating a PIL Image first. np.frombuffer takes about the same time as Image.frombuffer, but it saves the step of converting the Image to a numpy array, which takes about 100 ms on my machine (everything else takes about 50 ms).

import Quartz.CoreGraphics as CG
import time
import numpy as np

ct = time.time()
region = CG.CGRectInfinite

# Create screenshot as CGImage
image = CG.CGWindowListCreateImage(
    region,
    CG.kCGWindowListOptionOnScreenOnly,
    CG.kCGNullWindowID,
    CG.kCGWindowImageDefault)

width = CG.CGImageGetWidth(image)
height = CG.CGImageGetHeight(image)
bytesperrow = CG.CGImageGetBytesPerRow(image)

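# Copy the raw pixel bytes out of the CGImage
pixeldata = CG.CGDataProviderCopyData(CG.CGImageGetDataProvider(image))
image = np.frombuffer(pixeldata, dtype=np.uint8)
# Rows may be padded, so reshape by bytes per row (4 bytes per pixel)
image = image.reshape((height, bytesperrow//4, 4))
# Crop away the padding columns; the channels are in BGRA order
image = image[:,:width,:]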

print('elapsed:', time.time() - ct)
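
The reshape-and-crop step can be isolated into a small helper and checked against synthetic data, without needing a real screenshot. The following is a minimal sketch, not part of the original answer: `buffer_to_bgr` is a hypothetical name, and it assumes 4 bytes per pixel in BGRA order with each row padded out to `bytes_per_row` bytes.

```python
import numpy as np


def buffer_to_bgr(pixeldata, width, height, bytes_per_row):
    """Turn a raw BGRA pixel buffer into a (height, width, 3) BGR array.

    Hypothetical helper; assumes 4 bytes per pixel and rows padded
    to bytes_per_row bytes.
    """
    flat = np.frombuffer(pixeldata, dtype=np.uint8)
    # Each row holds bytes_per_row // 4 pixel slots; the trailing ones are padding.
    rows = flat[:height * bytes_per_row].reshape((height, bytes_per_row // 4, 4))
    # Drop the padding columns and the alpha channel, then copy into a
    # contiguous array so the result does not depend on the original buffer.
    return np.ascontiguousarray(rows[:, :width, :3])


# Synthetic 2x2 image whose rows are padded to 3 pixel slots (12 bytes).
row0 = bytes([1, 2, 3, 255, 4, 5, 6, 255, 0, 0, 0, 0])
row1 = bytes([7, 8, 9, 255, 10, 11, 12, 255, 0, 0, 0, 0])
bgr = buffer_to_bgr(row0 + row1, width=2, height=2, bytes_per_row=12)
print(bgr.shape)
```

With a real capture, the pixeldata, width, height and bytesperrow values from the snippet above would be passed in. The returned array is 3-channel uint8 in BGR order, the same layout cv2.imread produces by default, so it can be given to cv2.matchTemplate together with the reference image.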



Answer 2:


I've been playing around with this as well. I needed a bit more performance, so saving to a file and then reading it back in was too slow. After a lot of searching and fiddling around, I came up with this:

# get_pixels returns an image reference created by CG.CGWindowListCreateImage
imageRef = self.get_pixels()
pixeldata = CG.CGDataProviderCopyData(CG.CGImageGetDataProvider(imageRef))
# self.stride is the number of bytes per row, including any padding
image = Image.frombuffer("RGBA", (self.width, self.height), pixeldata, "raw", "RGBA", self.stride, 1)
# Color correction: the data is actually BGRA, so swap the channels to get RGBA
b, g, r, a = image.split()
image = Image.merge("RGBA", (r, g, b, a))

Also note that my image was not of a standard size (it had to be padded), which caused some weird behavior, so I had to set the stride of the buffer explicitly. If you are taking full-screen screenshots at standard screen widths, you can pass a stride of 0 and it will be calculated automatically.

You can now convert the PIL image to a numpy array, which is easier to work with in OpenCV:

image = np.array(image)
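
One thing to keep in mind: the array obtained this way is in RGBA order, while OpenCV works with BGR by default (that is what cv2.imread returns). Below is a minimal sketch of the reordering with plain numpy, using a synthetic array in place of the real screenshot. `rgba_to_bgr` is a hypothetical helper name, not something provided by PIL or OpenCV.

```python
import numpy as np


def rgba_to_bgr(rgba):
    """Reorder an (h, w, 4) RGBA array into a contiguous (h, w, 3) BGR array."""
    # Indexing with [2, 1, 0] picks B, G, R in that order and drops alpha.
    return np.ascontiguousarray(rgba[:, :, [2, 1, 0]])


# Stand-in for np.array(image): a single red, fully opaque pixel.
rgba = np.array([[[255, 0, 0, 255]]], dtype=np.uint8)
bgr = rgba_to_bgr(rgba)
print(bgr[0, 0])  # channel order is now blue, green, red
```

If OpenCV is already imported, cv2.cvtColor(image, cv2.COLOR_RGBA2BGR) performs the same conversion.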



Answer 3:


Here is code that takes a screenshot and saves it to a file. To read the file into PIL, just use the standard Image.open(path). This code is surprisingly fast if you keep the region small. For an 800x800 pixel region, each shot takes less than 50 ms on my i7. For the full resolution of a dual-monitor setup (2880x1800 + 2560x1440), each shot takes about 1.9 seconds.

Source: https://github.com/troq/flappy-bird-player/blob/master/screenshot.py

import Quartz
import LaunchServices
from Cocoa import NSURL
import Quartz.CoreGraphics as CG

def screenshot(path, region=None):
    """saves screenshot of given region to path
    :path: string path to save to
    :region: CGRect, e.g. CG.CGRectMake(x, y, width, height); None captures everything
    :returns: nothing
    """
    if region is None:
        region = CG.CGRectInfinite

    # Create screenshot as CGImage
    image = CG.CGWindowListCreateImage(
        region,
        CG.kCGWindowListOptionOnScreenOnly,
        CG.kCGNullWindowID,
        CG.kCGWindowImageDefault)

    dpi = 72 # FIXME: Should query this from somewhere, e.g. for Retina displays

    url = NSURL.fileURLWithPath_(path)

    dest = Quartz.CGImageDestinationCreateWithURL(
        url,
        LaunchServices.kUTTypePNG, # file type
        1, # 1 image in file
        None
        )

    properties = {
        Quartz.kCGImagePropertyDPIWidth: dpi,
        Quartz.kCGImagePropertyDPIHeight: dpi,
        }

    # Add the image to the destination, characterizing the image with
    # the properties dictionary.
    Quartz.CGImageDestinationAddImage(dest, image, properties)

    # When all the images (only 1 in this example) are added to the destination,
    # finalize the CGImageDestination object.
    Quartz.CGImageDestinationFinalize(dest)


if __name__ == '__main__':
    # Capture full screen
    screenshot("testscreenshot_full.png")

    # Capture region (100x100 box from top-left)
    region = CG.CGRectMake(0, 0, 100, 100)
    screenshot("testscreenshot_partial.png", region=region)



Answer 4:


Here's an enhanced version of Arqu's answer. PIL (at least Pillow) can load BGRA data directly, without the need to split and merge.

width = Quartz.CGImageGetWidth(cgimg)
height = Quartz.CGImageGetHeight(cgimg)
pixeldata = Quartz.CGDataProviderCopyData(Quartz.CGImageGetDataProvider(cgimg))
bpr = Quartz.CGImageGetBytesPerRow(cgimg)
# Convert to PIL Image.  Note: CGImage's pixeldata is BGRA
image = Image.frombuffer("RGBA", (width, height), pixeldata, "raw", "BGRA", bpr, 1)
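
The "BGRA" raw mode can be tried out on a hand-made buffer, without a real screenshot. This is a small sketch, assuming Pillow is installed; the buffer contents and the 12-byte stride are made up for illustration.

```python
from PIL import Image

# Two rows of two BGRA pixels each, padded to 12 bytes per row (made-up data).
row0 = bytes([255, 0, 0, 255, 0, 255, 0, 255, 0, 0, 0, 0])  # blue, green, padding
row1 = bytes([0, 0, 255, 255, 9, 8, 7, 128, 0, 0, 0, 0])    # red, (B=9, G=8, R=7, A=128), padding
pixeldata = row0 + row1

# Same call shape as above: raw mode "BGRA", stride of 12 bytes, orientation 1.
img = Image.frombuffer("RGBA", (2, 2), pixeldata, "raw", "BGRA", 12, 1)

print(img.getpixel((0, 0)))  # the blue pixel, reported in RGBA order
print(img.getpixel((1, 1)))  # the last pixel, with its channels swapped into RGBA
```

The stride argument is what lets Pillow skip the padding at the end of each row, which is why the bytes-per-row value from the CGImage is passed in rather than 0.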


Source: https://stackoverflow.com/questions/22938654/converting-cgimage-to-python-image-pil-opencv
