apple-vision

Bounding Box from VNDetectRectangleRequest is not correct size when used as child VC

帅比萌擦擦* submitted on 2021-02-15 07:08:28
Question: I am trying to use VNDetectRectangleRequest from Apple's Vision framework to automatically grab a picture of a card. However, when I convert the points to draw the rectangle, it is misshapen and does not follow the rectangle as it should. I have been following this article pretty closely. One major difference is that I am embedding my CameraCaptureVC in another ViewController so that the card is scanned only when it is inside this smaller window. Below is how I set up the camera VC in the parent VC
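The conversion that usually goes wrong in this embedded setup can be sketched as below. This is not the article's code; the function name is mine. The key points are that Vision's corner points are normalized to 0...1 with a bottom-left origin, and that they must be scaled by the *child* camera view's size, not the parent view controller's:

```swift
import Foundation

// Map a Vision-normalized point (bottom-left origin, 0...1) into a UIKit
// view's coordinate space (top-left origin). `viewSize` stands in for the
// embedded camera VC's view.bounds.size, not the parent's.
func convertNormalizedPoint(_ p: CGPoint, toViewOf viewSize: CGSize) -> CGPoint {
    CGPoint(x: p.x * viewSize.width,
            y: (1 - p.y) * viewSize.height)   // flip y for UIKit
}
```

Applying this to each of a VNRectangleObservation's four corners (topLeft, topRight, bottomRight, bottomLeft) with the child view's size should yield a path that hugs the card.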

SwiftUI: getting an image's displayed dimensions

回眸只為那壹抹淺笑 submitted on 2021-02-08 05:12:11
Question: I'm trying to get the dimensions of a displayed image so I can draw bounding boxes over the text I have recognized using Apple's Vision framework. So I run the VNRecognizeTextRequest upon the press of a button with this function: func readImage(image: NSImage, completionHandler: @escaping (([VNRecognizedText]?, Error?) -> ()), comp: @escaping ((Double?, Error?) -> ())) { var recognizedTexts = [VNRecognizedText]() var rr = CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height) let requestHandler
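The piece this question needs is the rect the image actually occupies once it is aspect-fit into its container, since Vision's normalized boundingBox must be mapped into that rect, not the full container. A minimal sketch (the function name is mine; on Apple platforms AVMakeRect(aspectRatio:insideRect:) computes the same thing):

```swift
import Foundation

// Compute the rect an image occupies inside a container under aspect-fit
// scaling: scale uniformly to fit, then center the leftover space.
func aspectFitRect(for imageSize: CGSize, in container: CGSize) -> CGRect {
    let scale = min(container.width / imageSize.width,
                    container.height / imageSize.height)
    let fitted = CGSize(width: imageSize.width * scale,
                        height: imageSize.height * scale)
    return CGRect(x: (container.width - fitted.width) / 2,
                  y: (container.height - fitted.height) / 2,
                  width: fitted.width,
                  height: fitted.height)
}
```

Scaling each recognized-text boundingBox by this rect's size and offsetting by its origin puts the boxes over the visible pixels.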

Apple Vision framework – Text extraction from image

好久不见. submitted on 2020-05-09 18:32:27
Question: I am using the Vision framework on iOS 11 to detect text in an image. The text is detected successfully, but how can we get the detected text? Answer 1: Not exactly a dupe but similar to: Converting a Vision VNTextObservation to a String. You need to use either Core ML or another library to perform OCR (SwiftOCR, etc.) Answer 2: In Apple Vision you can easily extract text from an image using the VNRecognizeTextRequest class, allowing you to make an image analysis request that finds and recognizes text in an
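The second answer's approach can be sketched as follows (iOS 13+/macOS 10.15+; the function name is mine, error handling trimmed). Note this is the newer API — the original question targeted iOS 11, where VNDetectTextRectanglesRequest only located text and a separate OCR step was needed:

```swift
import Vision

// Recognize text in a CGImage and hand back the most confident transcriptions.
func recognizeText(in cgImage: CGImage, completion: @escaping ([String]) -> Void) {
    let request = VNRecognizeTextRequest { request, _ in
        let observations = request.results as? [VNRecognizedTextObservation] ?? []
        // topCandidates(1) yields the best-scoring transcription per region.
        let strings = observations.compactMap { $0.topCandidates(1).first?.string }
        completion(strings)
    }
    request.recognitionLevel = .accurate
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```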

VNDetectTextRectanglesRequest Not Working For Less Than 3 Digits

不问归期 submitted on 2020-01-06 07:12:55
Question: I'm experimenting with Apple's Vision framework to detect the location of characters (letters, numbers, etc.). Why can't I get the VisionBasics demo project to detect text in images with fewer than 3 digits? I've already tried binarizing the image by reducing saturation and increasing contrast. I even tried inverting the black and white portions, but it didn't improve the results. The 2-digit images are approximately 28x24 pixels. Link to Xcode project: https://docs-assets.developer.apple.com
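Since the inputs are only ~28x24 px, one more preprocessing step worth trying (not from the question; the function name and the scale factor are mine) is upscaling before detection, as Vision's detectors tend to do better on larger inputs:

```swift
import CoreGraphics

// Redraw a tiny CGImage at a larger size with high-quality interpolation
// before handing it to VNDetectTextRectanglesRequest. Factor 8 is an
// arbitrary starting point for ~28x24 px digit images.
func upscaled(_ image: CGImage, by factor: Int = 8) -> CGImage? {
    let width = image.width * factor
    let height = image.height * factor
    guard let ctx = CGContext(data: nil, width: width, height: height,
                              bitsPerComponent: 8, bytesPerRow: 0,
                              space: CGColorSpaceCreateDeviceRGB(),
                              bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue)
    else { return nil }
    ctx.interpolationQuality = .high
    ctx.draw(image, in: CGRect(x: 0, y: 0, width: width, height: height))
    return ctx.makeImage()
}
```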

Merge images using “VNImageHomographicAlignmentObservation” class

ぃ、小莉子 submitted on 2019-12-21 21:34:59
Question: I am trying to merge two images using VNImageHomographicAlignmentObservation. I am currently getting a 3x3 matrix that looks like this: simd_float3x3([ [0.99229, -0.00451023, -4.32607e-07], [0.00431724, 0.993118, 2.38839e-07], [-72.2425, -67.9966, 0.999288] ]) But I don't know how to use these values to merge the images into one. There doesn't seem to be any documentation on what these values even mean. I found some information on transformation matrices here: Working with matrices. But so far nothing else has helped me... Any suggestions? My code: func setup() { let floatingImage = UIImage
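What the numbers mean, as a sketch (the function name is mine): simd_float3x3 stores *columns*, so the observation is a projective warp whose third column holds the translation (roughly -72, -68 pixels here). Applying it to a pixel means multiplying the homogeneous vector (x, y, 1) and dividing by the resulting w:

```swift
import Foundation

// Apply a 3x3 homography, given as three columns (in simd_float3x3 column
// order), to the point (x, y): compute M * (x, y, 1) and divide by w.
func applyHomography(_ columns: [[Double]], to x: Double, _ y: Double) -> (Double, Double) {
    let c0 = columns[0], c1 = columns[1], c2 = columns[2]
    let tx = c0[0] * x + c1[0] * y + c2[0]
    let ty = c0[1] * x + c1[1] * y + c2[1]
    let w  = c0[2] * x + c1[2] * y + c2[2]   // projective component
    return (tx / w, ty / w)
}
```

To actually merge the images on device, the usual route is to feed this matrix to a Core Image perspective-warp filter (or draw one image through the transform into the other's context) rather than applying it point by point.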

VNFaceObservation BoundingBox Not Scaling In Portrait Mode

老子叫甜甜 submitted on 2019-12-12 05:59:43
Question: For reference, this stems from a question in the Vision API. I am working to use Vision to detect faces in an image via a VNDetectFaceRectanglesRequest, which is functioning successfully in terms of determining the correct number of faces in an image and providing the boundingBox for each face. My trouble is that because my UIImageView (which holds the UIImage in question) uses a .scaleAspectFit content mode, I am having immense difficulty properly drawing the bounding box in portrait
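The mapping this question needs can be sketched as below (the function name is mine). Two things must happen at once: the normalized boundingBox is scaled into the rect the image actually occupies under .scaleAspectFit (computable with AVMakeRect(aspectRatio:insideRect:) on Apple platforms), and the y axis is flipped, since Vision measures from the bottom and UIKit from the top:

```swift
import Foundation

// Map a Vision face observation's normalized boundingBox (bottom-left origin)
// into view coordinates, given the rect the aspect-fit image occupies.
func faceRect(fromNormalized box: CGRect, displayedIn imageRect: CGRect) -> CGRect {
    let w = box.width  * imageRect.width
    let h = box.height * imageRect.height
    let x = imageRect.minX + box.minX * imageRect.width
    // Flip y: the box's *top* in UIKit is (1 - minY - height) in Vision space.
    let y = imageRect.minY + (1 - box.minY - box.height) * imageRect.height
    return CGRect(x: x, y: y, width: w, height: h)
}
```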

Vision, VNDetectTextRectanglesRequest - can't recognize single number as region

妖精的绣舞 submitted on 2019-12-05 00:07:53
Question: I want to use VNDetectTextRectanglesRequest from the Vision framework to detect regions in an image containing only one character, the number '9', on a white background. I'm using the following code to do this: private func performTextDetection() { let textRequest = VNDetectTextRectanglesRequest(completionHandler: self.detectTextHandler) textRequest.reportCharacterBoxes = true textRequest.preferBackgroundProcessing = false let handler = VNImageRequestHandler(cgImage: loadedImage.cgImage!, options:
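A hedged sketch of the completion handler the snippet references (not the asker's actual detectTextHandler): with reportCharacterBoxes = true, each VNTextObservation carries per-character boxes, which is where a single digit should surface if the region is detected at all.

```swift
import Vision

// Inspect text-region observations and their optional per-character boxes.
func detectTextHandler(request: VNRequest, error: Error?) {
    guard let observations = request.results as? [VNTextObservation] else { return }
    for region in observations {
        print("region:", region.boundingBox)               // normalized rect
        for charBox in region.characterBoxes ?? [] {
            print("  character:", charBox.boundingBox)
        }
    }
}
```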
