How to track the barcode with highest confidence

Submitted by 自作多情 on 2021-02-10 14:33:51

Question


I am using vision framework to detect barcodes. I want to show a rect around the barcode with highest confidence on live video, meaning, I want to track that rect to the barcode seen on the live preview.

So I have this code to detect the barcodes within a roi.

lazy var barcodeRequest: VNDetectBarcodesRequest = {
    let barcodeRequest = VNDetectBarcodesRequest {[weak self] request, error in
      guard error == nil else {
        print("ERROR: \(error?.localizedDescription ?? "error")")
        return
      }
      self?.resultClassification(request)
    }
    barcodeRequest.regionOfInterest = CGRect(x: 0,
                                             y: 0.3,
                                             width: 1,
                                             height: 0.4)
    return barcodeRequest
  }()

This method fires when barcodes are detected:

func resultClassification(_ request: VNRequest) {
    guard let barcodes = request.results,
          let potentialCodes = barcodes as? [VNBarcodeObservation]
    else { return }
    
    // choose the barcode with the highest confidence
    let highestConfidenceBarcodeDetected = potentialCodes.max(by: {$0.confidence < $1.confidence})
    
    // do something with highestConfidenceBarcodeDetected

    // 1
  }
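Side note: `max(by:)` takes an "are in increasing order" predicate, so passing `<` returns the element with the *largest* confidence, which is easy to misread. A standalone sketch with a stand-in type (real `VNBarcodeObservation`s only come out of a request) shows the pattern:

```swift
/// Stand-in for VNBarcodeObservation, just to demonstrate the selection logic.
struct MockObservation {
    let payloadString: String
    let confidence: Float
}

let potentialCodes = [
    MockObservation(payloadString: "A", confidence: 0.30),
    MockObservation(payloadString: "B", confidence: 0.95),
    MockObservation(payloadString: "C", confidence: 0.60),
]

/// `max(by:)` with `<` yields the maximum; it returns nil for an empty array.
let best = potentialCodes.max(by: { $0.confidence < $1.confidence })
print(best?.payloadString ?? "none") // "B"
```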

This is my problem.

Now that I have the highest confidence barcode, I want to track it around the screen. So, I think I will have to add code at // 1.

But before that I have to define this for the tracker:

var inputObservation: VNDetectedObjectObservation!


lazy var barcodeTrackingRequest: VNTrackObjectRequest = {
  let barcodeTrackingRequest = VNTrackObjectRequest(detectedObjectObservation: inputObservation) { [weak self] request, error in
    guard error == nil else {
      print("Detection error: \(String(describing: error)).")
      return
    }
    self?.resultClassificationTracker(request)
  }
  return barcodeTrackingRequest
}()

func resultClassificationTracker(_ request: VNRequest) {
  // all I want from this is to store the bounding box in a property
}

Now, how do I connect these two pieces of code, so resultClassificationTracker fires every time I get a bounding box value for the tracker?


Answer 1:


I did something similar a while ago and wrote an article on it. It's for VNRecognizeTextRequest, not VNDetectBarcodesRequest, but the approach is similar. This is what I did:

  • Perform VNImageRequestHandler continuously (once it finishes, start another again)
  • Store the detection indicator view in a property var previousTrackingView: UIView?
  • Animate the detection indicator to the new rectangle whenever the request handler finishes
  • Use Core Motion to detect device movement, and adjust the frame of the detection indicator

Here is the result (demo GIF omitted):

As you can see, the height/y coordinate is not very accurate. My guess is that Vision only needs a horizontal line to scan barcodes - like those laser scanners in grocery stores - so it doesn't return the full height. But that is a different problem.

Perform VNImageRequestHandler continuously (once it finishes, start another again)

For this, I'm making a property busyPerformingVisionRequest, and whenever it is false, I call the Vision request. This happens inside captureOutput(_:didOutput:from:), which is called for every new camera frame.


class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {

    var busyPerformingVisionRequest = false

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

        if busyPerformingVisionRequest == false {
            busyPerformingVisionRequest = true /// reset to false when the request completes
            lookForBarcodes(in: pixelBuffer) /// start the vision as many times as possible
        }
    }
}

Store the detection indicator view in a property var previousTrackingView: UIView?

Below is my Vision handler that gets called when the Vision request completes. I first set busyPerformingVisionRequest to false, so another Vision request can be made. Then I convert the bounding box to screen coordinates and call self.drawTrackingView(at: convertedRect).

func resultClassificationTracker(request: VNRequest?, error: Error?) {
    busyPerformingVisionRequest = false
    
    if let results = request?.results {
        if let observation = results.first as? VNBarcodeObservation {
            
            var x = observation.boundingBox.origin.x
            var y = 1 - observation.boundingBox.origin.y
            var height = CGFloat(0) /// ignore the bounding height
            var width = observation.boundingBox.width
            
            /// we're going to do some converting
            let convertedOriginalWidthOfBigImage = aspectRatioWidthOverHeight * deviceSize.height
            let offsetWidth = convertedOriginalWidthOfBigImage - deviceSize.width
            
            /// The pixel buffer that we got Vision to process is bigger than the device's screen, so we need to adjust it
            let offHalf = offsetWidth / 2
            
            width *= convertedOriginalWidthOfBigImage
            height = width * (CGFloat(9) / CGFloat(16))
            x *= convertedOriginalWidthOfBigImage
            x -= offHalf
            y *= deviceSize.height
            y -= height
            
            let convertedRect = CGRect(x: x, y: y, width: width, height: height)
            
            DispatchQueue.main.async {
                self.drawTrackingView(at: convertedRect)
            }
            
        }
    }
}
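The coordinate math above can be factored into a pure function, which makes it testable without a camera. A sketch, where the answer's view-controller properties aspectRatioWidthOverHeight (the camera feed's width-to-height ratio) and deviceSize (the screen size in points) become parameters:

```swift
import Foundation

/// Mirror of the conversion in resultClassificationTracker above: maps a
/// Vision bounding box (normalized, origin at the bottom-left) to screen
/// coordinates, synthesizing the height from the width at 16:9 because
/// Vision's reported barcode height is unreliable.
func convertToScreenRect(boundingBox: CGRect,
                         aspectRatioWidthOverHeight: CGFloat,
                         deviceSize: CGSize) -> CGRect {
    /// width of the full camera image once scaled to the screen's height
    let convertedWidth = aspectRatioWidthOverHeight * deviceSize.height
    /// the camera image is wider than the screen; half the overflow is cropped on each side
    let offHalf = (convertedWidth - deviceSize.width) / 2

    let width = boundingBox.width * convertedWidth
    let height = width * (9.0 / 16.0)
    let x = boundingBox.origin.x * convertedWidth - offHalf
    let y = (1 - boundingBox.origin.y) * deviceSize.height - height
    return CGRect(x: x, y: y, width: width, height: height)
}
```

For example, on a 375x667-point screen with a 16:9 feed, a box at (0.5, 0.5) with normalized width 0.2 comes out near x = 187.5, y = 200.1.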

Animate the detection indicator to the new rectangle whenever the request handler finishes

This is my function drawTrackingView. If there is a tracking rectangle view drawn already, it animates it to the new frame. If not, it just adds it as a subview.

func drawTrackingView(at rect: CGRect) {
    if let previousTrackingView = previousTrackingView { /// already drawn one previously, just change the frame now
        UIView.animate(withDuration: 0.8) {
            previousTrackingView.frame = rect
        }
        
    } else { /// add it as a subview
        let trackingView = UIView(frame: rect)
        drawingView.addSubview(trackingView)
        trackingView.backgroundColor = UIColor.blue.withAlphaComponent(0.2)
        trackingView.layer.borderWidth = 3
        trackingView.layer.borderColor = UIColor.blue.cgColor
        
        
        previousTrackingView = trackingView
    }
}

Use Core Motion to detect device movement, and adjust the frame of the detection indicator

I first store a couple of motion-related properties. Then, in viewDidLoad, I start the motion updates.

-----ViewController.swift-----

/// motionManager will be what we'll use to get device motion
var motionManager = CMMotionManager()
    
/// this will be the "device’s true orientation in space" (Source: https://nshipster.com/cmdevicemotion/)
var initialAttitude: CMAttitude?
     
/// we'll later read these values to update the highlight's position
var motionX = Double(0) /// aka Roll
var motionY = Double(0) /// aka Pitch

override func viewDidLayoutSubviews() {
    super.viewDidLayoutSubviews()
    
    /// viewDidLoad() is often too early to get the first initial attitude, so we use viewDidLayoutSubviews() instead
    if let currentAttitude = motionManager.deviceMotion?.attitude {
        /// we populate initialAttitude with the current attitude
        initialAttitude = currentAttitude
    }
    
}
override func viewDidLoad() {
    super.viewDidLoad()
    
    /// This is how often we will get device motion updates
    /// 0.03 seconds is frequent enough, and is about the rate at which the video frames change
    motionManager.deviceMotionUpdateInterval = 0.03
    
    motionManager.startDeviceMotionUpdates(to: .main) {
        [weak self] (data, error) in
        guard let data = data, error == nil else {
            return
        }
        
        /// This function will be called every 0.03 seconds
        self?.updateTrackingFrames(attitude: data.attitude)
    }

    ...
}

Every 0.03 seconds, updateTrackingFrames is called to read the device's new physical movement. This is meant to reduce jitter, like when your user's hands are shaking.

func updateTrackingFrames(attitude: CMAttitude) {
    /// initialAttitude is an optional that points to the reference frame that the device started at
    /// we set this when the device lays out its subviews on the first launch
    if let initAttitude = initialAttitude {
        
        /// We can now translate the current attitude to the reference frame
        attitude.multiply(byInverseOf: initAttitude)
        
        /// Roll is the movement of the phone left and right, Pitch is forwards and backwards
        let rollValue = attitude.roll.radiansToDegrees
        let pitchValue = attitude.pitch.radiansToDegrees
        
        /// This is a magic number, but for simplicity, we won't do any advanced trigonometry -- also, 3 works pretty well
        let conversion = Double(3)
        
        /// Here, we figure out how much the values changed by comparing against the previous values (motionX and motionY)
        let differenceInX = (rollValue - motionX) * conversion
        let differenceInY = (pitchValue - motionY) * conversion
        
        /// Now we adjust the tracking view's position
        if let previousTrackingView = previousTrackingView {
            previousTrackingView.frame.origin.x += CGFloat(differenceInX)
            previousTrackingView.frame.origin.y += CGFloat(differenceInY)
        }
        
        /// finally, we put the new attitude values into motionX and motionY so we can compare against them in 0.03 seconds (the next time this function is called)
        motionX = rollValue
        motionY = pitchValue
    }
}
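One caveat: .radiansToDegrees is not part of the Swift standard library, so the project has to define it. A minimal extension along these lines would do (an assumption; the repo may define it differently):

```swift
/// `radiansToDegrees` as used in updateTrackingFrames above is a custom
/// helper, not a standard-library member. One minimal definition:
extension FloatingPoint {
    var radiansToDegrees: Self { self * 180 / .pi }
    var degreesToRadians: Self { self * .pi / 180 }
}
```

With this, attitude.roll.radiansToDegrees works directly, since CMAttitude's roll and pitch are Doubles.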

This Core Motion implementation isn't very accurate - I hardcode the multiplier constant (Double(3)) that adjusts the frame of the tracking indicator. But it's enough to cancel out small jitter.

Here is the final repo: https://github.com/aheze/BarcodeScanner



Source: https://stackoverflow.com/questions/66030924/how-to-track-the-barcode-with-highest-confidence
