Vision Framework Tutorial for iOS: Scanning Barcodes

In this Vision Framework tutorial, you’ll learn how to use your iPhone’s camera to scan QR codes and automatically open encoded URLs in Safari. By Emad Ghorbaninia.

Vision and the Camera

The Vision Framework operates on still images. Of course, when you use the camera on your iPhone, the image moves smoothly, as you would expect from video. However, video is made up of a collection of still images played one after the other, almost like a flip book.

Cartoon iPhone holding a camera

When using your camera with the Vision Framework, Vision splits the moving video into its component images and processes one of those images at some frequency called the sample rate.

In this tutorial, you’ll use the Vision Framework to find barcodes in images. The Vision Framework can read 17 different barcode formats, including UPC and QR codes.
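If you want to see exactly which symbologies your SDK can read, you can ask the request at runtime. Here's a small sketch, assuming the iOS 15+ supportedSymbologies() API; on earlier SDKs, the same information lives in a supportedSymbologies property:

import Vision

// Sketch: print every barcode symbology this SDK's VNDetectBarcodesRequest can detect.
// Assumes the iOS 15+ throwing method; earlier SDKs expose a `supportedSymbologies` property instead.
let barcodeRequest = VNDetectBarcodesRequest()
if let symbologies = try? barcodeRequest.supportedSymbologies() {
  for symbology in symbologies {
    print(symbology.rawValue)
  }
}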

In the coming sections, you’ll instruct your app to find QR codes and read their contents. Time to get started!

Using the Vision Framework

To implement the Vision Framework in your app, you’ll follow three basic steps:

  1. Request: When you want to detect something using the framework, you use a subclass of VNRequest to define the request.
  2. Handler: You process that request and perform image analysis for any detection using a VNImageRequestHandler.
  3. Observation: You analyze the results of your handled request with a subclass of VNObservation.
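To see how these three pieces fit together before wiring them to the camera, here's a minimal sketch that runs a barcode request against a single still image. The image and function names are placeholders; the live-camera version you'll build below follows the same request/handler/observation pattern:

import Vision
import UIKit

// Sketch: the request -> handler -> observation flow on a single still image.
// `image` is a placeholder UIImage; in this tutorial you'll feed Vision live video frames instead.
func detectBarcodes(in image: UIImage) {
  guard let cgImage = image.cgImage else { return }

  // 1. Request: describe what Vision should look for.
  let request = VNDetectBarcodesRequest()

  // 2. Handler: perform the request on one image.
  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
  try? handler.perform([request])

  // 3. Observation: read the results the handler produced.
  for case let observation as VNBarcodeObservation in request.results ?? [] {
    print(observation.symbology.rawValue, observation.payloadStringValue ?? "no payload")
  }
}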

Time to create your first Vision request.

Creating a Vision Request

Vision provides VNDetectBarcodesRequest to detect a barcode in an image. You’ll implement it now.

In ViewController.swift, find // TODO: Make VNDetectBarcodesRequest variable at the top of the file and add the following code right after it:

lazy var detectBarcodeRequest = VNDetectBarcodesRequest { request, error in
  guard error == nil else {
    self.showAlert(
      withTitle: "Barcode error",
      message: error?.localizedDescription ?? "error")
    return
  }
  self.processClassification(request)
}

In this code, you set up a VNDetectBarcodesRequest that detects barcodes whenever it's performed. When the request completes without an error, its completion handler passes the request, along with any results, on to processClassification(_:). You’ll define processClassification(_:) in a moment.

But first, you need to revisit video and sample rates.

Vision Handler

Remember that video is a collection of images, and the Vision Framework processes one of those images at some frequency. To set up your video feed accordingly, find setupCameraLiveView() and locate the TODO you left earlier: // TODO: Set video sample rate. Then, add this code right after the comment, and before the call to addOutput(_:):

captureOutput.videoSettings = 
  [kCVPixelBufferPixelFormatTypeKey as String: Int(kCVPixelFormatType_32BGRA)]
captureOutput.setSampleBufferDelegate(
  self, 
  queue: DispatchQueue.global(qos: DispatchQoS.QoSClass.default))

In this code, you set your video stream’s pixel format to 32-bit BGRA. Then, you set self as the sample buffer delegate, running on a background queue. Whenever a new frame is available in the buffer, the capture output calls the appropriate delegate method from AVCaptureVideoDataOutputSampleBufferDelegate.

Because you’ve passed self as the delegate, you must conform ViewController to AVCaptureVideoDataOutputSampleBufferDelegate. Your class already does this and has a single callback method defined: captureOutput(_:didOutput:from:). Find this method and insert the following after // TODO: Live Vision:

// 1
guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { 
  return 
}

// 2
let imageRequestHandler = VNImageRequestHandler(
  cvPixelBuffer: pixelBuffer,
  orientation: .right)

// 3
do {
  try imageRequestHandler.perform([detectBarcodeRequest])
} catch {
  print(error)
}

Here you:

  1. Get an image out of the sample buffer, like a page out of a flip book.
  2. Make a new VNImageRequestHandler using that image. You pass .right as the orientation because the camera sensor captures frames in landscape, so Vision needs to know how the buffer is rotated when you hold the phone in portrait.
  3. Perform the detectBarcodeRequest using the handler.

Live streaming with a cartoon iPhone

Vision Observation

Think back to the section on the Vision Request. There, you built the detectBarcodeRequest which calls processClassification(_:) if it thinks it found a barcode. For your last step, you’ll fill out processClassification(_:) to analyze the result of the handled request.

In processClassification(_:), locate // TODO: Main logic and add the following code right below it:

// 1
guard let barcodes = request.results else { return }
DispatchQueue.main.async { [self] in
  if captureSession.isRunning {
    view.layer.sublayers?.removeSubrange(1...)

    // 2
    for barcode in barcodes {
      guard
        // TODO: Check for QR Code symbology and confidence score
        let potentialQRCode = barcode as? VNBarcodeObservation 
        else { return }

      // 3
      showAlert(
        withTitle: potentialQRCode.symbology.rawValue,
        // TODO: Check the confidence score
        message: potentialQRCode.payloadStringValue ?? "" )
    }
  }
}

In this code, you:

  1. Get a list of potential barcodes from the request.
  2. Loop through the potential barcodes to analyze each one individually.
  3. If one of the results happens to be a barcode, you show an alert with the barcode type and the string encoded in the barcode.

Build and run again. This time, point your camera at a barcode.

Booooom! You scanned it!

Note: You can find a sample barcode and sample QR code in the project folder, under Sample/barcode.png and Sample/qrcode.png respectively.

Scanning a barcode and a QR code with the Vision Framework

So far, so good. But what if there were a way to know how confident Vision is that the object you’re pointing at is actually a barcode? More on this next…

Adding a Confidence Score

So far, you’ve worked extensively with AVCaptureSession and the Vision Framework. But there are more things you can do to tighten your implementation. Specifically, you can limit your Vision observation to the QR code symbology, and you can make sure the Vision Framework is confident it’s found a QR code in an image.

Cartoon iPhone scientist

Every observation that comes back from a handled request, including VNBarcodeObservation, carries a property called confidence. This property tells you how certain Vision is about the result, normalized to [0, 1], where 1 is the most confident.

Inside processClassification(_:), find // TODO: Check for QR Code symbology and confidence score and replace the guard:

guard 
  let potentialQRCode = barcode as? VNBarcodeObservation 
  else { return }

with:

guard 
  // TODO: Check for QR Code symbology and confidence score
  let potentialQRCode = barcode as? VNBarcodeObservation,
  potentialQRCode.confidence > 0.9
  else { return }

Here you ensure Vision is at least 90% confident it’s found a barcode.
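That same guard is also where you could handle the other half of the TODO: restricting results to the QR code symbology, as mentioned at the start of this section. Here's one hedged way to do it; .qr is the iOS 15+ spelling of the constant, while earlier SDKs use .QR:

guard 
  let potentialQRCode = barcode as? VNBarcodeObservation,
  // Only accept QR codes. `.qr` is the iOS 15+ name; older SDKs use `.QR`.
  potentialQRCode.symbology == .qr,
  potentialQRCode.confidence > 0.9
  else { return }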

Now in the same method, locate // TODO: Check the confidence score. The message key’s value below is currently potentialQRCode.payloadStringValue ?? "". Change it to:

String(potentialQRCode.confidence)

Now, instead of showing the barcode’s payload in the alert, you’ll show the confidence score. Because the score is a number, you convert it to a string so it can display in the alert.

Build and run. When you scan the sample QR code, you’ll see the confidence score in the alert that pops up.

Vision Framework is 100% confident it sees a QR code

Nicely done! As you can see, the confidence score for the sample QR code is quite high, meaning Vision is certain it’s actually looking at a QR code.