Video Depth Maps Tutorial for iOS: Getting Started

In this iOS video depth maps tutorial, you’ll harness iOS 13’s video depth maps to apply real-time video filters and create a special effects masterpiece! By Owen L Brown.

Update note: Owen Brown updated this tutorial for Swift 5, iOS 13 and Xcode 11. Yono Mittlefehldt wrote the original.

Admit it. Ever since you took your first video with the iPhone, you’ve had a burning desire to break into Hollywood. But you’re asking yourself, how can I do it?

Simple! You can use your iOS development skills to enhance your videos, become a special effects genius and take Hollywood by storm.

In this video depth maps tutorial, you’ll learn how to:

  • Request depth information for a video feed.
  • Manipulate the depth information.
  • Combine the video feed with depth data and filters to create an SFX masterpiece.

Since the core of the project deals with depth maps, it’d be nice to know what depth maps are and how the iPhone gets them before you get started.

What are Depth Maps?

A depth map records the distance of surfaces from the camera’s point of view. This data is tied to a given image or video and, once it’s grabbed, you can do some pretty cool things with it.

For example, the image below displays closer objects in white and far away ones in black.

Depth Map Example

Using two offset cameras, the iPhone calculates the relative distances of objects. The process is very similar to the way your eyes work together to perceive depth.
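
You don’t have to implement any of this yourself, since AVFoundation does the heavy lifting, but if you’re curious, here’s a rough sketch of the geometry. Every number below is made up purely for illustration:

func estimatedDepth(
  focalLengthPixels: Double,
  baselineMeters: Double,
  disparityPixels: Double) -> Double {
  // The farther a feature shifts between the two camera images (the disparity),
  // the closer it is to the camera.
  return focalLengthPixels * baselineMeters / disparityPixels
}

// Made-up values: cameras roughly 6.5 mm apart, a focal length of 2,800 pixels,
// and a feature that shifts 40 pixels between the two images.
let depth = estimatedDepth(
  focalLengthPixels: 2_800,
  baselineMeters: 0.0065,
  disparityPixels: 40)
// depth ≈ 0.46 meters from the camera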

Filters

Filtering is another important concept when you’re working with depth data. There are many ways to filter data, but for this tutorial you’ll focus on two, sketched in code after this list:

  • High-pass: These filters only keep values above a certain threshold.
  • Band-pass: These filters only keep values between a minimum and maximum range.
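
Neither of these functions exists in the starter project. They’re just a minimal sketch of the two ideas, applied to a single normalized depth value where 1.0 is near and 0.0 is far:

func highPass(_ value: Float, threshold: Float) -> Float {
  // Keep anything above the threshold; zero out the rest.
  return value > threshold ? value : 0.0
}

func bandPass(_ value: Float, min minValue: Float, max maxValue: Float) -> Float {
  // Keep only values that fall inside the band.
  return (minValue...maxValue).contains(value) ? value : 0.0
}

highPass(0.8, threshold: 0.5)        // 0.8 — kept
highPass(0.3, threshold: 0.5)        // 0.0 — filtered out
bandPass(0.4, min: 0.25, max: 0.75)  // 0.4 — inside the band, kept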

That’s enough theory for now!

Note: If you’re new to Apple’s Depth Data API, you may want to start with Image Depth Maps Tutorial for iOS: Getting Started. That tutorial also includes some good background information about how the iPhone gets depth information.

OK, it’s time to launch Xcode and get your formal wear ready for the Oscars!

Getting Started

For this video depth maps tutorial, you need Xcode 11 or later. You also need an iPhone with dual cameras on the back because that’s how the iPhone generates depth information. Since you need to run this app on a device, not the simulator, you also need an Apple Developer account.

Once you have everything ready, download and explore the materials for this tutorial by clicking the Download Materials button at the top or bottom of this page.

Open the starter project and select your development team in the Signing & Capabilities section of the Project Settings. Build and run it on your device. You’ll see something like this:

video depth maps tutorial starter project

Note: In order to capture depth information, the iPhone has to set the wide camera zoom to match the telephoto camera zoom. Therefore, the video feed in the app is zoomed in compared to the stock camera app.

At this point, the app doesn’t do much. That’s where you come in!

Capturing Video Depth Maps Data

Before you can capture depth data for videos, you need to add an AVCaptureDepthDataOutput object to the AVCaptureSession.

Apple added AVCaptureDepthDataOutput in iOS 11 specifically to handle depth data, as the name suggests.

Open DepthVideoViewController.swift and add the following lines to the bottom of configureCaptureSession():

// 1
let depthOutput = AVCaptureDepthDataOutput()
// 2
depthOutput.setDelegate(self, callbackQueue: dataOutputQueue)
// 3
depthOutput.isFilteringEnabled = true
// 4
session.addOutput(depthOutput)
// 5
let depthConnection = depthOutput.connection(with: .depthData)
// 6
depthConnection?.videoOrientation = .portrait

Here’s the step-by-step breakdown:

  1. You create a new AVCaptureDepthDataOutput object.
  2. Then, you set the current view controller as the delegate for the new object. The callbackQueue parameter is the dispatch queue on which to call the delegate methods. For now, ignore the error — you’ll fix it later.
  3. Enable filtering on the depth data to take advantage of Apple’s algorithms to fill in any holes in the data.
  4. At this point, you’re ready to add the configured AVCaptureDepthDataOutput to the AVCaptureSession.
  5. Finally, get the AVCaptureConnection for the depth output in order to…
  6. …ensure the video orientation of the depth data matches the video feed.

Simple, right?

But hang on! Before you build and run the project, you first need to tell the app what to do with this depth data. That’s where the delegate method comes in.

Still in DepthVideoViewController.swift, add the following extension and delegate method at the end of the file:

// MARK: - Capture Depth Data Delegate Methods
extension DepthVideoViewController: AVCaptureDepthDataOutputDelegate {
  func depthDataOutput(
    _ output: AVCaptureDepthDataOutput,
    didOutput depthData: AVDepthData,
    timestamp: CMTime,
    connection: AVCaptureConnection) {
  }
}

This method gets called every time the camera records more depth data. Add the following code to the method:

// 1
guard previewMode != .original else {
  return
}

var convertedDepth: AVDepthData
// 2
let depthDataType = kCVPixelFormatType_DisparityFloat32
if depthData.depthDataType != depthDataType {
  convertedDepth = depthData.converting(toDepthDataType: depthDataType)
} else {
  convertedDepth = depthData
}

// 3
let pixelBuffer = convertedDepth.depthDataMap
// 4
pixelBuffer.clamp()
// 5
let depthMap = CIImage(cvPixelBuffer: pixelBuffer)

// 6
DispatchQueue.main.async {
  self.depthMap = depthMap
}

Here’s what’s happening:

  1. First, you skip the rest of the work unless the current preview mode actually uses the depth map. There’s no point creating one otherwise.
  2. Next, ensure the depth data is the format you need: 32-bit floating point disparity information. Disparity tells you how much one image is shifted compared to another.
  3. Save the depth data map from the AVDepthData object as a CVPixelBuffer, an efficient data structure for holding a bunch of pixels.
  4. Using an extension included in the project, clamp the pixels in the pixel buffer to keep them between 0.0 and 1.0. There’s a sketch of what that extension might look like after this list.
  5. Convert the pixel buffer into a CIImage.
  6. Finally, store this in a class property for later use.
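
Curious what that clamp() extension from step four might do under the hood? Here’s one possible implementation. The version bundled with the starter project may differ, so there’s no need to add this yourself:

import CoreVideo

extension CVPixelBuffer {
  // One way to clamp every Float32 value in a disparity buffer to 0.0...1.0.
  func clamp() {
    CVPixelBufferLockBaseAddress(self, CVPixelBufferLockFlags(rawValue: 0))
    defer { CVPixelBufferUnlockBaseAddress(self, CVPixelBufferLockFlags(rawValue: 0)) }

    guard let baseAddress = CVPixelBufferGetBaseAddress(self) else { return }
    let width = CVPixelBufferGetWidth(self)
    let height = CVPixelBufferGetHeight(self)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(self)

    for row in 0..<height {
      // Rows can be padded, so advance by bytesPerRow rather than width * 4.
      let rowPointer = baseAddress
        .advanced(by: row * bytesPerRow)
        .assumingMemoryBound(to: Float32.self)
      for column in 0..<width {
        rowPointer[column] = min(max(rowPointer[column], 0.0), 1.0)
      }
    }
  }
}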

Phew! You’re probably itching to run this now. But before you do, there’s one small addition you need to make to view the depth map: You need to display it!

Find the AVCaptureVideoDataOutputSampleBufferDelegate extension and look for the switch statement in captureOutput(_:didOutput:from:). Add the following case:

case (.depth, _, _):
  previewImage = depthMap ?? image

Note: In this statement, you’re actually switching on a tuple containing three values. It might look strange now, but it’ll really help later in the tutorial when you start changing the previewImage for each different filter.
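
If switching on a tuple is new to you, here’s a tiny standalone example of the pattern. None of these values comes from the starter project:

// Plain Swift: a switch can match several values at once by wrapping them in a tuple.
let mode = "depth"
let filterEnabled = false
let isRecording = true

switch (mode, filterEnabled, isRecording) {
case ("depth", _, _):
  print("Depth mode — the other two values don't matter")
case ("original", false, _):
  print("Original feed with no filter")
default:
  print("Everything else")
}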

Build and run the project. Tap the Depth segment of the segmented control at the bottom.

capturing video depth maps in an ios app

This is the visual representation of the depth data captured alongside the video data.