Video Depth Maps Tutorial for iOS: Getting Started


Admit it. Ever since you took your first video with the iPhone, you’ve had a burning desire to break into Hollywood.

Heck, even Steven Soderbergh said he’s open to using only an iPhone to shoot movies.

Great! How can you compete with the man who brought the world Magic Mike XXL and some lesser-known films such as Ocean’s Eleven and Traffic?

Simple! You can use your iOS development skills to enhance your videos, become a special effects genius and take Hollywood by storm.

So get ready, because in this video depth maps tutorial, you’ll learn how to:

  • Request depth information for a video feed.
  • Manipulate the depth information.
  • Combine the video feed with depth data and filters to create an SFX masterpiece.

Note: If you’re new to Apple’s Depth Data API, you may want to start with Image Depth Maps Tutorial for iOS: Getting Started. That tutorial also includes some nice background information on how the iPhone gets the depth information.

OK, it’s time to launch Xcode and get your formal wear ready for the Oscars!

Getting Started

For this video depth maps tutorial, you’ll need Xcode 9 or later. You’ll also need an iPhone with dual cameras on the back, which is how the iPhone generates depth information. An Apple Developer account is also required because you need to run this app on a device, not the simulator.

Once you have everything ready, download and explore the materials for this tutorial (you can find a link at the top or bottom of this tutorial).

Open the starter project, and build and run it on your device. You’ll see something like this:

Build & run starter

Note: In order to capture depth information, the iPhone has to set the wide camera zoom to match the telephoto camera zoom. Therefore, the video feed in the app is zoomed in compared to the stock camera app.

At this point, the app doesn’t do much. That’s where you come in!

Capturing Video Depth Maps Data

Capturing depth data for video requires adding an AVCaptureDepthDataOutput object to the AVCaptureSession.

AVCaptureDepthDataOutput was added in iOS 11 specifically to handle depth data, as the name suggests.

Open DepthVideoViewController.swift and add the following lines to the bottom of configureCaptureSession():

// 1
let depthOutput = AVCaptureDepthDataOutput()

// 2
depthOutput.setDelegate(self, callbackQueue: dataOutputQueue)

// 3
depthOutput.isFilteringEnabled = true

// 4
session.addOutput(depthOutput)

// 5
let depthConnection = depthOutput.connection(with: .depthData)

// 6
depthConnection?.videoOrientation = .portrait

Here’s the step-by-step breakdown:

  1. You create a new AVCaptureDepthDataOutput object.
  2. Then you set the current view controller as the delegate for the new object. The callbackQueue parameter is the dispatch queue on which to call the delegate methods. For now, ignore the error; you’ll fix it later.
  3. You enable filtering on the depth data to take advantage of Apple’s algorithms to fill in any holes in the data.
  4. At this point, you’re ready to add the configured AVCaptureDepthDataOutput to the AVCaptureSession.
  5. Here you get the AVCaptureConnection for the depth output in order to…
  6. …ensure the video orientation of the depth data matches the video feed.

Simple, right?

But hang on! Before you build and run the project, you first need to tell the app what to do with this depth data. That’s where the delegate method comes in.

Still in DepthVideoViewController.swift, add the following extension and delegate method at the end of the file:

// MARK: - Capture Depth Data Delegate Methods

extension DepthVideoViewController: AVCaptureDepthDataOutputDelegate {

  func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                       didOutput depthData: AVDepthData,
                       timestamp: CMTime,
                       connection: AVCaptureConnection) {

    // 1
    if previewMode == .original {
      return
    }

    var convertedDepth: AVDepthData

    // 2
    if depthData.depthDataType != kCVPixelFormatType_DisparityFloat32 {
      convertedDepth = depthData.converting(toDepthDataType: kCVPixelFormatType_DisparityFloat32)
    } else {
      convertedDepth = depthData
    }

    // 3
    let pixelBuffer = convertedDepth.depthDataMap

    // 4
    pixelBuffer.clamp()

    // 5
    let depthMap = CIImage(cvPixelBuffer: pixelBuffer)

    // 6
    DispatchQueue.main.async { [weak self] in
      self?.depthMap = depthMap
    }
  }
}

Here’s what’s happening:

  1. You return early when the preview mode is .original, so the app only builds a depth map for the modes that actually use it.
  2. Next, you ensure the depth data is in the format you need: 32-bit floating point disparity information.
  3. You save the depth data map from the AVDepthData object as a CVPixelBuffer.
  4. Using an extension included in the project, you then clamp the pixels in the pixel buffer to keep them between 0.0 and 1.0.
  5. You convert the pixel buffer into a CIImage and…
  6. …then you store it in the view controller’s depthMap property, on the main queue, for later use.
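
The clamping in step 4 relies on an extension that ships with the starter project. Its implementation isn’t shown in this tutorial, so the following is only a minimal sketch of what such a helper could look like. The name clampToUnitRange() is hypothetical, and the sketch assumes a single-plane buffer of 32-bit floats, such as kCVPixelFormatType_DisparityFloat32:

```swift
import CoreVideo

extension CVPixelBuffer {
  /// Hypothetical helper (not the starter project's clamp()):
  /// clamps every Float32 value in the buffer to 0.0...1.0 in place.
  /// Assumes a single-plane, 32-bit float pixel format.
  func clampToUnitRange() {
    // The base address is only valid while the buffer is locked.
    CVPixelBufferLockBaseAddress(self, [])
    defer { CVPixelBufferUnlockBaseAddress(self, []) }

    guard let base = CVPixelBufferGetBaseAddress(self) else { return }
    let width = CVPixelBufferGetWidth(self)
    let height = CVPixelBufferGetHeight(self)
    // Rows can be padded, so step by bytesPerRow rather than width * 4.
    let bytesPerRow = CVPixelBufferGetBytesPerRow(self)

    for y in 0..<height {
      let row = (base + y * bytesPerRow).assumingMemoryBound(to: Float32.self)
      for x in 0..<width {
        row[x] = min(1.0, max(0.0, row[x]))
      }
    }
  }
}
```

Stepping through the buffer by bytesPerRow instead of assuming tightly packed rows keeps the sketch safe for buffers with row padding.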

Phew! You’re probably itching to run this now, but before you do, there’s one small addition you need to make to view the depth map: you need to display it!

Find the AVCaptureVideoDataOutputSampleBufferDelegate extension and look for the switch statement in captureOutput(_:didOutput:from:). Add the following case:

case .depth:
  previewImage = depthMap ?? image

Build and run the project, and tap on the Depth segment of the segmented control at the bottom.

Build and run with depth

This is the visual representation of the depth data captured alongside the video data.

Video Resolutions And Frame Rates

There are a couple of things you should know about the depth data you’re capturing. It’s a lot of work for your iPhone to correlate the pixels between the two cameras and calculate the disparity.

Note: Confused by that last sentence? Check out the Image Depth Maps Tutorial for iOS: Getting Started. It has a nice explanation in the section, How Does The iPhone Do This?

To provide you with the best real-time data it can, the iPhone limits the resolutions and frame rates of the depth data it returns.

For instance, the maximum amount of depth data you can receive on an iPhone 7 Plus is 320 x 240 at 24 frames per second. The iPhone X is capable of delivering that data at 30 fps.

AVCaptureDevice does not allow you to set the depth frame rate independent of the video frame rate. Depth data must be delivered at the same frame rate or an even fraction of the video frame rate. Otherwise, a situation would arise where you have depth data but no video data, which is strange.

Because of this, you want to do two things:

  1. Set your video frame rate to ensure the maximum possible depth data frame rate.
  2. Determine the scale factor between your video data and your depth data. The scale factor is important when you start creating masks and filters.

Time to make your code better!

Again in DepthVideoViewController.swift, add the following to the bottom of configureCaptureSession():

// 1
let outputRect = CGRect(x: 0, y: 0, width: 1, height: 1)
let videoRect = videoOutput.outputRectConverted(fromMetadataOutputRect: outputRect)
let depthRect = depthOutput.outputRectConverted(fromMetadataOutputRect: outputRect)

// 2
scale = max(videoRect.width, videoRect.height) / max(depthRect.width, depthRect.height)

// 3
do {
  try camera.lockForConfiguration()

  // 4
  if let frameDuration = camera.activeDepthDataFormat?
    .videoSupportedFrameRateRanges.first?.minFrameDuration {
    camera.activeVideoMinFrameDuration = frameDuration
  }

  // 5
  camera.unlockForConfiguration()
} catch {
  fatalError(error.localizedDescription)
}

Here’s the breakdown:

  1. You calculate a CGRect that defines the video and depth output in pixels. The methods map the full metadata output rect to the full resolution of the video and depth data outputs.
  2. Using the CGRect for both the video and depth data output, you calculate the scaling factor between them. You compare the longest side of each rect because the depth data is actually delivered rotated by 90 degrees.
  3. Here you change the AVCaptureDevice configuration, so you need to lock it, which can throw an error.
  4. You then set the AVCaptureDevice’s minimum frame duration (which is the inverse of the maximum frame rate) to be equal to the supported frame rate of the depth data.
  5. Then you unlock the configuration you locked in step #3.
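
To make the scale calculation in step 2 concrete, here is a small worked example. The helper name depthScale(videoRect:depthRect:) and the 1080 x 1920 video size are assumptions for illustration only; the 320 x 240 depth size is the maximum mentioned earlier for the iPhone 7 Plus:

```swift
import Foundation

/// Hypothetical helper mirroring the scale calculation in configureCaptureSession().
/// Comparing the longest sides makes the result independent of a 90-degree rotation.
func depthScale(videoRect: CGRect, depthRect: CGRect) -> CGFloat {
  let videoLongSide = max(videoRect.width, videoRect.height)
  let depthLongSide = max(depthRect.width, depthRect.height)
  return videoLongSide / depthLongSide
}

// Assumed sizes: a portrait video frame and a depth map delivered in landscape.
let video = CGRect(x: 0, y: 0, width: 1080, height: 1920)
let depth = CGRect(x: 0, y: 0, width: 320, height: 240)

// 1920 / 320 = 6, so each depth pixel covers a 6 x 6 block of video pixels.
let exampleScale = depthScale(videoRect: video, depthRect: depth)
```

Had the code divided width by width instead, the rotated depth map would give 1080 / 320, which doesn’t match the true ratio between the two feeds.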

OK, build and run the project. Whether or not you see a difference, your code is now more robust and future-proof. :]

What Can You Do With This Depth Data?

Well, much like in Image Depth Maps Tutorial for iOS: Getting Started, you can use this depth data to create a mask, and then use the mask to filter the original video feed.

You may have noticed a slider at the bottom of the screen for the Mask and Filtered segments. This slider controls the depth focus of the mask.

Currently, that slider seems to do nothing. That’s because there’s no visualization of the mask on the screen. You’re going to change that now!

Go back to depthDataOutput(_:didOutput:timestamp:connection:) in the AVCaptureDepthDataOutputDelegate extension. Just before DispatchQueue.main.async, add the following:

// 1
if previewMode == .mask || previewMode == .filtered {

  // 2
  switch filter {

  // 3
  default:
    mask = depthFilters.createHighPassMask(for: depthMap,
                                           withFocus: sliderValue,
                                           andScale: scale)
  }
}

In this code:

  1. You only create a mask if the Mask or the Filtered segments are active – good on you!
  2. You then switch on the type of filter selected (at the top of the iPhone screen).
  3. Finally, you create a high pass mask as the default case. You’ll fill out other cases soon.

Note: A high pass and a band pass mask are included with the starter project. These are similar to the ones created in Image Depth Maps Tutorial for iOS: Getting Started under the section Creating a Mask.

You still need to hook up the mask to the UIImageView to see it.

Go back to the AVCaptureVideoDataOutputSampleBufferDelegate extension and look for the switch statement in captureOutput(_:didOutput:from:). Add the following case:

case .mask:
  previewImage = mask ?? image

Build and run the project, and tap the Mask segment.

Build and run with mask

As you drag the slider to the left, more of the screen turns white. That’s because you implemented a high pass mask.
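
To see why lowering the focus lets more pixels through, it can help to look at a high pass mask one pixel at a time. The starter project builds its mask with Core Image filters and its exact curve isn’t shown here, so treat the function below as a simplified sketch. The name highPassMaskValue and the default slope of 4 are assumptions:

```swift
/// Hypothetical per-pixel view of a high pass mask. The starter project's
/// createHighPassMask(for:withFocus:andScale:) may use a different curve.
/// `disparity` and `focus` are assumed to lie in 0.0...1.0; `slope` controls
/// how quickly the mask ramps from black (0) to white (1) above the focus.
func highPassMaskValue(disparity: Float, focus: Float, slope: Float = 4) -> Float {
  let ramp = slope * (disparity - focus)
  return min(1, max(0, ramp))
}

// The same pixel (disparity 0.5) under two different focus settings:
let highCutoff = highPassMaskValue(disparity: 0.5, focus: 0.75) // 0.0, blocked
let lowCutoff = highPassMaskValue(disparity: 0.5, focus: 0.25)  // 1.0, passed
```

In this sketch, a lower focus value means a lower cutoff, so more disparity values clear it and come out white.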

Good job! You laid the groundwork for the most exciting part of this tutorial: the filters!

Comic Background Effect

The iOS SDK comes bundled with a bunch of Core Image filters. One that particularly stands out is CIComicEffect. This filter gives an image a printed comic look.

Comic filter off Comic filter on

You’re going to use this filter to turn the background of your video stream into a comic.

Open DepthImageFilters.swift. This class is where all your masks and filters go.

Add the following method to the DepthImageFilters class:

func comic(image: CIImage, mask: CIImage) -> CIImage {

  // 1
  let bg = image.applyingFilter("CIComicEffect")

  // 2
  let filtered = image.applyingFilter("CIBlendWithMask",
                                      parameters: ["inputBackgroundImage": bg,
                                                   "inputMaskImage": mask])

  // 3
  return filtered
}

To break it down:

  1. You apply the CIComicEffect to the input image.
  2. Then you blend the original image with the comic image using the input mask.
  3. Finally, you return the filtered image.
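
If your deployment target is iOS 13 or later, the same idea can be sketched with the typed filter API from CoreImage.CIFilterBuiltins instead of string keys. This is an alternative, not what the tutorial project uses, and comicBuiltins(image:mask:) is a hypothetical name:

```swift
import CoreImage
import CoreImage.CIFilterBuiltins

/// Sketch of the comic effect using the typed filter API (iOS 13 and later).
/// Falls back to the unfiltered image if either filter produces no output.
func comicBuiltins(image: CIImage, mask: CIImage) -> CIImage {
  // Turn the whole frame into a comic first.
  let comic = CIFilter.comicEffect()
  comic.inputImage = image
  guard let background = comic.outputImage else { return image }

  // Blend: the original shows where the mask is white,
  // the comic version shows where the mask is black.
  let blend = CIFilter.blendWithMask()
  blend.inputImage = image
  blend.backgroundImage = background
  blend.maskImage = mask
  return blend.outputImage ?? image
}
```

The typed properties catch misspelled parameter names at compile time, which the string-keyed version can’t do.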

Now, to use the filter, open DepthVideoViewController.swift and find captureOutput(_:didOutput:from:). Remove the default case on the switch statement and add the following case:

case .filtered:

  // 1
  if let mask = mask {

    // 2
    switch filter {

    // 3
    case .comic:
      previewImage = depthFilters.comic(image: image, mask: mask)

    // 4
    default:
      previewImage = image
    }
  } else {

    // 5
    previewImage = image
  }

This code is straightforward. Here’s what’s going on:

  1. You check to see if there is a mask, because you can’t filter without a mask!
  2. You switch on the filter selected in the UI.
  3. If the selected filter is comic, you create a new image based on your comic filter and make that the preview image.
  4. Otherwise, you just keep the video image unchanged.
  5. Finally, you handle the case where mask is nil.

Before you run the code, there’s one more thing you should do to make adding future filters easier.

Find depthDataOutput(_:didOutput:timestamp:connection:), and add the following case to the switch filter statement:

case .comic:
  mask = depthFilters.createHighPassMask(for: depthMap,
                                         withFocus: sliderValue,
                                         andScale: scale)

Here, you create a high pass mask.

This looks exactly the same as the default case. After you add the other filters, you’ll be removing the default case, so it is best to make sure the comic case is in there now.

Go ahead. I know you’re excited to run this. Build and run the project and tap the Filtered segment.

Build and run with comic filter

Fantastic work! Are you feeling like a superhero in a comic book?

No Green Screen? No Problem!

That’s good and all, but maybe you don’t want to work on superhero movies? Perhaps you prefer science fiction instead?

No worries. This next filter will have you jumping for joy on the Moon! For that, you’ll need to create a makeshift green screen effect.

Open DepthImageFilters.swift and add the following method to the class:

func greenScreen(image: CIImage, background: CIImage, mask: CIImage) -> CIImage {

  // 1
  let crop = CIVector(x: 0,
                      y: 0,
                      z: image.extent.size.width,
                      w: image.extent.size.height)

  // 2
  let croppedBG = background.applyingFilter("CICrop",
                                            parameters: ["inputRectangle": crop])

  // 3
  let filtered = image.applyingFilter("CIBlendWithMask",
                                      parameters: ["inputBackgroundImage": croppedBG,
                                                   "inputMaskImage": mask])

  // 4
  return filtered
}

In this filter:

  1. You create a 4D CIVector to define a cropping boundary equal to your input image.
  2. Then you crop the background image to be the same size as the input image – important for the next step.
  3. Next, you combine the input and background images by blending them based on the mask parameter.
  4. Finally, you return the filtered image.
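
The crop in steps 1 and 2 can also be written with CIImage’s cropped(to:) method, which takes a CGRect directly. The variant below is a sketch under that assumption, with the hypothetical name greenScreenCropped, and it uses Core Image’s key constants in place of the string literals:

```swift
import CoreImage

/// Hypothetical variant of greenScreen(image:background:mask:) that crops
/// with CIImage.cropped(to:) instead of a "CICrop" filter and a CIVector.
func greenScreenCropped(image: CIImage, background: CIImage, mask: CIImage) -> CIImage {
  // Cropping to the image's extent also works when the extent
  // doesn't start at (0, 0).
  let croppedBG = background.cropped(to: image.extent)

  // White areas of the mask keep the video; black areas show the background.
  return image.applyingFilter("CIBlendWithMask",
                              parameters: [kCIInputBackgroundImageKey: croppedBG,
                                           kCIInputMaskImageKey: mask])
}
```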

Now you just need to hook up the mask and filter logic for this back in DepthVideoViewController.swift and you’ll be ready to go.

Find captureOutput(_:didOutput:from:) in DepthVideoViewController.swift and add the following case to the switch filter statement:

case .greenScreen:

  // 1
  if let background = background {

    // 2
    previewImage = depthFilters.greenScreen(image: image,
                                            background: background,
                                            mask: mask)
  } else {

    // 3
    previewImage = image
  }

Here:

  1. You make sure the background image exists. It is created in viewDidLoad().
  2. If it exists, filter the input image with the background and the mask using the new function you just wrote.
  3. Otherwise, just use the input video image.

Next, find depthDataOutput(_:didOutput:timestamp:connection:) and add the following case to the switch statement:

case .greenScreen:
  mask = depthFilters.createHighPassMask(for: depthMap,
                                         withFocus: sliderValue,
                                         andScale: scale,
                                         isSharp: true)

This code creates a high pass mask but makes the cutoff sharper (steeper slope).

Build and run the project. Move the slider around and see what objects you can put on the Moon.

Build and run with green screen

Out of this world!

Dream-like Blur Effect

OK, OK. Maybe you don’t like the superhero or science fiction genres. I get it. You’re more of an art film type of person. If so, this next filter is right up your alley.

With this filter, you’re going to blur out anything besides objects at a narrowly defined distance from the camera. This can give a dream-like feeling to your films.

Go back to DepthImageFilters.swift and add a new function to the class:

func blur(image: CIImage, mask: CIImage) -> CIImage {

  // 1
  let blurRadius: CGFloat = 10

  // 2
  let crop = CIVector(x: 0,
                      y: 0,
                      z: image.extent.size.width,
                      w: image.extent.size.height)

  // 3
  let invertedMask = mask.applyingFilter("CIColorInvert")

  // 4
  let blurred = image.applyingFilter("CIMaskedVariableBlur",
                                     parameters: ["inputMask": invertedMask,
                                                  "inputRadius": blurRadius])

  // 5
  let filtered = blurred.applyingFilter("CICrop",
                                        parameters: ["inputRectangle": crop])

  // 6
  return filtered
}

This one is a bit more complicated, but here’s what you did:

  1. You define a blur radius to use – the bigger the radius, the more the blur and the slower it will be!
  2. Once again, you create a 4D CIVector to define a cropping region. This is because blurring will effectively grow the image at the edges and you just want the original size.
  3. Then you invert the mask because the blur filter you’re using blurs where the mask is white.
  4. Next, you apply the CIMaskedVariableBlur filter to the image using the inverted mask and the blur radius as parameters.
  5. You crop the blurred image to maintain the desired size.
  6. Finally, you return the filtered image.

By now, you should know the drill.

Open DepthVideoViewController.swift and add a new case to the switch statement inside captureOutput(_:didOutput:from:):

case .blur:
  previewImage = depthFilters.blur(image: image, mask: mask)

This will create the blur filter when selected in the UI. While you’re there, you can delete the default case, as the switch filter statement is now exhaustive.

Now for the mask.

Replace the default case in the switch statement inside depthDataOutput(_:didOutput:timestamp:connection:) with the following case:

case .blur:
  mask = depthFilters.createBandPassMask(for: depthMap,
                                         withFocus: sliderValue,
                                         andScale: scale)

Here you create a band pass mask for the blur filter to use.
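
As with the high pass mask, the starter project’s band pass implementation isn’t shown in this tutorial. A simplified per-pixel sketch, with the hypothetical name bandPassMaskValue and assumed defaults for the band width and slope, might look like this:

```swift
/// Hypothetical per-pixel view of a band pass mask. The starter project's
/// createBandPassMask(for:withFocus:andScale:) may use a different curve.
/// Pixels whose disparity is within `width / 2` of `focus` are white (1);
/// beyond that, the value ramps down to black (0) at a rate set by `slope`.
func bandPassMaskValue(disparity: Float,
                       focus: Float,
                       width: Float = 0.1,
                       slope: Float = 4) -> Float {
  let distance = abs(disparity - focus)
  let falloff = 1 - slope * (distance - width / 2)
  return min(1, max(0, falloff))
}

let inFocus = bandPassMaskValue(disparity: 0.5, focus: 0.5)    // 1.0, inside the band
let outOfFocus = bandPassMaskValue(disparity: 0.9, focus: 0.5) // 0.0, far outside it
```

Because blur(image:mask:) inverts the mask before blurring, pixels inside the band stay sharp while everything nearer or farther gets blurred.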

It’s time! Build and run this project. Try adjusting the slider in the Mask and Filtered segments as well as changing the filters to see what effects you can create.

Build and run with blur

It’s so dreamy!

Where to Go From Here?

You have accomplished so much in this video depth maps tutorial. Give yourself a well-deserved pat on the back.

If you want, you can download the final project using the link at the top or bottom of this tutorial.

With your new knowledge, you can take this project even further. For instance, the app doesn’t record the filtered video stream; it just displays it. Try adding a button and some logic to save your masterpieces.

You can also add more filters or create your own! Apple’s Core Image Filter Reference has a complete list of the CIFilters that ship with iOS.

We hope you enjoyed this video depth maps tutorial. If you have any questions or comments, please join the forum discussion below!

Team

Each tutorial at www.raywenderlich.com is created by a team of dedicated developers so that it meets our high quality standards. The team members who worked on this tutorial are:

Yono Mittlefehldt

Yono is an indie app and game developer. He is a co-creator of Gus on the Go, a language learning app for kids. Yono speaks 3 languages fluently and 3 more at a novice level. He has been programming since he was 7 years old, because he enjoys it.

Basically, he likes languages and coding. That's his jam.

You can find him on Twitter and sometimes writing blog posts at yonomitt.com
