How to Play, Record, and Merge Videos in iOS and Swift

Learn the basics of working with videos on iOS with AV Foundation in this tutorial. You’ll play, record, and even do some light video editing!


  • Swift 4, iOS 11, Xcode 9
Update note: This tutorial has been updated to iOS 11 and Swift 4 by Owen Brown. The original tutorial was written by Abdul Azeem with fixes and clarifications made by Joseph Neuman.

Recording videos, and playing around with them programmatically, is one of the coolest things you can do with your phone, but not nearly enough apps make use of it. Doing this requires the AV Foundation framework, which has been part of macOS since OS X Lion (10.7) and part of iOS since iOS 4 in 2010.

AV Foundation has grown considerably since then, with well over 100 classes now. This tutorial covers media playback and some light editing to get you started with AV Foundation. In particular, you’ll learn how to:

  • Select and play a video from the media library.
  • Record and save a video to the media library.
  • Merge multiple videos together into a combined video, complete with a custom soundtrack!

I don’t recommend running the code in this tutorial on the simulator, because you’ll have no way to capture video. Plus, you’ll need to figure out a way to add videos to the media library manually. In other words, you really need to test this code on a device! To do that you’ll need to be a registered Apple developer. A free account will work just fine for this tutorial.

Ready? Lights, cameras, action!

Getting Started

Start by downloading the materials for this tutorial (you can find a link at the top or bottom of this tutorial). This project contains a storyboard and several view controllers with the UI for a simple video playback and recording app.

The main screen contains the three buttons below that segue to other view controllers:

  • Select and Play Video
  • Record and Save Video
  • Merge Video

Build and run the project, and test out the buttons; only the three buttons on the initial scene do anything, but you will change that soon!

Select and Play Video

The “Select and Play Video” button on the main screen segues to PlayVideoViewController. In this section of the tutorial, you’ll add the code to select a video file and play it.

Start by opening PlayVideoViewController.swift, and add the following import statements at the top of the file:

import AVKit
import MobileCoreServices

Importing AVKit gives you access to AVPlayerViewController and, because AVKit brings in AV Foundation, to the AVPlayer object that plays the selected video. MobileCoreServices contains predefined constants such as kUTTypeMovie, which you’ll need when selecting videos.

Next, scroll down to the end of the file and add the following class extensions. Make sure you add these to the very bottom of the file, outside the curly braces of the class declaration:

// MARK: - UIImagePickerControllerDelegate
extension PlayVideoViewController: UIImagePickerControllerDelegate {
}

// MARK: - UINavigationControllerDelegate
extension PlayVideoViewController: UINavigationControllerDelegate {
}

These extensions set up the PlayVideoViewController to adopt the UIImagePickerControllerDelegate and UINavigationControllerDelegate protocols. You’ll be using the system-provided UIImagePickerController to allow the user to browse videos in the photo library, and that class communicates back to your app through these delegate protocols. Although the class is named “image picker”, rest assured it works with videos too!

Next, head back to PlayVideoViewController‘s main class definition and add a call to a helper method from VideoHelper to open the image picker. Later, you’ll add helper tools of your own in VideoHelper. Add the following code to playVideo(_:):

VideoHelper.startMediaBrowser(delegate: self, sourceType: .savedPhotosAlbum)

In the code above, you ensure that tapping Play Video will open the UIImagePickerController, allowing the user to select a video file from the media library.

To see what’s under the hood of this method, open VideoHelper.swift. It does the following:

  1. It checks whether the requested source (here, .savedPhotosAlbum) is available on the device. Other sources are the camera itself and the photo library. This check is essential whenever you use a UIImagePickerController to pick media. If you don’t do it, you might try to pick media from a non-existent media library, resulting in crashes or other unexpected issues.
  2. If the source you want is available, it creates a UIImagePickerController object and sets its source and media type.
  3. Finally, it presents the UIImagePickerController modally.
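
If you’d like to see those three steps as code, here is a minimal sketch of what such a helper might look like. It is not the starter project’s actual source: the VideoHelperSketch type and the makeMoviePicker(sourceType:) helper are hypothetical names, and the parameter types are assumptions. It uses the Swift 4 and iOS 11 spellings that the rest of this tutorial uses, so newer toolchains will want the renamed equivalents.

```swift
import UIKit
import MobileCoreServices

// Hypothetical stand-in for the starter project's VideoHelper.
enum VideoHelperSketch {

  // Steps 1 and 2: returns a movie-only picker, or nil if the source is unavailable.
  static func makeMoviePicker(sourceType: UIImagePickerControllerSourceType)
    -> UIImagePickerController? {
    guard UIImagePickerController.isSourceTypeAvailable(sourceType) else { return nil }
    let picker = UIImagePickerController()
    picker.sourceType = sourceType
    picker.mediaTypes = [kUTTypeMovie as String]
    return picker
  }

  // Step 3: presents the picker modally from the delegate view controller.
  static func startMediaBrowser(
    delegate: UIViewController & UINavigationControllerDelegate & UIImagePickerControllerDelegate,
    sourceType: UIImagePickerControllerSourceType) {
    guard let picker = makeMoviePicker(sourceType: sourceType) else { return }
    picker.delegate = delegate
    delegate.present(picker, animated: true, completion: nil)
  }
}
```

Splitting the availability check and configuration from the presentation is just one way to arrange it; it keeps the part that can fail (an unavailable source) separate from the part that touches the view controller hierarchy.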

Now you’re ready to give your project another whirl! Build and run. Tap Select and Play Video on the first screen, and then tap Play Video on the second screen. You should see your videos presented similar to the following screenshot.


Once you see the list of videos, select one. You’ll be taken to another screen that shows the video in detail, along with buttons to cancel, play and choose. If you tap the play button the video will play. However, if you tap the choose button, the app just returns to the Play Video screen! This is because you haven’t implemented any delegate methods to handle choosing a video from the picker.

Back in Xcode, scroll down to the UIImagePickerControllerDelegate class extension in PlayVideoViewController.swift and add the following delegate method implementation:

func imagePickerController(_ picker: UIImagePickerController,
                           didFinishPickingMediaWithInfo info: [String : Any]) {
  // 1
  guard
    let mediaType = info[UIImagePickerControllerMediaType] as? String,
    mediaType == (kUTTypeMovie as String),
    let url = info[UIImagePickerControllerMediaURL] as? URL
    else {
      return
  }

  // 2
  dismiss(animated: true) {
    // 3
    let player = AVPlayer(url: url)
    let vcPlayer = AVPlayerViewController()
    vcPlayer.player = player
    self.present(vcPlayer, animated: true, completion: nil)
  }
}

Here’s what you’re doing in this method:

  1. You get the media type and URL of the selected media, and ensure the type is movie.
  2. You dismiss the image picker.
  3. In the completion block, you create an AVPlayerViewController to play the media.

Build and run. Tap Select and Play Video, then Play Video, and choose a video from the list. You should be able to see the video playing in the media player.



Record and Save Video

Now that you have video playback working, it’s time to record a video using the device’s camera and save it to the media library.

Open RecordVideoViewController.swift, and add the following import:

import MobileCoreServices

You’ll also need to adopt the same protocols as PlayVideoViewController, by adding the following to the end of the file:

extension RecordVideoViewController: UIImagePickerControllerDelegate {
}

extension RecordVideoViewController: UINavigationControllerDelegate {
}

Add the following code to record(_:):

VideoHelper.startMediaBrowser(delegate: self, sourceType: .camera)

It uses the same helper method as in PlayVideoViewController, but it accesses the .camera instead to record video.

Build and run to see what you’ve got so far.

Go to the Record screen and tap Record Video. Instead of the Photo Gallery, the camera UI opens. When the alerts ask for camera and microphone permissions, tap OK. Start recording a video by tapping the red record button at the bottom of the screen, and tap it again when you’re done recording.


Now you can opt to use the recorded video or do a retake. Tap Use Video. You’ll notice that it just dismisses the view controller. That’s because — you guessed it — you haven’t implemented an appropriate delegate method to save the recorded video to the media library.

Add the following method to the UIImagePickerControllerDelegate class extension at the bottom:

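func imagePickerController(_ picker: UIImagePickerController,
                           didFinishPickingMediaWithInfo info: [String : Any]) {
  dismiss(animated: true, completion: nil)

  guard
    let mediaType = info[UIImagePickerControllerMediaType] as? String,
    mediaType == (kUTTypeMovie as String),
    let url = info[UIImagePickerControllerMediaURL] as? URL,
    UIVideoAtPathIsCompatibleWithSavedPhotosAlbum(url.path)
    else {
      return
  }

  // Handle a movie capture
  UISaveVideoAtPathToSavedPhotosAlbum(url.path, self,
    #selector(video(_:didFinishSavingWithError:contextInfo:)), nil)
}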

Don’t worry about the error on that last line of code; you’ll take care of that shortly.

As before, the delegate method gives you a URL pointing to the video. You verify that the app can save the file to the device’s photo album, and if so, save it.

UISaveVideoAtPathToSavedPhotosAlbum is the function provided by the SDK to save videos to the Photos Album. As parameters, you pass the path to the video you want to save as well as a target and action to call back, which will inform you of the status of the save operation.

Add the implementation of the callback to the main class definition next:

@objc func video(_ videoPath: String, didFinishSavingWithError error: Error?, contextInfo info: AnyObject) {
  let title = (error == nil) ? "Success" : "Error"
  let message = (error == nil) ? "Video was saved" : "Video failed to save"
  let alert = UIAlertController(title: title, message: message, preferredStyle: .alert)
  alert.addAction(UIAlertAction(title: "OK", style: UIAlertActionStyle.cancel, handler: nil))
  present(alert, animated: true, completion: nil)
}

The callback method simply displays an alert to the user, announcing whether the video file was saved or not, based on the error status.

Build and run. Record a video and select Use Video when you’re done recording. If you’re asked for permission to save to your video library, tap OK. If the “Video was saved” alert pops up, you just successfully saved your video to the photo library!


Now that you can play videos and record videos, it’s time to take the next step and try some light video editing.

Merging Videos

The final piece of functionality for the app is to do a little editing. Your user will select two videos and a song from the music library, and the app will combine the two videos and mix in the music.

The project already has a starter implementation in MergeVideoViewController.swift. The code here is similar to the code you wrote to play a video. The big difference is that when merging, the user needs to select two videos. That part is already set up, so the user can make two selections that will be stored in firstAsset and secondAsset.

The next step is to add the functionality to select the audio file.

The UIImagePickerController only provides functionality to select video and images from the media library. To select audio files from your music library, you will use the MPMediaPickerController. It works essentially the same as UIImagePickerController, but instead of images and video, it accesses audio files in the media library.

Open MergeVideoViewController.swift and add the following code to loadAudio(_:):

let mediaPickerController = MPMediaPickerController(mediaTypes: .any)
mediaPickerController.delegate = self
mediaPickerController.prompt = "Select Audio"
present(mediaPickerController, animated: true, completion: nil)

The above code creates a new MPMediaPickerController instance and displays it as a modal view controller.

Build and run. Now tap Merge Video, then Load Audio to access the audio library on your device. Of course, you’ll need some audio files on your device. Otherwise, the list will be empty. The songs will also have to be physically present on the device, so make sure you’re not trying to load a song from the cloud.


If you select a song from the list, you’ll notice that nothing happens. That’s right! MPMediaPickerController needs delegate methods! Find the MPMediaPickerControllerDelegate class extension at the bottom of the file and add the following two methods to it:

func mediaPicker(_ mediaPicker: MPMediaPickerController, 
                 didPickMediaItems mediaItemCollection: MPMediaItemCollection) {
  dismiss(animated: true) {
    let selectedSongs = mediaItemCollection.items
    guard let song = selectedSongs.first else { return }
    let url = song.value(forProperty: MPMediaItemPropertyAssetURL) as? URL
    self.audioAsset = (url == nil) ? nil : AVAsset(url: url!)
    let title = (url == nil) ? "Asset Not Available" : "Asset Loaded"
    let message = (url == nil) ? "Audio Not Loaded" : "Audio Loaded"
    let alert = UIAlertController(title: title, message: message, preferredStyle: .alert)
    alert.addAction(UIAlertAction(title: "OK", style: .cancel, handler:nil))
    self.present(alert, animated: true, completion: nil)
  }
}

func mediaPickerDidCancel(_ mediaPicker: MPMediaPickerController) {
  dismiss(animated: true, completion: nil)
}

The code is very similar to the delegate methods for UIImagePickerController. You set the audio asset based on the media item selected via the MPMediaPickerController after ensuring it’s a valid media item. Note that it’s important to only present new view controllers after dismissing the current one, which is why you wrapped the code above inside the completion handler.

Build and run. Go to the Merge Videos screen. Select an audio file and if there are no errors, you should see the “Audio Loaded” message.


You now have all your assets loading correctly. It’s time to merge the various media files into one file. But before you get into that code, you must do a little bit of setup.

Export and Merge

The code to merge your assets will require a completion handler to export the final video to the photos album.
Add the code below to MergeVideoViewController:

func exportDidFinish(_ session: AVAssetExportSession) {
  // Cleanup assets
  firstAsset = nil
  secondAsset = nil
  audioAsset = nil

  guard
    session.status == AVAssetExportSessionStatus.completed,
    let outputURL = session.outputURL
    else {
      return
  }

  let saveVideoToPhotos = {
    PHPhotoLibrary.shared().performChanges({
      PHAssetChangeRequest.creationRequestForAssetFromVideo(atFileURL: outputURL)
    }) { saved, error in
      let success = saved && (error == nil)
      let title = success ? "Success" : "Error"
      let message = success ? "Video saved" : "Failed to save video"
      // This handler may run off the main thread, so hop back before touching UI
      DispatchQueue.main.async {
        let alert = UIAlertController(title: title, message: message, preferredStyle: .alert)
        alert.addAction(UIAlertAction(title: "OK", style: UIAlertActionStyle.cancel, handler: nil))
        self.present(alert, animated: true, completion: nil)
      }
    }
  }

  // Ensure permission to access Photo Library
  if PHPhotoLibrary.authorizationStatus() != .authorized {
    PHPhotoLibrary.requestAuthorization { status in
      if status == .authorized {
        saveVideoToPhotos()
      }
    }
  } else {
    saveVideoToPhotos()
  }
}

Once the export completes successfully, the above code saves the newly exported video to the photo album. You could just display the output video in an AssetBrowser, but it’s easier to copy the output video to the photo album so you can see the final output.

Now, add the following code to merge(_:):

guard
  let firstAsset = firstAsset,
  let secondAsset = secondAsset
  else {
    return
}

// 1 - Create AVMutableComposition object. This object will hold your AVMutableCompositionTrack instances.
let mixComposition = AVMutableComposition()

// 2 - Create two video tracks
guard
  let firstTrack = mixComposition.addMutableTrack(withMediaType: AVMediaType.video,
                                                  preferredTrackID: Int32(kCMPersistentTrackID_Invalid))
  else {
    return
}
do {
  try firstTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, firstAsset.duration),
                                 of: firstAsset.tracks(withMediaType: AVMediaType.video)[0],
                                 at: kCMTimeZero)
} catch {
  print("Failed to load first track")
  return
}

guard
  let secondTrack = mixComposition.addMutableTrack(withMediaType: AVMediaType.video,
                                                   preferredTrackID: Int32(kCMPersistentTrackID_Invalid))
  else {
    return
}
do {
  try secondTrack.insertTimeRange(CMTimeRangeMake(kCMTimeZero, secondAsset.duration),
                                  of: secondAsset.tracks(withMediaType: AVMediaType.video)[0],
                                  at: firstAsset.duration)
} catch {
  print("Failed to load second track")
  return
}

// 3 - Audio track
if let loadedAudioAsset = audioAsset {
  let audioTrack = mixComposition.addMutableTrack(withMediaType: AVMediaType.audio, preferredTrackID: 0)
  do {
    try audioTrack?.insertTimeRange(CMTimeRangeMake(kCMTimeZero,
                                                    CMTimeAdd(firstAsset.duration,
                                                              secondAsset.duration)),
                                    of: loadedAudioAsset.tracks(withMediaType: AVMediaType.audio)[0],
                                    at: kCMTimeZero)
  } catch {
    print("Failed to load Audio track")
  }
}

// 4 - Get path
guard let documentDirectory = FileManager.default.urls(for: .documentDirectory,
                                                       in: .userDomainMask).first else {
  return
}
let dateFormatter = DateFormatter()
dateFormatter.dateStyle = .long
dateFormatter.timeStyle = .short
let date = dateFormatter.string(from: Date())
let url = documentDirectory.appendingPathComponent("mergeVideo-\(date).mov")

// 5 - Create Exporter
guard let exporter = AVAssetExportSession(asset: mixComposition,
                                          presetName: AVAssetExportPresetHighestQuality) else {
  return
}
exporter.outputURL = url
exporter.outputFileType = AVFileType.mov
exporter.shouldOptimizeForNetworkUse = true

// 6 - Perform the Export
exporter.exportAsynchronously() {
  DispatchQueue.main.async {
    self.exportDidFinish(exporter)
  }
}

Here’s a step-by-step breakdown of the above code:

  1. You create an AVMutableComposition object to hold your video and audio tracks and transform effects.
  2. Next, you create two AVMutableCompositionTrack instances for the video and add them to your AVMutableComposition object. Then you insert each of your two videos into its own track.

    Note that insertTimeRange(_:of:at:) allows you to insert a part of a video into your main composition instead of the whole video. This way, you can trim the video to a time range of your choosing.

    In this instance, you want to insert the whole video, so you create a time range from kCMTimeZero to your video asset duration. The at parameter allows you to place your video/audio track wherever you want it in your composition. Notice how the code inserts firstAsset at time zero, and it inserts secondAsset at the end of the first video. This tutorial assumes you want your video assets one after the other. But you can also overlap the assets by playing with the time ranges.

    For working with time ranges, you use CMTime structs. CMTime structs are non-opaque mutable structs representing times, where the time could be a timestamp or a duration.

  3. Similarly, you create a new track for your audio and add it to the main composition. This time you set the audio time range to the sum of the duration of the first and second videos, since that will be the complete length of your video.
  4. Before you can save the final video, you need a path for the saved file. So create a unique file name (based upon the current date) that points to a file in the documents folder.
  5. Next, you prepare to render and export the merged video. To do this, you create an AVAssetExportSession object that transcodes the contents of an AVAsset source object to create an output of the form described by a specified export preset.
  6. After you’ve initialized an export session with the asset that contains the source media, the export preset name (presetName), and the output file type (outputFileType), you start the export running by invoking exportAsynchronously(). Because the code performs the export asynchronously, this method returns immediately. The code calls the completion handler you supply to exportAsynchronously() whether the export fails, completes, or the user canceled. Upon completion, the exporter’s status property indicates whether the export has completed successfully. If it has failed, the value of the exporter’s error property supplies additional information about the reason for the failure.
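
To make the timing in steps 2 and 3 more concrete, here is a small sketch that uses made-up clip lengths. The 5-second and 2.5-second durations and the timescale of 600 are assumptions chosen for illustration. It shows where the second clip lands, how long the soundtrack range is, and how you might describe a trimmed source range instead of a whole clip. It sticks to CoreMedia calls whose spelling hasn’t changed between Swift versions.

```swift
import CoreMedia

// Hypothetical clip lengths, stored as a count of 1/600-second units.
let zero = CMTime(value: 0, timescale: 600)
let firstDuration = CMTime(value: 3000, timescale: 600)   // 5 seconds
let secondDuration = CMTime(value: 1500, timescale: 600)  // 2.5 seconds

// The first clip goes in at zero; the second starts where the first ends.
let secondStart = firstDuration

// The soundtrack spans both clips.
let audioRange = CMTimeRange(start: zero,
                             duration: CMTimeAdd(firstDuration, secondDuration))

// To trim, pass a shorter source range: here, seconds 1 through 3 of a clip.
let trimmedRange = CMTimeRange(start: CMTime(value: 600, timescale: 600),
                               duration: CMTime(value: 1200, timescale: 600))

print(CMTimeGetSeconds(secondStart))          // 5.0
print(CMTimeGetSeconds(audioRange.duration))  // 7.5
print(CMTimeGetSeconds(trimmedRange.end))     // 3.0
```

Storing times as a value over a timescale, rather than as floating-point seconds, is what lets these sums stay exact when clips are laid end to end.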

An AVComposition instance combines media data from multiple file-based sources. At its top level, an AVComposition is a collection of tracks, each presenting media of a specific type such as audio or video. An instance of AVCompositionTrack represents a single track.

Similarly, AVMutableComposition and AVMutableCompositionTrack also present a higher-level interface for constructing compositions. These objects offer insertion, removal, and scaling operations that you’ve seen already and will come up again.
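
As a quick illustration of that track model, the sketch below builds an empty composition with one video track and one audio track and then inspects it. No media is inserted, so it only demonstrates the container structure; the variable names are illustrative and not part of the tutorial project.

```swift
import AVFoundation

let composition = AVMutableComposition()

// Passing the "invalid" ID asks the composition to pick track IDs itself.
let anyTrackID = Int32(kCMPersistentTrackID_Invalid)
let videoTrack = composition.addMutableTrack(withMediaType: .video,
                                             preferredTrackID: anyTrackID)
let audioTrack = composition.addMutableTrack(withMediaType: .audio,
                                             preferredTrackID: anyTrackID)

// Each track carries one media type; the composition is the collection of them.
for track in composition.tracks {
  print(track.mediaType.rawValue, track.trackID)
}

// You can also ask for only the tracks of a given type.
let videoTrackCount = composition.tracks(withMediaType: .video).count
```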

Go ahead, build and run your project!

Select two videos and an audio file, and merge the selected files. If the merge was successful, you should see a “Video saved” message. At this point, your new video should be present in the photo album.


Go to the photo album, or browse using the Select and Play Video screen within the app. You might notice that although the app merged the videos, there are some orientation issues. Portrait video is in landscape mode, and sometimes videos are turned upside down.


This is due to the default AVAsset orientation. All movie and image files recorded using the default iPhone camera application have the video frame set to landscape, and so the iPhone saves the media in landscape mode.

Video Orientation

AVAsset has a preferredTransform property that contains the media orientation information, and it applies this to a media file whenever you view it using the Photos app or QuickTime. In the code above, you haven’t applied a transform to your AVAsset objects, hence the orientation issue.

You can correct this easily by applying the necessary transforms to your AVAsset objects. Because your two video files can have different orientations, each one needs its own transform, which is why merge(_:) gives each video a separate AVMutableCompositionTrack instead of sharing one.

Before you can do this, add the following helper method to VideoHelper:

static func orientationFromTransform(_ transform: CGAffineTransform) 
  -> (orientation: UIImageOrientation, isPortrait: Bool) {
  var assetOrientation = UIImageOrientation.up
  var isPortrait = false
  if transform.a == 0 && transform.b == 1.0 && transform.c == -1.0 && transform.d == 0 {
    assetOrientation = .right
    isPortrait = true
  } else if transform.a == 0 && transform.b == -1.0 && transform.c == 1.0 && transform.d == 0 {
    assetOrientation = .left
    isPortrait = true
  } else if transform.a == 1.0 && transform.b == 0 && transform.c == 0 && transform.d == 1.0 {
    assetOrientation = .up
  } else if transform.a == -1.0 && transform.b == 0 && transform.c == 0 && transform.d == -1.0 {
    assetOrientation = .down
  }
  return (assetOrientation, isPortrait)
}

This code analyzes an affine transform to determine the input video’s orientation.
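
To see what those matrix values mean in practice, the sketch below applies the “portrait” matrix from the first branch to a 1920×1080 frame. The frame size and the tx value of 1080 are assumptions chosen for illustration; real footage may carry different numbers. The rotated bounding box comes out 1080 wide and 1920 tall, which suggests why a portrait clip’s height, rather than its width, is the dimension to compare against the screen width when scaling.

```swift
import CoreGraphics

// Assumed landscape sensor frame.
let naturalSize = CGSize(width: 1920, height: 1080)

// a = 0, b = 1, c = -1, d = 0 is a 90-degree rotation;
// tx shifts the rotated frame back to the origin.
let portraitTransform = CGAffineTransform(a: 0, b: 1, c: -1, d: 0, tx: 1080, ty: 0)

let displayedFrame = CGRect(origin: .zero, size: naturalSize)
  .applying(portraitTransform)

print(displayedFrame.width, displayedFrame.height)  // 1080.0 1920.0
```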

Next, add one more helper method to the class:

static func videoCompositionInstruction(_ track: AVCompositionTrack, asset: AVAsset) 
  -> AVMutableVideoCompositionLayerInstruction {
  let instruction = AVMutableVideoCompositionLayerInstruction(assetTrack: track)
  let assetTrack = asset.tracks(withMediaType: .video)[0]
  let transform = assetTrack.preferredTransform
  let assetInfo = orientationFromTransform(transform)
  var scaleToFitRatio = UIScreen.main.bounds.width / assetTrack.naturalSize.width
  if assetInfo.isPortrait {
    scaleToFitRatio = UIScreen.main.bounds.width / assetTrack.naturalSize.height
    let scaleFactor = CGAffineTransform(scaleX: scaleToFitRatio, y: scaleToFitRatio)
    instruction.setTransform(assetTrack.preferredTransform.concatenating(scaleFactor), at: kCMTimeZero)
  } else {
    let scaleFactor = CGAffineTransform(scaleX: scaleToFitRatio, y: scaleToFitRatio)
    var concat = assetTrack.preferredTransform.concatenating(scaleFactor)
      .concatenating(CGAffineTransform(translationX: 0, y: UIScreen.main.bounds.width / 2))
    if assetInfo.orientation == .down {
      let fixUpsideDown = CGAffineTransform(rotationAngle: CGFloat(Double.pi))
      let windowBounds = UIScreen.main.bounds
      let yFix = assetTrack.naturalSize.height + windowBounds.height
      let centerFix = CGAffineTransform(translationX: assetTrack.naturalSize.width, y: yFix)
      concat = fixUpsideDown.concatenating(centerFix).concatenating(scaleFactor)
    }
    instruction.setTransform(concat, at: kCMTimeZero)
  }
  return instruction
}

This method takes a track and asset, and returns an AVMutableVideoCompositionLayerInstruction which wraps the affine transform needed to get the video right side up. Here’s what’s going on, step-by-step:

  • You create an AVMutableVideoCompositionLayerInstruction and associate it with the track passed in.
  • Next, you create an AVAssetTrack object from your AVAsset. An AVAssetTrack object provides the track-level inspection interface for all assets. You need this object in order to access the preferredTransform and dimensions of the asset.
  • Then, you save the preferred transform and the amount of scale required to fit the video to the current screen. You’ll use these values in the following steps.
  • If the video is in portrait, you need to recalculate the scale factor, since the default calculation is for videos in landscape. Then all you need to do is apply the orientation rotation and scale transforms.
  • If the video is in landscape, there is a similar set of steps to apply the scale and transform. There’s one extra check since the video could have been produced in either landscape left or landscape right. Because there are “two landscapes” the aspect ratio will match but it’s possible the video will be rotated 180 degrees. The extra check for a video orientation of .down will handle this case.

With the helper methods set up, find merge(_:) and insert the following between sections #2 and #3:

// 2.1
let mainInstruction = AVMutableVideoCompositionInstruction()
mainInstruction.timeRange = CMTimeRangeMake(kCMTimeZero, 
                                            CMTimeAdd(firstAsset.duration, secondAsset.duration))

// 2.2
let firstInstruction = VideoHelper.videoCompositionInstruction(firstTrack, asset: firstAsset)
firstInstruction.setOpacity(0.0, at: firstAsset.duration)
let secondInstruction = VideoHelper.videoCompositionInstruction(secondTrack, asset: secondAsset)

// 2.3
mainInstruction.layerInstructions = [firstInstruction, secondInstruction]
let mainComposition = AVMutableVideoComposition()
mainComposition.instructions = [mainInstruction]
mainComposition.frameDuration = CMTimeMake(1, 30)
mainComposition.renderSize = CGSize(width: UIScreen.main.bounds.width, height: UIScreen.main.bounds.height)

Earlier, merge(_:) set up two separate AVMutableCompositionTrack instances. That means you need to apply an AVMutableVideoCompositionLayerInstruction to each track in order to fix the orientation separately.

2.1: First, you set up mainInstruction to wrap the entire set of instructions. Note that the total time here is the sum of the first asset’s duration and the second asset’s duration.

2.2: Next, you set up the two instructions — one for each asset — using the helper method you defined earlier. The instruction for the first video needs one extra addition: you set its opacity to 0 at the end so it becomes invisible when the second video starts.

2.3: Now that you have your AVMutableVideoCompositionLayerInstruction instances for the first and second tracks, you simply add them to the main AVMutableVideoCompositionInstruction object. Next, you add your mainInstruction object to the instructions property of an instance of AVMutableVideoComposition. You also set the frame rate for the composition to 30 frames/second.

Now that you’ve got an AVMutableVideoComposition object configured, all you need to do is assign it to your exporter. Insert the following code at the end of section #5 (just before exportAsynchronously()):

exporter.videoComposition = mainComposition

Whew – that’s it!

Build and run your project. If you create a new video by combining two videos (and optionally an audio file), you will see that the orientation issues disappear when you play back the new merged video.


Where to Go From Here?

You can download the final project using the link at the top or bottom of this tutorial.

If you followed along, you should now have a good understanding of how to play video, record video, and merge multiple videos and audio in your apps.

AV Foundation gives you a lot of flexibility when playing around with videos. You can also apply any kind of CGAffineTransform to merge, scale, or position videos.

If you haven’t already done so, I would recommend that you have a look at the WWDC videos on AV Foundation, such as WWDC 2016 session 503, Advances in AVFoundation Playback. Also, be sure to check out the Apple AV Foundation Framework documentation.

I hope this tutorial has been useful to get you started with video manipulation in iOS. If you have any questions, comments, or suggestions for improvement, please join the forum discussion below!