AVAudioEngine Tutorial for iOS: Getting Started

Learn how to use AVAudioEngine to build the next great podcasting app! Implement audio features to pause, skip, speed up, slow down and change the pitch of audio in your app. By Ryan Ackermann.

Update note: Ryan Ackermann updated this tutorial for iOS 14, Xcode 12 and Swift 5. Scott McAlister wrote the original.

Mention audio processing to most iOS developers, and they’ll give you a look of fear and trepidation. That’s because, prior to iOS 8, it meant diving into the depths of the low-level Core Audio framework — a trip only a few brave souls dared to make. Thankfully, that all changed in 2014 with the release of iOS 8 and AVAudioEngine. This AVAudioEngine tutorial will show you how to use Apple’s new, higher-level audio toolkit to make audio processing apps without the need to dive into Core Audio.

That’s right! No longer do you need to search through obscure pointer-based C/C++ structures and memory buffers to gather your raw audio data. If you understand basic Swift code, this tutorial will guide you through adding audio features to an app.

Swifty looking concerned

In this tutorial, you’ll use AVAudioEngine to build the next great podcasting app: Raycast. :]

The features you’ll implement in this app are:

  • Play a local audio file.
  • View the playback progress.
  • Observe the audio signal level with a VU meter.
  • Skip forward or backward.
  • Change the playback rate and pitch.

When you’re done, you’ll have a fantastic app for listening to podcasts and audio files.

short demo animation of the Raycast app

Getting Started

Download the starter project by clicking the Download Materials button at the top or bottom of the tutorial.

Build and run your project in Xcode, and you’ll see the basic UI:

screenshot of the starter project app

The controls don’t do anything yet. In fact, they’re disabled for now since the audio isn’t ready to play. However, the controls are set up to call their respective view model methods that you’ll implement.

Understanding iOS Audio Frameworks

Before jumping into the project, here’s a quick overview of the iOS Audio frameworks:

chart of different audio tools

adding Player and Effect to AVAudioEngine

  • Core Audio and AudioToolbox are the low-level C frameworks.
  • AVFoundation is an Objective-C/Swift framework.
  • AVAudioEngine, part of AVFoundation, is a class that defines a group of connected audio nodes. You’ll add two nodes to the project: AVAudioPlayerNode and AVAudioUnitTimePitch.

By utilizing these frameworks, you can avoid delving into the low-level processing of audio information and focus on the higher-level features you want to add to your app.
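
To make these pieces concrete, here’s a minimal sketch of the objects you’ll work with. The property names mirror the ones you’ll use later in PlayerViewModel.swift; treat this as an illustration rather than code to type in yet:

import AVFoundation

// The engine manages a graph of connected audio nodes.
let engine = AVAudioEngine()
// A node that produces audio by playing a file or buffer.
let player = AVAudioPlayerNode()
// A node that processes audio by changing its playback rate and pitch.
let timeEffect = AVAudioUnitTimePitch()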

Setting up Audio

Open Models/PlayerViewModel.swift and look inside. At the top, under Public properties, you’ll see all the properties used in the view to lay out the audio player. The methods used to make the player are provided for you to fill out.

Add the following code to setupAudio():

// 1
guard let fileURL = Bundle.main.url(
  forResource: "Intro",
  withExtension: "mp3")
else {
  return
}

do {
  // 2
  let file = try AVAudioFile(forReading: fileURL)
  let format = file.processingFormat
  
  audioLengthSamples = file.length
  audioSampleRate = format.sampleRate
  audioLengthSeconds = Double(audioLengthSamples) / audioSampleRate
  
  audioFile = file
  
  // 3
  configureEngine(with: format)
} catch {
  print("Error reading the audio file: \(error.localizedDescription)")
}

Take a closer look at what’s happening:

  1. This gets the URL of the audio file included in the app bundle.
  2. The audio file is transformed into an AVAudioFile and a few properties are extracted from the file’s metadata.
  3. The final step to prepare an audio file for playback is to set up the audio engine.
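
As a quick check on the length calculation in the second step, the duration in seconds is simply the frame count divided by the sample rate. The numbers below are made up for illustration and aren’t the actual length of Intro.mp3:

import AVFoundation

// Hypothetical values, for illustration only.
let audioLengthSamples: AVAudioFramePosition = 2_646_000
let audioSampleRate: Double = 44_100
// 2,646,000 frames at 44,100 frames per second is 60 seconds of audio.
let audioLengthSeconds = Double(audioLengthSamples) / audioSampleRate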

Add this code to configureEngine(with:):

// 1
engine.attach(player)
engine.attach(timeEffect)

// 2
engine.connect(
  player,
  to: timeEffect,
  format: format)
engine.connect(
  timeEffect,
  to: engine.mainMixerNode,
  format: format)

engine.prepare()

do {
  // 3
  try engine.start()
  
  scheduleAudioFile()
  isPlayerReady = true
} catch {
  print("Error starting the player: \(error.localizedDescription)")
}

Going through this:

  1. Attach the player and time effect nodes to the engine. You must attach nodes before you can connect them. Each node either produces, processes or outputs audio.
  2. Connect the player to the time effect, then connect the time effect to the engine’s main mixer node. By default, the main mixer connects to the engine’s default output node, the iOS device speaker (see the sketch below). prepare() preallocates the resources the engine needs.
  3. Start the engine, which prepares the device to play audio. The isPlayerReady flag is also updated so the interface can enable its controls.
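
If it helps to picture the graph, the two connect(_:to:format:) calls build the chain player → timeEffect → mainMixerNode, and the engine wires the main mixer to its output node for you. Purely as an illustration (the project doesn’t need this), making that last link explicit would look something like:

// Normally unnecessary: AVAudioEngine connects the main mixer
// to the output node, the device speaker, automatically.
engine.connect(
  engine.mainMixerNode,
  to: engine.outputNode,
  format: engine.outputNode.inputFormat(forBus: 0))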

Next, add the following to scheduleAudioFile():

guard
  let file = audioFile,
  needsFileScheduled
else {
  return
}

needsFileScheduled = false
seekFrame = 0

player.scheduleFile(file, at: nil) {
  self.needsFileScheduled = true
}

This schedules the playing of the entire audio file. The at: parameter is the time, an AVAudioTime, in the future when you want the audio to play. Setting it to nil starts playback immediately. The file is only scheduled to play once; tapping play again doesn’t restart it from the beginning, so you’ll need to reschedule it to play it again. When the audio file finishes playing, the completion block sets the needsFileScheduled flag back to true.
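
For reference only, here’s a hedged sketch of what a non-nil start time might look like if you wanted to delay playback by roughly two seconds. It reuses the file and player from above and assumes the engine is already running; the project itself sticks with nil:

if
  let nodeTime = player.lastRenderTime,
  let playerTime = player.playerTime(forNodeTime: nodeTime) {
  // Two seconds ahead of the player's current position on its own timeline.
  let startTime = AVAudioTime(
    sampleTime: playerTime.sampleTime + AVAudioFramePosition(2 * playerTime.sampleRate),
    atRate: playerTime.sampleRate)
  player.scheduleFile(file, at: startTime) {
    self.needsFileScheduled = true
  }
}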

Other variants of scheduling audio for playback include:

  • scheduleBuffer(_:completionHandler:): Use this when you have a buffer preloaded with the audio data.
  • scheduleSegment(_:startingFrame:frameCount:at:completionHandler:): This is like scheduleFile(_:at:), except you specify which audio frame to start playing from and how many frames to play, as in the sketch below.
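
To illustrate the segment variant, this hedged sketch would schedule only the second half of the file, reusing the file, player and audioLengthSamples properties you’ve already set up:

// Play from the midpoint of the file through to the end.
let startFrame = audioLengthSamples / 2
let frameCount = AVAudioFrameCount(audioLengthSamples - startFrame)

player.scheduleSegment(
  file,
  startingFrame: startFrame,
  frameCount: frameCount,
  at: nil) {
  self.needsFileScheduled = true
}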

Next, you’ll address user interaction. Add the following to playOrPause():

// 1
isPlaying.toggle()

if player.isPlaying {
  // 2
  player.pause()
} else {
  // 3
  if needsFileScheduled {
    scheduleAudioFile()
  }
  player.play()
}

Here’s what this is doing:

  1. The isPlaying property toggles to the next state, which changes the Play/Pause button icon.
  2. If the player is currently playing, it’s paused.
  3. Otherwise, playback resumes. If needsFileScheduled is true, the file has to be rescheduled first.

Build and run.

Tap play, and you should hear Ray’s lovely intro to The raywenderlich.com Podcast. :] But there’s no UI feedback: you have no idea how long the file is or where you are in it.

Playing audio without UI feedback.

Adding Progress Feedback

Now that you can hear the audio, how do you go about seeing it? Well, transcriptions aren’t covered in this tutorial. However, you certainly can view the progress of the audio file!

Toward the bottom of Models/PlayerViewModel.swift, add the following to setupDisplayLink():

displayLink = CADisplayLink(target: self, selector: #selector(updateDisplay))
displayLink?.add(to: .current, forMode: .default)
displayLink?.isPaused = true

Hint: You can find methods and properties in a longer file like PlayerViewModel.swift by pressing Control-6 and typing part of the name you’re seeking!

CADisplayLink is a timer object that synchronizes with the display’s refresh rate. You instantiate it with the selector updateDisplay. Then, you add it to a run loop — in this case, the default run loop. Finally, it doesn’t need to start running yet, so set isPaused to true.
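
One detail worth noting: #selector requires a method that’s exposed to Objective-C, which is why the updateDisplay() stub you’ll fill in below is marked @objc. Conceptually, it looks like this:

// CADisplayLink invokes this through the Objective-C runtime,
// so the method must be marked @objc.
@objc private func updateDisplay() {
  // Progress and timer updates go here; you'll add them shortly.
}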

Replace the implementation of playOrPause() with the following:

isPlaying.toggle()

if player.isPlaying {
  displayLink?.isPaused = true
  disconnectVolumeTap()
  
  player.pause()
} else {
  displayLink?.isPaused = false
  connectVolumeTap()
  
  if needsFileScheduled {
    scheduleAudioFile()
  }
  player.play()
}

The key here is to pause or start the display link by setting displayLink?.isPaused when the player state changes. You’ll learn about connectVolumeTap() and disconnectVolumeTap() in the VU Meter section below.

Now, you need to implement the associated UI updates. Add the following to updateDisplay():

// 1
currentPosition = currentFrame + seekFrame
currentPosition = max(currentPosition, 0)
currentPosition = min(currentPosition, audioLengthSamples)

// 2
if currentPosition >= audioLengthSamples {
  player.stop()
  
  seekFrame = 0
  currentPosition = 0
  
  isPlaying = false
  displayLink?.isPaused = true
  
  disconnectVolumeTap()
}

// 3
playerProgress = Double(currentPosition) / Double(audioLengthSamples)

let time = Double(currentPosition) / audioSampleRate
playerTime = PlayerTime(
  elapsedTime: time,
  remainingTime: audioLengthSeconds - time
)

Here’s what’s going on:

  1. The property seekFrame is an offset, initially set to zero, that’s added to or subtracted from currentFrame. Make sure currentPosition doesn’t fall outside the range of the file.
  2. If currentPosition is at the end of the file, then:
     • Stop the player.
     • Reset the seek and current position properties.
     • Pause the display link and reset isPlaying.
     • Disconnect the volume tap.
  3. Update playerProgress to the current position within the audio file. Compute time by dividing currentPosition by the audioSampleRate of the audio file. Update playerTime, which is a struct that takes the two progress values as input.
The interface is already wired up to display playerProgress, elapsedTime, and remainingTime.

Build and run, then tap play/pause. Once again, you’ll hear Ray’s intro, but this time the progress bar and timer labels supply the missing status information.

A progress bar updated to the current playback time.