AVAudioEngine Tutorial for iOS: Getting Started

In this AVAudioEngine tutorial, you’ll learn how to add advanced audio functionality using Apple’s higher-level audio toolkit.

Version

  • Swift 4, iOS 11, Xcode 9

Mention audio processing to most iOS developers, and they’ll give you a look of fear and trepidation. That’s because, prior to iOS 8, it meant diving into the depths of the low-level Core Audio framework — a trip only a few brave souls dared to make. Thankfully, that all changed in 2014 with the release of iOS 8 and AVAudioEngine. This AVAudioEngine tutorial will show you how to use Apple’s higher-level audio toolkit to make audio processing apps without needing to dive into Core Audio.

That’s right! No longer do you need to search through obscure pointer-based C/C++ structs and memory buffers to gather your raw audio data.

In this AVAudioEngine tutorial, you’ll use AVAudioEngine to build the next great podcasting app: Raycast. More specifically, you’ll add the audio functionality controlled by the UI: play/pause button, skip forward/back buttons, progress bar and playback rate selector. When you’re done, you’ll have a fantastic app for listening to Dru and Janie.

Getting Started

To get started, download the materials for this tutorial (you can find a link at the top or bottom of this tutorial). Build and run your project in Xcode, and you’ll see the basic UI.

The controls don’t do anything yet, but they’re all connected to IBOutlets and associated IBActions in the view controllers.

iOS Audio Framework Introduction

Before jumping into the project, here’s a quick overview of the iOS Audio frameworks:

  • Core Audio and AudioToolbox are the low-level C frameworks.
  • AVFoundation is an Objective-C/Swift framework.
  • AVAudioEngine is a part of AVFoundation.
  • AVAudioEngine is a class that defines a group of connected audio nodes. You’ll be adding two nodes to the project: AVAudioPlayerNode and AVAudioUnitTimePitch.

Setup Audio

Open ViewController.swift and take a look inside. At the top, you’ll see all of the connected outlets and class variables. The actions are also connected to the appropriate outlets in the storyboard.

Add the following code to setupAudio():

// 1
audioFileURL = Bundle.main.url(forResource: "Intro", withExtension: "mp4")

// 2
engine.attach(player)
engine.connect(player, to: engine.mainMixerNode, format: audioFormat)
engine.prepare()

do {
  // 3
  try engine.start()
} catch let error {
  print(error.localizedDescription)
}

Take a closer look at what’s happening:

  1. This gets the URL of the audio file in the app bundle. Setting audioFileURL triggers its didSet block, in the variable declaration section above, which instantiates audioFile.
  2. Attach the player node to the engine, which you must do before connecting other nodes. Nodes either produce, process or output audio. The audio engine provides a main mixer node, and you connect the player node to it. By default, the main mixer connects to the engine’s default output node — the iOS device speaker. prepare() preallocates needed resources.
  3. Start the engine. start() throws if the engine fails to start, so you call it inside a do/catch block and log any error.
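
For reference, the didSet mentioned in step 1 might look something like this — a sketch only, since the starter project’s exact property declarations may differ:

```swift
import AVFoundation

// Sketch of the property declarations referenced in step 1.
// The starter project's exact didSet body may differ.
var audioFile: AVAudioFile?
var audioFileURL: URL? {
  didSet {
    if let audioFileURL = audioFileURL {
      // Opening the file for reading also exposes its
      // processing format and length in frames.
      audioFile = try? AVAudioFile(forReading: audioFileURL)
    }
  }
}
```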

Next, add the following to scheduleAudioFile():

guard let audioFile = audioFile else { return }

skipFrame = 0
player.scheduleFile(audioFile, at: nil) { [weak self] in
  self?.needsFileScheduled = true
}

This schedules the playing of the entire audioFile. The at: parameter is the time (AVAudioTime) in the future at which you want the audio to play; passing nil starts playback immediately. The file is only scheduled to play once — tapping the Play button again doesn’t restart it from the beginning, so you’ll need to reschedule it to play it again. When the audio file finishes playing, the completion block sets the needsFileScheduled flag.

There are other variants of scheduling audio for playback:

  • scheduleBuffer(AVAudioPCMBuffer, completionHandler: AVAudioNodeCompletionHandler? = nil): This provides a buffer preloaded with the audio data.
  • scheduleSegment(AVAudioFile, startingFrame: AVAudioFramePosition, frameCount: AVAudioFrameCount, at: AVAudioTime?, completionHandler: AVAudioNodeCompletionHandler? = nil): This is like scheduleFile except you specify which audio frame to start playing from and how many frames to play.
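
To illustrate the buffer variant, here’s a sketch that reads an entire file into an AVAudioPCMBuffer and schedules that instead of the file — the helper name is hypothetical:

```swift
import AVFoundation

// Hypothetical helper: load a whole file into a PCM buffer,
// then schedule the buffer rather than the file itself.
func scheduleAsBuffer(_ file: AVAudioFile, on player: AVAudioPlayerNode) {
  guard let buffer = AVAudioPCMBuffer(
    pcmFormat: file.processingFormat,
    frameCapacity: AVAudioFrameCount(file.length)) else { return }
  try? file.read(into: buffer)  // fill the buffer with the file's samples
  player.scheduleBuffer(buffer, at: nil) {
    print("buffer playback finished")
  }
}
```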

Then, add the following to playTapped(_:):

// 1
sender.isSelected = !sender.isSelected

// 2
if player.isPlaying {
  player.pause()
} else {
  if needsFileScheduled {
    needsFileScheduled = false
    scheduleAudioFile()
  }
  player.play()
}

Here’s the breakdown:

  1. Toggle the selection state of the button, which changes the button image as set in the storyboard.
  2. Use player.isPlaying to determine whether the player is currently playing. If so, pause it; if not, play. You also check needsFileScheduled and reschedule the file if required.

Build and run, then tap the playPauseButton. You should hear Ray’s lovely intro to The raywenderlich.com Podcast. :] But, there’s no UI feedback; you have no idea how long the file is or where you are in it.

Add Progress Feedback

Add the following to the end of viewDidLoad():

updater = CADisplayLink(target: self, selector: #selector(updateUI))
updater?.add(to: .current, forMode: .defaultRunLoopMode)
updater?.isPaused = true

CADisplayLink is a timer object that synchronizes with the display’s refresh rate. You instantiate it with the selector updateUI. Then, you add it to a run loop — in this case, the default run loop. Finally, it doesn’t need to start running yet, so you set isPaused to true.

Replace the implementation of playTapped(_:) with the following:

sender.isSelected = !sender.isSelected

if player.isPlaying {
  disconnectVolumeTap()
  updater?.isPaused = true
  player.pause()
} else {
  if needsFileScheduled {
    needsFileScheduled = false
    scheduleAudioFile()
  }
  connectVolumeTap()
  updater?.isPaused = false
  player.play()
}

The key addition here is pausing the UI updates with updater?.isPaused = true when the player pauses. You’ll learn about connectVolumeTap() and disconnectVolumeTap() in the VU Meter section below.

Replace var currentFrame: AVAudioFramePosition = 0 with the following:

var currentFrame: AVAudioFramePosition {
  // 1
  guard
    let lastRenderTime = player.lastRenderTime,
    // 2
    let playerTime = player.playerTime(forNodeTime: lastRenderTime)
    else {
      return 0
  }
  
  // 3
  return playerTime.sampleTime
}

currentFrame returns the last audio sample rendered by player. Here’s a closer look:

  1. player.lastRenderTime returns the time in reference to engine start time. If engine is not running, lastRenderTime returns nil.
  2. player.playerTime(forNodeTime:) converts lastRenderTime to time relative to player start time. If player is not playing, then playerTime returns nil.
  3. sampleTime is time as a number of audio samples within the audio file.
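
Since sampleTime counts audio frames, converting it to seconds is just a division by the sample rate. A quick worked example, using hypothetical values for a 44.1 kHz file:

```swift
// Frames → seconds: divide the frame count by the sample rate.
// 44,100 Hz and 220,500 frames are made-up example values.
let audioSampleRate = 44_100.0
let frames: Int64 = 220_500
let seconds = Double(frames) / audioSampleRate  // 5.0
```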

Now for the UI updates. Add the following to updateUI():

// 1
currentPosition = currentFrame + skipFrame
currentPosition = max(currentPosition, 0)
currentPosition = min(currentPosition, audioLengthSamples)

// 2
progressBar.progress = Float(currentPosition) / Float(audioLengthSamples)
let time = Float(currentPosition) / audioSampleRate
countUpLabel.text = formatted(time: time)
countDownLabel.text = formatted(time: audioLengthSeconds - time)

// 3
if currentPosition >= audioLengthSamples {
  player.stop()
  updater?.isPaused = true
  playPauseButton.isSelected = false
  disconnectVolumeTap()
}

Let’s step through this:

  1. The property skipFrame is an offset added to or subtracted from currentFrame, initially set to zero. Make sure currentPosition doesn’t fall outside the range of the file.
  2. Update progressBar.progress to currentPosition within audioFile. Compute time by dividing currentPosition by sampleRate of audioFile. Update countUpLabel and countDownLabel text to current time within audioFile.
  3. If currentPosition is at the end of the file, then:
    • Stop the player.
    • Pause the timer.
    • Reset the playPauseButton selection state.
    • Disconnect the volume tap.

Build and run, then tap the playPauseButton. Once again, you’ll hear Ray’s intro, but this time the progressBar and timer labels supply the missing status information.

Implement the VU Meter

Now it’s time for you to add the VU Meter functionality. It’s a UIView positioned to fit between the pause icon’s bars. The height of the view is determined by the average power of the playing audio. This is your first opportunity for some audio processing.

You’ll compute the average power over a 1,024-frame buffer of audio samples. A common way to determine the average power of a buffer of audio samples is to calculate the Root Mean Square (RMS) of the samples.

Average power is the representation, in decibels, of the average value of a range of audio sample data. There’s also peak power, which is the max value in a range of sample data.
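
As a small worked example (with made-up sample values), the RMS and its decibel equivalent come out like this:

```swift
import Foundation

// Hypothetical buffer of four samples.
let samples: [Float] = [0.5, -0.5, 0.5, -0.5]

// RMS: square each sample, average the squares, take the square root.
let rms = sqrt(samples.map { $0 * $0 }.reduce(0, +) / Float(samples.count))
// rms == 0.5

// Average power in decibels: 20 · log10(rms).
let avgPower = 20 * log10(rms)  // ≈ -6.02 dB
```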

Add the following helper method below connectVolumeTap():

func scaledPower(power: Float) -> Float {
  // 1
  guard power.isFinite else { return 0.0 }

  // 2
  if power < minDb {
    return 0.0
  } else if power >= 1.0 {
    return 1.0
  } else {
    // 3
    return (fabs(minDb) - fabs(power)) / fabs(minDb)
  }
}

scaledPower(power:) converts the negative power decibel value to a value between 0.0 and 1.0 suitable for setting the volumeMeterHeight.constant value above. Here’s what it does:

  1. power.isFinite checks to make sure power is a valid value — i.e., not NaN — returning 0.0 if it isn’t.
  2. This sets the dynamic range of our vuMeter to 80db. For any value below -80.0, return 0.0. Decibel values on iOS have a range of -160db, near silent, to 0db, maximum power. minDb is set to -80.0, which provides a dynamic range of 80db. You can alter this value to see how it affects the vuMeter.
  3. Compute the scaled value between 0.0 and 1.0.
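
You can sanity-check the scaling with a value in the middle of the range: -40 dB is halfway through the 80 dB window, so it should map to 0.5. Here’s a self-contained copy of the function for experimenting:

```swift
import Foundation

let minDb: Float = -80.0

// Self-contained copy of scaledPower(power:) for experimentation.
func scaledPower(power: Float) -> Float {
  guard power.isFinite else { return 0.0 }
  if power < minDb {
    return 0.0
  } else if power >= 1.0 {
    return 1.0
  } else {
    return (fabs(minDb) - fabs(power)) / fabs(minDb)
  }
}

scaledPower(power: -40.0)   // 0.5: halfway through the range
scaledPower(power: -100.0)  // 0.0: below the dynamic range
scaledPower(power: 0.0)     // 1.0: maximum power
```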

Now, add the following to connectVolumeTap():

// 1
let format = engine.mainMixerNode.outputFormat(forBus: 0)
// 2
engine.mainMixerNode.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, when in
  // 3
  guard 
    let channelData = buffer.floatChannelData,
    let updater = self.updater 
    else {
      return
  }

  let channelDataValue = channelData.pointee
  // 4
  let channelDataValueArray = stride(from: 0, 
                                     to: Int(buffer.frameLength),
                                     by: buffer.stride).map{ channelDataValue[$0] }
  // 5
  let rms = sqrt(channelDataValueArray.map{ $0 * $0 }.reduce(0, +) / Float(buffer.frameLength))
  // 6
  let avgPower = 20 * log10(rms)
  // 7
  let meterLevel = self.scaledPower(power: avgPower)

  DispatchQueue.main.async {
    self.volumeMeterHeight.constant = !updater.isPaused ? 
           CGFloat(min((meterLevel * self.pauseImageHeight), self.pauseImageHeight)) : 0.0
  }
}

There’s a lot going on here, so here’s the breakdown:

  1. Get the data format for the mainMixerNode‘s output.
  2. installTap(onBus: 0, bufferSize: 1024, format: format) gives you access to the audio data on the mainMixerNode’s output bus. You request a buffer size of 1,024 frames, but the requested size isn’t guaranteed, especially if the size you request is too small or too large; Apple’s documentation doesn’t specify what those limits are. The tap block receives an AVAudioPCMBuffer and an AVAudioTime as parameters. Check buffer.frameLength to determine the actual buffer size; when provides the capture time of the buffer.
  3. buffer.floatChannelData gives you an array of pointers to each channel’s sample data. channelDataValue = channelData.pointee is the pointer to the first channel’s samples — an UnsafeMutablePointer<Float>.
  4. Converting from an array of UnsafeMutablePointer<Float> to an array of Float makes later calculations easier. To do that, use stride(from:to:by:) to create an array of indexes into channelDataValue. Then map{ channelDataValue[$0] } to access and store the data values in channelDataValueArray.
  5. Computing the RMS involves a map/reduce/divide operation. First, the map operation squares all of the values in the array, and the reduce operation sums the squares. Divide the sum of the squares by the buffer size, then take the square root, producing the RMS of the audio sample data in the buffer. This is a value between 0.0 and 1.0 — and exactly 0.0 for a silent buffer.
  6. Convert the RMS to decibels (Acoustic Decibel reference). This should be a value between -160 and 0, but if rms is 0, log10 returns negative infinity — a non-finite value the next step guards against.
  7. Scale the decibels into a value suitable for your vuMeter.

Finally, add the following to disconnectVolumeTap():

engine.mainMixerNode.removeTap(onBus: 0)
volumeMeterHeight.constant = 0

AVAudioEngine allows only a single tap per bus, so it’s good practice to remove the tap when it’s not in use.

Build and run, then tap playPauseButton. The vuMeter is now active, providing average power feedback of the audio data.

Implementing Skip

Time to implement the skip forward and back buttons. skipForwardButton jumps ahead 10 seconds into the audio file, and skipBackwardButton jumps back 10 seconds.

Add the following to seek(to:):

guard 
  let audioFile = audioFile,
  let updater = updater 
  else {
    return
}

// 1
skipFrame = currentPosition + AVAudioFramePosition(time * audioSampleRate)
skipFrame = max(skipFrame, 0)
skipFrame = min(skipFrame, audioLengthSamples)
currentPosition = skipFrame

// 2
player.stop()

if currentPosition < audioLengthSamples {
  updateUI()
  needsFileScheduled = false

  // 3
  player.scheduleSegment(audioFile, 
                         startingFrame: skipFrame, 
                         frameCount: AVAudioFrameCount(audioLengthSamples - skipFrame), 
                         at: nil) { [weak self] in
    self?.needsFileScheduled = true
  }

  // 4
  if !updater.isPaused {
    player.play()
  }
}

Here's the play-by-play:

  1. Convert time, which is in seconds, to a frame position by multiplying it by audioSampleRate, and add it to currentPosition. Then, clamp skipFrame so it isn’t before the start of the file or past the end.
  2. player.stop() not only stops playback, but it also clears all previously scheduled events. Call updateUI() to set the UI to the new currentPosition value.
  3. player.scheduleSegment(_:startingFrame:frameCount:at:) schedules playback starting at skipFrame position of audioFile. frameCount is the number of frames to play. You want to play to the end of file, so set it to audioLengthSamples - skipFrame. Finally, at: nil specifies to start playback immediately instead of at some time in the future.
  4. If player was playing before skip was called, then call player.play() to resume playback. updater.isPaused is convenient for determining this, because it is only true if player was previously paused.
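
With seek(to:) in place, the skip button actions become thin wrappers around it — a sketch, assuming IBAction names along these lines (the starter project’s actual names may differ):

```swift
// Hypothetical IBActions — the starter project defines its own.
@IBAction func skipForwardTapped(_ sender: UIButton) {
  seek(to: 10.0)   // jump ahead 10 seconds
}

@IBAction func skipBackwardTapped(_ sender: UIButton) {
  seek(to: -10.0)  // jump back 10 seconds
}
```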

Build and run, then tap the playPauseButton. Tap skipBackwardButton and skipForwardButton to skip forward and back. Watch as the progressBar and count labels change.

Implementing Rate Change

The last thing to implement is changing the rate of playback. Listening to podcasts at higher than 1x speeds is a popular feature these days.

In setupAudio(), replace the following:

engine.attach(player)
engine.connect(player, to: engine.mainMixerNode, format: audioFormat)

with:

engine.attach(player)
engine.attach(rateEffect)
engine.connect(player, to: rateEffect, format: audioFormat)
engine.connect(rateEffect, to: engine.mainMixerNode, format: audioFormat)

This attaches and connects rateEffect, an AVAudioUnitTimePitch node, to the audio graph. This node type is an effects node; specifically, it can change the playback rate and pitch-shift the audio.

The didChangeRateValue() action handles changes to rateSlider. It computes an index into the rateSliderValues array and sets rateValue, which in turn sets rateEffect.rate. rateSlider has a value range of 0.5x to 3.0x.
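
Here’s a sketch of how such a handler might map the slider to discrete rates — the rateSliderValues list and the index math are illustrative, not necessarily what the starter project does:

```swift
// Hypothetical discrete rates; the starter project defines its own list.
let rateSliderValues: [Float] = [0.5, 1.0, 1.25, 1.5, 2.0, 2.5, 3.0]

// Map a continuous slider value in 0...1 to the nearest discrete rate.
func nearestRate(forNormalizedValue value: Float) -> Float {
  let index = Int((value * Float(rateSliderValues.count - 1)).rounded())
  return rateSliderValues[index]
}
```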

Build and run, then tap the playPauseButton. Adjust rateSlider to hear what Ray sounds like when he has had too much or too little coffee.

Where to Go From Here?

You can download the final project using the link at the top or bottom of this tutorial.

Look at the other effects you can add to audioSetup(). One option is to wire up a pitch shift slider to rateEffect.pitch and make Ray sound like a chipmunk. :]
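
AVAudioUnitTimePitch measures pitch in cents (100 cents per semitone, with a range of -2,400 to 2,400), so a hypothetical pitch slider action could be as simple as:

```swift
// Hypothetical IBAction — wire a slider with range -2400...2400 to it.
@IBAction func didChangePitchValue(_ sender: UISlider) {
  // +1200 cents = one octave up: instant chipmunk.
  rateEffect.pitch = sender.value
}
```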

To learn more about AVAudioEngine and related iOS audio topics, check out Apple’s AVAudioEngine documentation and the WWDC sessions that cover it.

We hope you enjoyed this tutorial on AVAudioEngine. If you have any questions or comments, please join the forum discussion below!
