Audio Tutorial for iOS: File and Data Formats [2014 Edition]

Audrey Tam


Before working with the iPhone, I had sadly little experience with sound formats. I knew the difference between .WAVs and .MP3s, but for the life of me I couldn’t tell you exactly what a .AAC or a .CAF was, or what the best way to convert audio files was on the Mac.

I’ve learned that if you want to develop on the iPhone, it really pays to have a basic understanding of file and data formats, conversion, recording, and which APIs to use when.

This is the first in a three-part Audio Tutorial series covering audio topics of interest to iPhone developers. In this article, we’ll start with file and data formats.

(Jump to Part 2 or Part 3 in the Audio Tutorial series.)

File Formats and Data Formats, Oh My!

The thing to understand is that there are actually two pieces to every audio file: its file format (or audio container), and its data format (or audio encoding).

File formats (or audio containers) describe the format of the file itself. The actual audio data inside can be encoded in many different ways. For example, CAF is a file format that can contain audio encoded as MP3, linear PCM, and many other data formats.

So let’s dig into each of these more thoroughly.

Data Formats (or Audio Encoding)

We’re going to start with the audio encoding rather than the file format, because the encoding is the most important part.

Here are the data formats supported by the iPhone, and a description of each:

  • AAC: AAC stands for “Advanced Audio Coding”, and it was designed to be the successor of MP3. As you would guess, it compresses the original sound, resulting in disk savings but lower quality. However, the loss of quality is not always noticeable depending on what you set the bit rate to (more on this later). In practice, AAC usually does better compression than MP3, especially at bit rates below 128kbit/s (again more on this later).
  • HE-AAC: HE-AAC is a superset of AAC, where the HE stands for “high efficiency.” It adds extra coding tools on top of AAC and is optimized for low bit rate audio such as streaming audio.
  • AMR: AMR stands for “Adaptive Multi-Rate” and is an encoding optimized for speech, featuring very low bit rates.
  • ALAC: Also known as “Apple Lossless”, this is an encoding that compresses the audio data without losing any quality. In practice, the compressed data is about 40-60% of the original size. The algorithm was designed so that data could be decompressed at high speeds, which is good for devices such as the iPod or iPhone.
  • iLBC: This is another encoding optimized for speech, good for voice over IP and streaming audio.
  • IMA4: This is a compression format that gives you 4:1 compression on 16-bit audio files. This is an important encoding for the iPhone, for reasons we will discuss later.
  • linear PCM: This stands for linear pulse code modulation, and describes the technique used to convert analog sound data into a digital format. In simple terms, this just means uncompressed data. Since the data is uncompressed, it is the fastest to play and is the preferred encoding for audio on the iPhone when space is not an issue.
  • μ-law and A-law: These are alternative encodings for converting analog data into digital format, and they are more optimized for speech than linear PCM.
  • MP3: And of course the format we all know and love, MP3. MP3 is still a very popular format after all of these years, and is supported by the iPhone.
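
To get a feel for the compression figures in the list above, here is a rough sketch in Python (not iOS code) that estimates compressed sizes from a linear PCM size. The function names and the 1 MB input are made up for illustration, and the numbers are only the approximate ratios quoted above, ignoring any container or packet overhead.

```python
def ima4_size(pcm16_bytes: int) -> int:
    """Approximate IMA4 size, using the 4:1 ratio quoted for 16-bit audio."""
    return pcm16_bytes // 4


def alac_size_range(pcm_bytes: int) -> tuple[int, int]:
    """Approximate ALAC size range, using the 40-60% figure quoted above."""
    return (pcm_bytes * 40 // 100, pcm_bytes * 60 // 100)


pcm_bytes = 1_000_000  # hypothetical 1 MB of 16-bit linear PCM

print(ima4_size(pcm_bytes))        # 250000
print(alac_size_range(pcm_bytes))  # (400000, 600000)
```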

For more information about these types see Apple’s Using Audio.

So which do I use?

That looks like a big list, but there are actually just a few that are the preferred encodings to use. To know which to use, you have to first keep this in mind:

  • Linear PCM, IMA4, and a few other formats that are uncompressed or only lightly compressed decode quickly, so you can play several of them simultaneously with no issues.
  • For more advanced compression methods such as AAC, MP3, and ALAC, the iPhone does have hardware support to decompress the data quickly – but the problem is it can only handle one file at a time. Therefore, if you play more than one of these encodings at a time, they will be decompressed in software, which is slow.

So to pick your data format, here are a couple of rules that generally apply:

  • If space is not an issue, just encode everything with linear PCM. Not only is this the fastest way for your audio to play, but you can play multiple sounds simultaneously without running into any CPU resource issues.
  • If space is an issue, most likely you’ll want to use AAC encoding for your background music and IMA4 encoding for your sound effects.
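
If it helps, the two rules above can be written down as a tiny decision function. This is just a sketch in Python of the rule of thumb, not an API from the iPhone SDK; the function and parameter names are invented for illustration.

```python
def pick_encoding(space_is_tight: bool, is_background_music: bool) -> str:
    """Apply the rule of thumb: PCM when space allows, else AAC or IMA4."""
    if not space_is_tight:
        # Uncompressed audio plays fastest and mixes freely with other sounds.
        return "linear PCM"
    # Only one hardware-decoded stream at a time, so reserve AAC for the
    # background music and use the lightly compressed IMA4 for effects.
    return "AAC" if is_background_music else "IMA4"


print(pick_encoding(space_is_tight=False, is_background_music=True))  # linear PCM
print(pick_encoding(space_is_tight=True, is_background_music=True))   # AAC
print(pick_encoding(space_is_tight=True, is_background_music=False))  # IMA4
```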

The Many Variants of Linear PCM

One final, important note about linear PCM encoding, which again is the preferred uncompressed data format for the iPhone: there are several variants of linear PCM depending on how the data is stored. The data can be stored in big or little endian formats, as floats or integers, and in varying bit-widths.

The most important thing to know here is that the preferred variant of linear PCM on the iPhone is little-endian integer 16-bit, or LEI16 for short. Note that this differs from the preferred variant on Mac OS X, which is native-endian floating point 32-bit. Because audio files are often created on the Mac, it’s a good idea to examine the files and convert them to the preferred format for the iPhone.
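
To make the variants concrete, here is a small Python sketch that stores the same sample in a few of these layouts using the standard `struct` module. The sample value and the `float_to_lei16` helper are hypothetical, and the scaling by 32767 is one common convention, not necessarily what Apple’s converters do.

```python
import struct

sample = 12345  # hypothetical signed 16-bit sample value (0x3039)

lei16 = struct.pack("<h", sample)            # little-endian integer 16-bit (LEI16)
bei16 = struct.pack(">h", sample)            # big-endian integer 16-bit
lef32 = struct.pack("<f", sample / 32768.0)  # little-endian 32-bit float

print(lei16.hex())  # 3930
print(bei16.hex())  # 3039
print(len(lef32))   # 4


def float_to_lei16(value: float) -> bytes:
    """Convert a float sample in [-1.0, 1.0] to LEI16 bytes (assumed scaling)."""
    clamped = max(-1.0, min(1.0, value))
    return struct.pack("<h", int(round(clamped * 32767)))


print(float_to_lei16(1.0).hex())  # ff7f
```

The same number takes two bytes as LEI16 and four as a 32-bit float, and the byte order flips between the little-endian and big-endian versions.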

File Formats (or Audio Containers)

The iPhone supports many file formats including MPEG-1 Layer 3 (.mp3), MPEG-2 ADTS (.aac), AIFF, CAF, and WAVE. But the most important thing to know here is that usually you’ll just want to use CAF, because it can contain any encoding supported on the iPhone, and it is the preferred file format on the iPhone.
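
To see the split between container and encoding in practice, here is a sketch that uses Python’s standard `wave` module to wrap a few linear PCM samples in a WAVE container and then read the header back. I’m using WAVE only because Python can write it out of the box; the sample values are made up, and this is an illustration of the idea rather than how you would create files for the iPhone.

```python
import io
import struct
import wave

# Write four 16-bit little-endian PCM samples into an in-memory WAVE container.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)      # mono
    writer.setsampwidth(2)      # 2 bytes = 16-bit samples
    writer.setframerate(44100)  # samples per second
    writer.writeframes(struct.pack("<4h", 0, 1000, -1000, 0))

container_bytes = buffer.getvalue()
print(container_bytes[:4], container_bytes[8:12])  # b'RIFF' b'WAVE'

# Read the header back: the container describes the data format it holds.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    channels = reader.getnchannels()
    sample_width = reader.getsampwidth()
    frame_rate = reader.getframerate()
    frame_count = reader.getnframes()

print(channels, sample_width, frame_rate, frame_count)  # 1 2 44100 4
```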

Bit Rates

There’s an important piece of terminology related to audio encoding that we need to mention next: bit rates.

The bit rate of an audio file is the number of bits that are processed per unit of time, usually expressed as bits per second (bit/s) or kilobits per second (kbit/s). Note that, unlike some other computer-related units, 1 kbit/s is 1000 bit/s, not 1024 bit/s. Higher bit rates produce larger files. Some encodings such as AAC or MP3 let you specify the bit rate to use when compressing the audio file; lowering the bit rate shrinks the file, but you lose quality as well.

You should choose a bit rate based on your particular sound file – try it out at different bit rates and see where the best match between file size and quality is. If your file is mostly speech, you can probably get away with a lower bit rate.
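
Since bit rate is just bits per second, you can estimate a file’s size before you encode it. Here is a quick Python sketch; the function name and the three-minute track length are made up for illustration, and the result ignores container overhead and assumes a constant bit rate.

```python
def estimated_size_bytes(bit_rate_kbits: int, duration_seconds: int) -> int:
    """Estimate audio data size for a constant bit rate (1 kbit/s = 1000 bit/s)."""
    total_bits = bit_rate_kbits * 1000 * duration_seconds
    return total_bits // 8  # 8 bits per byte


three_minutes = 180  # hypothetical track length in seconds

print(estimated_size_bytes(128, three_minutes))  # 2880000
print(estimated_size_bytes(64, three_minutes))   # 1440000
```

Halving the bit rate halves the size, which is why it pays to test how low you can go before the quality loss becomes noticeable.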

Here’s a table that gives an overview of the most common bit rates:

  • 32kbit/s: AM Radio quality
  • 48kbit/s: Common rate for long speech podcasts
  • 64kbit/s: Common rate for normal-length speech podcasts
  • 96kbit/s: FM Radio quality
  • 128kbit/s: Most common bit rate for MP3 music
  • 160kbit/s: Musicians or sensitive listeners prefer this to 128kbit/s
  • 192kbit/s: Digital radio broadcasting quality
  • 320kbit/s: Virtually indistinguishable from CDs
  • 500kbit/s-1,411kbit/s: Lossless audio encoding such as linear PCM

Sample Rates

There’s one final piece of terminology to cover before we move on: sample rates.

When converting an analog signal to digital format, the sample rate is how often the sound wave is sampled to make a digital signal, measured in samples per second (Hz).

Almost always, 44,100Hz is used because that is the same rate used for CD audio.
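
The sample rate also explains where the roughly 1,411kbit/s figure in the bit rate list comes from. Here is a short Python sketch of the arithmetic, assuming CD-style audio with 16-bit samples and two channels; the function names are made up for illustration.

```python
def sample_count(sample_rate_hz: int, duration_seconds: int) -> int:
    """Number of samples per channel captured over the given duration."""
    return sample_rate_hz * duration_seconds


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bit rate of uncompressed linear PCM, in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


print(sample_count(44_100, 60))     # 2646000
print(pcm_bit_rate(44_100, 16, 2))  # 1411200
print(pcm_bit_rate(44_100, 16, 1))  # 705600
```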

What’s Next?

Next up in the Audio Tutorial series I talk about converting audio files and recording audio files on the Mac.

Audrey Tam

Audrey Tam retired at the end of 2012 from a 25-year career as a computer science academic. Her teaching included Pascal, C/C++, Java, Java web services, web app development in php and mysql, user interface design and evaluation, and iOS programming. Before moving to Australia, she worked on Fortran and PL/1 simulation software at IBM's development lab in Silicon Valley. Audrey now teaches short courses in iOS app development to non-programmers, and organizes venues for Melbourne Cocoaheads monthly meetings.

User Comments

12 Comments

  • Hi Ray, can we stream an audio format other than mp3 ? (like wav, caf or any other) ?
    asifali
  • asifali wrote:Hi Ray, can we stream an audio format other than mp3 ? (like wav, caf or any other) ?


    How do you stream an mp3?
    regularberry
  • Table of bitrates is confusing.
    It's right for MP3, but AAC gives a much higher quality of sound for same bitrates in compare with MP3.
    shmidt
  • how can i save application music ...in music library...
    sambit
  • Hello,
    Is there way to play Midi file? I have .kar file (File type is MIDI).

    can we play midi files in iPhone?

    Thank you
    Nitesh6D
  • Ray- thanks so much for all this information. I'm so glad I found your site. Just a quick one, what db level should I make my audio files to ensure they can be heard on the iphone?

    Thanks again!
    scruffykells
  • scruffykells wrote:what db level should I make my audio files to ensure they can be heard on the iphone?

    As loud as possible. Which is actually 0 db. :-) If you look at your audio files in an audio editor, try to get the signal all the way to the top and the bottom of the waveform, without unnecessarily distorting the sound of course.
    Hollance
  • Alright, thanks so much for your advice.
    scruffykells
  • Here's a nice audio converter for core audio formats. http://gngrwzrd.com/microac/
    gngrwzrd
  • Should it read, "HE-AAC is a subset of AAC"? If not, the difference is not clear to me.
    Jessy C. Rabbit
    I have read your old audio tutorials, so I was very happy when I saw the updates, because I had to change some code in the old tutorials to make it work.
    In the future, when you update old tutorials, I would like to see a section at the start with the header "what is new", so that I can quickly see what has been updated in the text.
    In the current format I need to read the whole article and I still do not know whether there is any difference from the old article.
    WebOrCode
  • @Jessy C. Rabbit: HE-AAC is a superset of AAC -- see the profile diagram in http://en.wikipedia.org/wiki/High-Effic ... dio_Coding. A very helpful synopsis is in the top-ranked reply at http://www.reddit.com/r/audiophile/comm ... ch_should/

    @WebOrCode: good suggestion! specifically, for these three tutorials, I tweaked the first two (bit rates, screen shots), and the main changes are in the third, especially the sample project and dead links -- iOS7 deprecated AudioSession in favor of AVAudioSession.

    I just noticed that the cross-links are still going to the 2010 versions -- will get that fixed asap
    Audrey
