1 February 2010

Audio 101 for iPhone Developers: File and Data Formats

 


Image credit: ilco

Before working with the iPhone, I had sadly little experience with sound formats. I knew the difference between .WAVs and .MP3s, but for the life of me I couldn’t tell you exactly what an .AAC or a .CAF was, or what the best way to convert audio files on the Mac was.

I’ve learned that if you want to develop on the iPhone, it really pays to have a basic understanding of file and data formats, conversion, recording, and which APIs to use when.

This article is the first in a three-part series covering audio topics of interest to the iPhone developer. In this article, we’ll start by covering file and data formats.

(Jump to Part 2 or Part 3 in the series.)

File Formats and Data Formats, Oh My!

The thing to understand is that there are actually two pieces to every audio file: its file format (or audio container), and its data format (or audio encoding).

File formats (or audio containers) describe the format of the file itself. The actual audio data inside can be encoded in many different ways. For example, CAF is a file format that can contain audio encoded as MP3, linear PCM, and many other data formats.

So let’s dig into each of these more thoroughly.

Data Formats (or Audio Encoding)

We’re going to start with the audio encoding rather than the file format, because the encoding is the most important part.

Here are the data formats supported by the iPhone and a description of each:

  • AAC: AAC stands for “Advanced Audio Coding”, and it was designed to be the successor of MP3. As you would guess, it compresses the original sound, resulting in disk savings but lower quality. However, the loss of quality is not always noticeable, depending on what you set the bit rate to (more on this later). In practice, AAC usually compresses better than MP3, especially at bit rates below 128kbit/s (again, more on this later).
  • HE-AAC: HE-AAC is a superset of AAC, where the HE stands for “high efficiency.” HE-AAC is optimized for low bit rate audio such as streaming audio.
  • AMR: AMR stands for “Adaptive Multi-Rate” and is an encoding optimized for speech, featuring very low bit rates.
  • ALAC: Also known as “Apple Lossless”, this is an encoding that compresses the audio data without losing any quality. In practice, the compressed data is about 40-60% of the size of the original. The algorithm was designed so that data could be decompressed at high speeds, which is good for devices such as the iPod or iPhone.
  • iLBC: This is yet another encoding optimized for speech, good for voice over IP and streaming audio.
  • IMA4: This is a compression format that gives you 4:1 compression on 16-bit audio files. This is an important encoding for the iPhone, for reasons we will discuss later.
  • linear PCM: This stands for linear pulse code modulation, and describes the technique used to convert analog sound data into a digital format. In simple terms, this just means uncompressed data. Since the data is uncompressed, it is the fastest to play and is the preferred encoding for audio on the iPhone when space is not an issue.
  • μ-law and a-law: As I understand it, these are alternate encodings to convert analog data into digital format, but are more optimized for speech than linear PCM.
  • MP3: And of course the format we all know and love, MP3. MP3 is still a very popular format after all of these years, and is supported by the iPhone.

So which do I use?

That looks like a big list, but there are actually just a few that are the preferred encodings to use. To know which to use, you have to first keep this in mind:

  • Linear PCM, IMA4, and a few other formats that are either uncompressed or only lightly compressed can be decoded quickly, so you can play several of them simultaneously with no issues.
  • For more advanced compression methods such as AAC, MP3, and ALAC, the iPhone does have hardware support to decompress the data quickly – but the problem is it can only handle one file at a time. Therefore, if you play more than one of these encodings at a time, they will be decompressed in software, which is slow.

So to pick your data format, here are a couple of rules that generally apply:

  • If space is not an issue, just encode everything with linear PCM. Not only is this the fastest way for your audio to play, but you can play multiple sounds simultaneously without running into any CPU resource issues.
  • If space is an issue, most likely you’ll want to use AAC encoding for your background music and IMA4 encoding for your sound effects.
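The two rules above can be written down as a tiny lookup. This is only a sketch in Python (chosen for brevity, not iPhone code), and `pick_encoding` and its parameters are hypothetical names made up for illustration:

```python
def pick_encoding(space_is_tight: bool, is_background_music: bool) -> str:
    """Suggest a data format using the two rules of thumb above.

    Hypothetical helper for illustration only; not part of any Apple API.
    """
    if not space_is_tight:
        # Rule 1: with room to spare, uncompressed audio is the simplest choice.
        return "linear PCM"
    # Rule 2: compress music with AAC, and short sound effects with IMA4.
    return "AAC" if is_background_music else "IMA4"


print(pick_encoding(space_is_tight=False, is_background_music=True))   # linear PCM
print(pick_encoding(space_is_tight=True, is_background_music=True))    # AAC
print(pick_encoding(space_is_tight=True, is_background_music=False))   # IMA4
```

Real projects will have more to weigh (how many sounds overlap, how long they are), so treat this as a starting point rather than a rule engine.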

The Many Variants of Linear PCM

One final and important note about linear PCM encoding, which again is the preferred uncompressed data format for the iPhone. There are several variants of linear PCM depending on how the data is stored. The data can be stored in big or little endian formats, as floats or integers, and in varying bit-widths.

The most important thing to know here is that the preferred variant of linear PCM on the iPhone is little-endian integer 16-bit, or LEI16 for short. Note that this differs from the preferred variant on Mac OS X, which is native-endian floating point 32-bit. Because audio files are often created on the Mac, it’s a good idea to examine the files and convert them to the preferred format for the iPhone.
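To make the difference between these variants concrete, here is a small Python sketch that packs the same five sample values two ways: as little-endian 16-bit integers (LEI16) and as 32-bit floats. The function names are made up for this example, and the float version is written in explicit little-endian order purely so the output is the same on every machine:

```python
import struct

# Five sample values in the usual floating-point range of -1.0 to 1.0.
samples = [0.0, 0.5, -0.5, 1.0, -1.0]


def to_lei16(values):
    """Pack samples as little-endian signed 16-bit integers (LEI16)."""
    ints = [max(-32768, min(32767, round(v * 32767))) for v in values]
    return struct.pack("<%dh" % len(ints), *ints)


def to_float32(values):
    """Pack the same samples as 32-bit floats (little-endian here)."""
    return struct.pack("<%df" % len(values), *values)


lei16 = to_lei16(samples)
float32 = to_float32(samples)

print(len(lei16), len(float32))  # 10 20: two bytes per sample vs. four
print(lei16[6:8].hex())          # ff7f: 1.0 becomes 32767, low byte first
```

The sound is the same in both cases; only the byte layout and the size differ, which is why it is worth checking what variant a file actually uses.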

File Formats (or Audio Containers)

The iPhone supports many file formats including MPEG-1 (.mp3), MPEG-2 ADTS (.aac), AIFF, CAF, and WAVE. But the most important thing to know here is that usually you’ll just want to use CAF, because it can contain any encoding supported on the iPhone, and it is the preferred file format on the iPhone.
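The split between container and encoding is easy to see by reading a file header. The sketch below uses WAVE rather than CAF only because Python ships a WAVE reader in its standard library (the `wave` module); it writes one second of silent 16-bit linear PCM into an in-memory file, then reads back what the container says about the data inside:

```python
import io
import wave

# Write one second of silence as 16-bit mono linear PCM into a WAVE container.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)        # mono
    writer.setsampwidth(2)        # 2 bytes per sample = 16-bit
    writer.setframerate(44100)    # samples per second
    writer.writeframes(b"\x00\x00" * 44100)

# Read the header back: the container describes the audio data it holds.
buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    channels = reader.getnchannels()
    bits = reader.getsampwidth() * 8
    rate = reader.getframerate()
    frames = reader.getnframes()

print(buffer.getvalue()[:4])          # b'RIFF', the container's signature
print(channels, bits, rate, frames)   # 1 16 44100 44100
```

The header (the file format) and the samples (the data format) are separate things, which is the same idea behind a CAF file holding linear PCM, AAC, or IMA4 data.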

Bit Rates

There’s an important piece of terminology related to audio encoding that we need to mention next: bit rates.

The bit rate is the number of bits per second that an audio file takes up. Some encodings such as AAC or MP3 let you specify the bit rate to compress the audio file to. When you lower the bit rate, you lose quality as well.

You should choose a bit rate based on your particular sound file – try it out at different bit rates and see where the best match between file size and quality is. If your file is mostly speech, you can probably get away with a lower bit rate.
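As a rough guide while experimenting, you can estimate the size of the audio data from the bit rate and the length of the clip. This Python sketch takes the bit rate in kbit/s (thousands of bits per second), so dividing by 8 gives bytes; `estimated_size_bytes` is a made-up helper name, and the result ignores any container overhead:

```python
def estimated_size_bytes(bit_rate_kbps: float, duration_seconds: float) -> float:
    """Approximate audio data size: bits per second x seconds, then bits to bytes."""
    total_bits = bit_rate_kbps * 1000 * duration_seconds
    return total_bits / 8


three_minutes = 3 * 60
for rate in (64, 128, 320):
    size_mb = estimated_size_bytes(rate, three_minutes) / 1_000_000
    print(f"{rate} kbit/s -> {size_mb:.2f} MB")
# 64 kbit/s -> 1.44 MB
# 128 kbit/s -> 2.88 MB
# 320 kbit/s -> 7.20 MB
```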

Here’s a list that gives an overview of the most common bit rates:

  • 32kbit/s: AM Radio quality
  • 48kbit/s: Common rate for long speech podcasts
  • 64kbit/s: Common rate for normal-length speech podcasts
  • 96kbit/s: FM Radio quality
  • 128kbit/s: Most common bit rate for MP3 music
  • 160kbit/s: Musicians or sensitive listeners prefer this over 128kbit/s
  • 192kbit/s: Digital radio broadcasting quality
  • 320kbit/s: Virtually indistinguishable from CDs
  • 500kbit/s-1,411kbit/s: Lossless audio, such as ALAC or uncompressed linear PCM
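The 1,411kbit/s figure at the top of that range is simply the bit rate of CD-style uncompressed audio: sample rate times bits per sample times number of channels. A quick Python check (the function name is made up for this example):

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bit rate of uncompressed linear PCM, in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


print(pcm_bit_rate_kbps(44100, 16, 2))  # 1411.2 (16-bit stereo at 44,100Hz)
print(pcm_bit_rate_kbps(44100, 16, 1))  # 705.6 (the same audio in mono)
```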

Sample Rates

There’s one final piece of terminology to cover before we move on: sample rates.

When converting an analog signal to digital format, the sample rate is how often the sound wave is sampled to make a digital signal.

Almost always, 44,100Hz is used because that is the same rate used for CD audio.
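To see what “sampling” means in practice, the Python sketch below measures a pure 440Hz tone at 44,100 evenly spaced points per second. A real recording samples a microphone signal rather than a formula, and `sample_tone` is a made-up name for this illustration:

```python
import math

SAMPLE_RATE = 44100  # samples per second, the same rate used for CD audio


def sample_tone(frequency_hz: float, duration_seconds: float, sample_rate: int = SAMPLE_RATE):
    """Return the value of a sine wave at each sampling instant."""
    count = round(sample_rate * duration_seconds)
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate) for n in range(count)]


tone = sample_tone(440, 0.01)
print(len(tone))  # 441 samples for one hundredth of a second
```

Each value in the list is one sample; a higher sample rate means more values per second of sound.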

What’s Next?

Next up in the series I talk about converting audio files and recording audio files on the Mac.


Category: iPhone


23 Comments

  1. Luke (1 comments) says:

    Hi,

    Do you have any experience with libMMS / reading radio streams in MMS format? I’m thinking of building a very simple radio app as a learning project but MMS streams are not natively supported.

    Some pointers on how to compile / include /interact with libMMS in an iPhone project would be gratefully received. I think the required code could be here:

    http://www.wunderradio.com/code.html

    thanks!

  2. Ray Wenderlich (874 comments) says:

    Hm, unfortunately I have never played around with that, but it does sound interesting, thanks for letting me know about it! If I ever do play around with it I’ll put up a blog post about it.

  3. Bindu (1 comments) says:

    Hi – can AMR encoding be stored in MP3 file formats? In general, does iPhone provide AM encoding support to record audio? Does it provide AMR playback support, i.e. can you playback AMR encoded audio? If so, what file format can AMR encoded audio be provided in for iPhone?

    Thanks in advance.

    Bindu

  4. Ray Wenderlich (874 comments) says:

    @Bindu: If you run afconvert -hf, you’ll see the list of file formats supported by the iPhone, and the data formats each can contain. The “samr” data format can only be contained by the “amrf” or “caff” file formats.

    I haven’t played around with recording, so won’t be able to help out there. Best of luck!

  5. Nadav (5 comments) says:

    Hi ,I was wondering about playing caf files in flash over a web browser , is it possible with certain codecs? Perhaps by just changing the format name ?

    Can see why apple chose not to include mo3 encoding in to it’s API sdk ….

    Can u suggest a way to go about it ?

  6. Nadav (5 comments) says:

    Sorry typo, meant , I can’t see why Apple chose not to record with mp3 ….

  7. Ray Wenderlich (874 comments) says:

    @Nadav: CAF is just a file format, inside can be different data formats including MP3. You should be able to use the afconvert command line utility to convert a CAF file into a different format that your framework can use.

  8. nadav (5 comments) says:

    Hi Ray , thanks for your answer, thats good news,
    is afconvert a utility that can be used within an iphone app ?

  9. Ray Wenderlich (874 comments) says:

    @nadav: Afconvert is a command line utility – see the following tutorial for info on how to use it:

    http://www.raywenderlich.com/233/audio-101-for-iphone-developers-converting-and-recording

  10. Nadav (5 comments) says:

    Thanks , I’m trying to record something on iPhone and play it in a web browser using flash
    any ideas how to do that. ?

  11. Ray Wenderlich (874 comments) says:

    Flash isn’t supported on the iPhone – you know that right?

    So are you just looking to record a sound yourself and then move it to a web page manually (simple), or make an app that allows a user to record arbitrary sounds and upload them (more complex?)

    Assuming the first case, you could:

    1) Use the built-in Voice Memos app to record your sound
    2) Email it to yourself with share
    3) Use afconvert to convert it to a format usable in flash
    4) Upload it to wherever your flash files are and use away!

    Hope this helps!

  12. nadav (5 comments) says:

    of course i know flash isnt supported on iphone
    it is however still a usefull media format(even though apple doesnt like it ;) ) .

    what im trying to do is take a sound file recorded on my App.(not through voice memo or any 3rd party) ,
    transmit it to a browser using a flash player that reads mp3′s . the transmitting part is already working for me , the problem is i can only record in caf format or aiff, and no flash player i know can play these files..
    could i save an audio file as MOV ( for instance)
    perhaps there is a way around it by changing format name and using a specific encoding system . (mp3, aac, this it the reason im asking .)
    i thought you might know..

    thanks again .

  13. Ray Wenderlich (874 comments) says:

    @nadav: Yeah, AFAIK you’ll have to upload the audio to your server and then convert it server-side. I’m not sure if you can simply copy the afconvert utility to your server and run it via a server-side script to do the conversion, or if you’ll need to find another library/utility to do the conversion. Let me know if you get it working tho!

  14. satheesan.op (6 comments) says:

    Hi thanks a million,
    I have a question that is it possible to embed an audio file into my video…I am talking about editing a video..Can you give some information regards to this..?
    Your tutorials are really helpful and amazing.Thanks again.

  15. Ray Wenderlich (874 comments) says:

    @santheesan: I have not played around with video editing/etc. on iOS much yet, sorry!

  16. satheesan.op (6 comments) says:

    Thanks for your reply ……

  17. Ramkumar (7 comments) says:

    @Ray : hi Ray, can we convert array of images into a video file with output format as mp4

  18. Ray Wenderlich (874 comments) says:

    @Ramkumar: No idea how to do this, like I said haven’t done much with video yet, sorry!

  19. SAQIB IRSHAD (1 comments) says:

    hi Ray, what should i have to do to save mp3 streaming in iphone CAF file.

  20. Jeff and Chase (1 comments) says:

    Hey Ray, my 11 year old son wants to create a Fart app. Yah I know, like thousands don’t exist already, but he wants to make one. I am not a techie guy by any means. Is there an ‘App writing for Dummies’ site to go to?

  21. Ray Wenderlich (874 comments) says:

    @Saquib: Sorry no experience with this.

    @Jeff and Chase: That’s great! Well, I do have a tutorial series on How to Make a Simple iPhone App that might help:

    http://www.raywenderlich.com/1797/how-to-create-a-simple-iphone-app-tutorial-part-1

    Best of luck!

  22. Chris (7 comments) says:

    Hi Ray, first of all: thank you very much for all these great tutorials, they helped me already a lot! :)

    I am programming a small app for a music band where the users should be able to save an mp3 file to their “iPod app”. I just saw your post about saving files to the file-system but is it possible to save audio directly into the music library?

    It would be great if I could include that file directly into my project but as alternative a simple download would be enough (I just don’t know how to do that). Thank you!

  23. Ray Wenderlich (874 comments) says:

    @Chris: I took a quick glance at the iPod Library Access Programming Guide here:

    http://developer.apple.com/library/ios/#documentation/Audio/Conceptual/iPodLibraryAccess_Guide/Introduction/Introduction.html

    From what I can tell, looks like you can’t add files to the library programmatically unfortunately. If you find out differently let me know!
