Major improvements to audio at Zencoder, and why this matters

At Zencoder, we’re obsessive about quality. It’s important to us that every video look, and sound, as good as it possibly can. And that’s why we’re very excited to announce major improvements to our audio encoding technology.

It’s easy to forget about audio when thinking about video quality, because minor quality differences are easier to see in video than in audio. But major quality differences are another story. It’s easy to tell a good audio encoder from a bad one. And sadly, most AAC audio encoders on the market are not very good.

What’s more, it turns out that users actually prefer good audio plus bad video to bad audio plus good video. Think about it. If you were video chatting with someone, or watching The Office, would you rather (a) hear words clearly but see an impaired picture, or (b) see a clear picture but not understand what people were saying?

Research bears this out: audio quality is extremely important to user enjoyment of media. A study by Ralf Steinmetz shows that it is “more important to maintain a continuous (minimum jitter) audio stream than a video stream” when watching video online (see also this article). Similar studies have shown just how sensitive viewers are to audio/video desync: sync errors in the range of tens of milliseconds can be noticeable.

To some users, the desire for high audio quality is explicit – audiophiles spend huge amounts of money on minor, incremental improvements to audio quality. Moving from “really good” to “really really good” is worth thousands of dollars to some people.

But research shows that even if a viewer doesn’t think they can tell the difference between good and bad audio, sound quality unconsciously affects their enjoyment. In another study, viewers who watched television with high fidelity stereo audio “liked the program content significantly more and found it significantly more involving” than viewers watching the same video with low fidelity, mono audio. They may never consciously think to themselves, “This show sucks because the voices sound tinny and the music is distorted”, but that may still happen behind the scenes.

It’s 2011. Publishers and viewers should demand high quality audio along with high quality video. Sadly, though, many mainstream tools aren’t up to the challenge.

See it in action

Over the last few months, we evaluated every major AAC encoder on the market, and chose the best one. Here are some audio samples, comparing our new AAC encoder to the most commonly used open-source encoder on the market (FAAC). Put on your headphones and see if you can tell a difference.


Old: Alec Eiffel, The Pixies, 56kbps AAC-LC


New: Alec Eiffel, The Pixies, 56kbps HE-AAC

56kbps is a fairly constrained bitrate for two-channel music, so neither of these will sound perfect. And yet the difference between them is night and day. Remember that 56kbps is an important bitrate – if you’re creating streaming video for iPhones, you need to include a 64kbps audio track in addition to video tracks. After other overhead, you end up with about 56kbps left over for audio.

The AAC codec has several profiles; AAC-LC (“Low Complexity”) is the baseline profile, but HE-AAC (“High Efficiency”) is almost universally supported today, and it allows for much better sound at low bitrates. Unless you’re encoding for non-mainstream devices, there is almost no reason not to use HE-AAC today for low bitrate content.

Is HE-AAC perfect? Not quite. But it is pretty good. Good enough that if you aren’t listening to material you’re familiar with, in a quiet environment, on reference gear, you might not even notice that you’re listening to fairly low bitrate music.

But even at 56kbps AAC-LC, Zencoder’s new encoder is much better than others. Take a listen. You may notice some high frequency problems, but nothing like the old encoder.


New: Alec Eiffel, The Pixies, 56kbps AAC-LC

One step further, AAC supports an even more compressible format: HE-AAC v2. Here is the same song, this time at 16kbps (!).


New: Alec Eiffel, The Pixies, 16kbps HE-AAC v2

Zencoder’s new AAC encoder actually sounds better than a major AAC encoder at a 70% lower bitrate.
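For anyone who wants to check the arithmetic, here is a small sketch that compares the 56kbps AAC-LC sample with the 16kbps HE-AAC v2 sample. The function name is ours, purely for illustration.

```python
def percent_reduction(old_kbps: float, new_kbps: float) -> float:
    """Return how much lower new_kbps is than old_kbps, as a percentage."""
    return (old_kbps - new_kbps) / old_kbps * 100.0


# Bitrates of the two samples compared in this post.
reduction = percent_reduction(56, 16)
print(f"{reduction:.1f}% lower bitrate")
```

That works out to a little over 71%, which we round down to 70%.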

Of course, you don’t have to use such a low bitrate. At 128kbps, FAAC sounds reasonably good. But Zencoder’s AAC encoder is still better.


Old: I’m Shipping Up to Boston, Dropkick Murphys, 128kbps AAC-LC


New: I’m Shipping Up to Boston, Dropkick Murphys, 128kbps AAC-LC

That’s not all Zencoder does, though. Beyond using the best AAC encoder technology available today, Zencoder uses advanced audio processing techniques and algorithms, so that even when comparing the same encoder, Zencoder audio often sounds better. For example, listen to these clips. Both use the same (new) audio encoder, but each uses a different technique to resample 44100 Hz audio to 16000 Hz. Downsampling removes the highest frequencies (16000 Hz audio cannot carry anything above 8000 Hz), so 16000 Hz won’t sound as good as 44100 Hz. But listen to Zencoder’s advanced resampling vs. standard resampling.


Standard resampling: Alec Eiffel, The Pixies, 16000 Hz, 96 kbps


Zencoder resampling: Alec Eiffel, The Pixies, 16000 Hz, 96 kbps

You may need headphones for this, but the highs sound clearer when using Zencoder, even when comparing the same encoder using the same settings.
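To see why this particular conversion is demanding, here is a rough sketch of the numbers involved. It says nothing about how Zencoder's resampler works internally. It only computes the highest frequency each sample rate can represent (half the sample rate) and the exact ratio between the two rates. The function names are ours, for illustration only.

```python
from fractions import Fraction


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def resample_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Exact output/input sample-rate ratio, reduced to lowest terms."""
    return Fraction(dst_hz, src_hz)


src, dst = 44100, 16000
ratio = resample_ratio(src, dst)

print(f"{src} Hz audio can carry up to {nyquist_hz(src):.0f} Hz")
print(f"{dst} Hz audio can carry up to {nyquist_hz(dst):.0f} Hz")
print(f"Resampling ratio: {ratio.numerator}/{ratio.denominator}")
```

Everything between 8000 Hz and 22050 Hz has to be filtered out before samples are dropped, and the ratio is not a round number. How well a resampler handles that filtering is a large part of what separates a good one from a mediocre one.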

What is happening here?

The dominant audio codec today is AAC. It’s a good audio codec, slightly more efficient than MP3, and it is used everywhere: Flash, iOS, Android, etc. It has basically displaced the MP3 codec; Zencoder sees 6x more AAC files than MP3.

But here is the thing. Many, many publishers use open source software for video and audio encoding. There is a good reason for this – most open-source video technology is excellent. Unfortunately, though, there is no good open-source encoder for AAC.

FAAC is basically the only open-source AAC encoder that is production ready. And it is not very good. This isn’t in dispute: according to the FAAC homepage, “Note that the quality of FAAC is not up to par with the currently best AAC encoders available.” The project itself has largely been abandoned; the last major update was over two years ago, and open source developers are in the early stages of developing a new AAC encoder.

Philosophically, we believe in using the best technology for every piece of our infrastructure, whether that is open source, custom, or commercial. As it turns out, 90% of the time the best video and audio technology is open source (along with the best OS technology, database technology, and so on). This is just one of those rare circumstances where open source technology lags by a significant margin.

At high bitrates – 128kbps and up – FAAC is serviceable. But at lower bitrates, it’s not. This is partly because it doesn’t support HE-AAC, which is a key technology for web and mobile video delivery. But even FAAC’s AAC-LC isn’t great, as the examples above demonstrate.

But I don’t use low bitrate audio…

If you stream video or audio to iPhone or iPad, then you actually do need low bitrate audio. Apple mandates a 64kbps audio stream when using HTTP Live Streaming in apps, as a “cellular fallback”. Apps are rejected from the app store if they don’t have this. This makes sub-64kbps audio extremely important today. And even at higher bitrates, a good audio encoder is important.

If you store or deliver a lot of video, lower audio bitrates might mean major cost savings too. Using efficient HE-AAC might allow you to lower your video sizes by 10% or more. For many publishers, this could mean tens of thousands of dollars saved per year.
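As a back-of-the-envelope illustration of where a figure like 10% can come from, consider a file with 500kbps video and 128kbps AAC-LC audio, re-encoded with 64kbps HE-AAC audio. These bitrates are assumptions chosen for the example, not measurements, and container overhead is ignored.

```python
def size_reduction_percent(video_kbps: float,
                           old_audio_kbps: float,
                           new_audio_kbps: float) -> float:
    """Percent reduction in total bitrate when only the audio bitrate changes."""
    old_total = video_kbps + old_audio_kbps
    new_total = video_kbps + new_audio_kbps
    return (old_total - new_total) / old_total * 100.0


# Assumed example: 500kbps video, audio lowered from 128kbps to 64kbps.
saving = size_reduction_percent(500, 128, 64)
print(f"{saving:.1f}% smaller files")
```

The lower your video bitrate, the larger the share of the file that audio takes up, and the more an efficient audio encoder saves.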

What this means for Zencoder customers

This change is immediate and free for Zencoder customers. You don’t have to do anything. If you’re not a customer yet, try it out or get in touch with any questions.

If you want more precise control over what AAC profile is used, though, Zencoder now has two new API settings: max_aac_profile and force_aac_profile.

max_aac_profile governs which profiles are acceptable, and chooses the right profile based on the chosen bitrate. “aac-lc” is used for high bitrate audio; “he-aac” for mid-bitrate audio; and “he-aac-v2” for low-bitrate stereo. (HE-AAC v2 is stereo only.) “he-aac” is a safe setting most of the time; Flash, HTML5, and modern mobile devices all support HE-AAC.
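The selection rule can be pictured roughly like this. This is only a sketch of the idea, not Zencoder's actual logic: the bitrate thresholds and the function name are hypothetical, picked purely to make the example concrete.

```python
# Profiles ordered from least to most aggressive compression.
PROFILE_ORDER = ["aac-lc", "he-aac", "he-aac-v2"]

# Hypothetical cutoffs; the real thresholds are not given in this post.
HE_AAC_V2_MAX_KBPS = 32
HE_AAC_MAX_KBPS = 80


def choose_profile(bitrate_kbps: int, channels: int, max_aac_profile: str) -> str:
    """Pick a profile for the bitrate, never exceeding max_aac_profile."""
    if max_aac_profile not in PROFILE_ORDER:
        raise ValueError(f"unknown profile: {max_aac_profile}")

    # HE-AAC v2 is stereo only, so it is considered only for 2 channels.
    if bitrate_kbps <= HE_AAC_V2_MAX_KBPS and channels == 2:
        preferred = "he-aac-v2"
    elif bitrate_kbps <= HE_AAC_MAX_KBPS:
        preferred = "he-aac"
    else:
        preferred = "aac-lc"

    # Cap the preferred profile at the most advanced one allowed.
    cap = PROFILE_ORDER.index(max_aac_profile)
    return PROFILE_ORDER[min(PROFILE_ORDER.index(preferred), cap)]


print(choose_profile(128, 2, "he-aac"))
print(choose_profile(56, 2, "he-aac"))
print(choose_profile(16, 2, "he-aac-v2"))
```

With these assumed thresholds, 16kbps stereo gets HE-AAC v2 only when “he-aac-v2” is allowed. With “he-aac” as the maximum, the same audio falls back to HE-AAC.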

force_aac_profile forces the use of one profile, regardless of bitrate. There are limits to this – you can’t force HE-AAC on 300kbps audio, for example. Use this setting if you know exactly what profile and bitrate you want to use.
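A job request using the two settings might look roughly like the sketch below. Only max_aac_profile and force_aac_profile come from this post. The other field names (input, outputs, label, audio_bitrate), the helper function, and the input URL are assumptions for illustration, so check the API documentation for the exact request format. The code only builds and prints the JSON body; it does not send anything.

```python
import json


def build_job(input_url: str, outputs: list) -> dict:
    """Assemble a job request body (field names are assumed, see above)."""
    return {"input": input_url, "outputs": outputs}


job = build_job(
    "s3://example-bucket/input.mov",  # hypothetical input location
    [
        # Let the encoder choose a profile, but nothing beyond HE-AAC.
        {"label": "cellular", "audio_bitrate": 56, "max_aac_profile": "he-aac"},
        # Always use HE-AAC v2 for this very low bitrate stereo output.
        {"label": "low", "audio_bitrate": 16, "force_aac_profile": "he-aac-v2"},
    ],
)

payload = json.dumps(job, indent=2)
print(payload)
```

The first output leaves the decision to the encoder within a safe ceiling. The second pins the profile, which makes sense only when you know the bitrate and profile are compatible.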

Most importantly, though, what this means for Zencoder customers is that Zencoder offers the highest quality video and audio available for every major codec, and that you can safely publish high quality audio to iOS via HTTP Live Streaming, or save money by lowering bandwidth requirements.

So when you’re deciding what encoding software to use, or evaluating encoding vendors, try encoding a song to 56 kbps stereo AAC. If you care about user engagement, this test may be even more important than tests of video quality.

  • http://danielfischer.com Daniel Fischer

    This is great. Improving technology at its finest. Thanks!

  • http://www.facebook.com/davidzhao David Zhao

    Jon, is this AAC encoder developed in-house or commercially licensed?

  • Anonymous

     What a tease. No link to source code?

  • http://think256.com Eli Gundry

    How does this stack up against vorbis in terms of quality?

  • http://zencoder.com Jon Dahl

    Vorbis is great. In my experience, it’s better than FAAC and pretty close to a good AAC encoder at standard bitrates. But at low bitrates (<64kbps), HE-AAC is a bit better.

  • http://zencoder.com Jon Dahl

    Wish I could, but it’s a commercial encoder, and not one we wrote.

  • http://zencoder.com Jon Dahl

    Hey David – it’s commercially licensed. If we were writing our own (and wanted it to be good), we probably wouldn’t be making this announcement for another year or more. :)

  • http://davidweekly.org/ David E. Weekly

    You can’t share what vendor you selected? :/

  • Support

    “What a tease. No link to source code?” Zencoder just updated their libs from ffmpeg. There is no other option that allows encoding to so many formats without spending 5 years coding all of them.

    just build the latest ffmpeg :)

  • http://brandon.arbini.com Brandon

    We do update our ffmpeg build all the time, but this post is about a commercial audio encoder, not ffmpeg.

  • http://brandon.arbini.com Brandon

    Sadly, we cannot. :|

  • https://me.yahoo.com/a/Qm7wIH55h4gd.hnkYfrKJHhLckpAgDnUvLo-#2116c Camilo Martin

    I’m curious. Have you tried an OGG Vorbis fork called AoTuv (or something like that, it’s all in Japanese)? Here’s a link: http://www.geocities.jp/aoyoume/aotuv/. In my tests, it comes ahead of regular Vorbis and ahead of most other AAC encoders…

  • Frederik Dam Sunne

    They’re probably using Nero’s AAC encoder, which is excellent.
