Wednesday, November 16, 2011

Sound Wars: What’s the Real Story?

Okay, so we know loud music’s cool. Not too loud, of course – there’s no point in putting your eardrums into retirement before your dancing shoes – but how is ‘loud’ actually defined? And is ‘loud’ the only thing we should be concentrating on?
Let’s see if we can wade through the minefield of technical data and nail down some simple, practical advice on how to turn things up properly, the implications of using quieter equipment, and how that gain knob affects things in real life. Rather than asking you to simply take my word for it, I’ve also roped in the very kind assistance of some of the best brains in the game, and from different teams too. Dylan Wood, head of R&D at Serato Audio Research, and TJ Hertz, DSP developer at Native Instruments, have kindly offered a few pearls of wisdom to help us make sense of it all.


Computer-based digital audio runs, almost without exception, in a ‘floating point’ environment. The ‘point’ in floating and fixed point is the equivalent of the decimal point. In fixed point calculations the point always stays in the same place: the length of the number is limited by the number of bits the system operates on, and the point is always the same number of digits along, across all results. In floating point, the point can be placed anywhere. This has implications for how accurately digital audio can be represented. TJ Hertz gave me a helpful visual example of fixed point audio:
“Think of the signals that a (fixed point) digital mixer can represent as a blank canvas divided into pixels. Let’s take an extremely lo-fi case and say that the canvas is 1024 pixels wide and tall, and that’s a direct analogy of a 10-bit fixed point system – I’m using unrealistically small numbers to illustrate a point.
If you paint a portrait of a face on this canvas, you want it to be a sensible size – say 700 pixels wide. If you paint the face too big, then the edges literally get clipped off, and that’s exactly what happens to an audio signal that’s louder than the mixer can represent. If you paint the portrait too small, say 30 pixels wide, then it starts becoming pixellated – that’s analogous to the quantization distortion that you get if you try and represent a very quiet audio signal with a fixed-point digital system.”
So, with fixed point there’s a trade-off between resolution and headroom: paint too small and you lose detail, paint too big and something’s got to give. Compare that to floating point audio:
“A floating point signal is like painting a picture on 75% of the canvas, and then zooming or shrinking the canvas to the desired size. Therefore, if you want a tiny painting, you can still use all 4,000,000,000 pixels [at 32 bit], they’re just finer pixels. You don’t lose resolution.
It’s much easier to write algorithms in floating point because the precision is always the same, regardless of the signal level. This means that the risk of a filter or EQ going unstable is much lower, you get fewer distortion and quantization effects (for simply passing audio through, the effect is small anyway, but it becomes more important when you start doing maths), and, conveniently, you can port algorithms directly from native computer code without worrying too much about whether they will still behave the same way. The tradeoff, of course, is cost.”
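To put rough numbers on TJ’s canvas analogy, here is a minimal Python sketch. It is an illustration under stated assumptions rather than anything from Serato or Native Instruments: the function names, the test tone and the amplitudes (0.68 and 0.03 of full scale, loosely mirroring the 700 and 30 pixel portraits) are all made up for the example. It quantizes a loud and a quiet sine wave with a 10-bit fixed point grid and with 32-bit floats, then compares the signal-to-error ratio of each.

```python
import math
import struct

def quantize_fixed(x, bits):
    """Snap x (nominally -1.0..1.0) to a signed fixed point grid, clipping at the edges."""
    steps = 2 ** (bits - 1)  # 512 steps either side of zero for 10 bits
    q = max(-steps, min(steps - 1, round(x * steps)))
    return q / steps

def quantize_float32(x):
    """Round-trip x through a 32-bit float to pick up its rounding error."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sine(amplitude, n=1000, cycles=7):
    """A hypothetical test tone: n samples covering a few cycles of a sine wave."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

def snr_db(clean, approx):
    """Ratio of signal energy to quantization error energy, in dB."""
    signal = sum(c * c for c in clean)
    error = sum((c - a) ** 2 for c, a in zip(clean, approx))
    return math.inf if error == 0 else 10 * math.log10(signal / error)

def compare(amplitude, bits=10):
    """Return (fixed point SNR, float32 SNR) for a sine of the given amplitude."""
    clean = sine(amplitude)
    fixed = snr_db(clean, [quantize_fixed(v, bits) for v in clean])
    floating = snr_db(clean, [quantize_float32(v) for v in clean])
    return fixed, floating

for amplitude in (0.68, 0.03):
    fixed, floating = compare(amplitude)
    print(f"amplitude {amplitude}: 10-bit fixed {fixed:.1f} dB, float32 {floating:.1f} dB")
```

The fixed point figure should drop sharply for the quiet signal (the pixellated portrait), while the float32 figure should stay roughly where it was, which is the "precision is always the same, regardless of the signal level" point in practice.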


So, floating point is the way to go for software calculations, and digital hardware is gradually starting to get in on the action too. The audio remains pristine and perfect for as long as possible, until the DAC stage.
When converting from digital to analogue – and thus dealing with tangible things rather than mathematical perfection – the data has to become fixed point. Almost all digital mixers are fixed point, and of course sending audio out of your interface down an analogue cable requires a DAC. It’s here that those sacrifices have to be made – to an extent. TJ tells me:
“Let’s have a quick reality check. In a modern-day digital mixer, you’re looking at about 32 bits, not 10 bits. This means that the canvas is now 4,000,000,000 pixels wide. CD is 16 bits, or 65536 pixels wide. In other words, you can attenuate from full scale down by a factor of 16 bits = -96dB before you get to CD quality, which would be so quiet that it’s basically approaching the noise floor anyway. So, in terms of the signal that you can actually represent, this hypothetical 32-bit fixed point mixer really has no trouble covering the required dynamic range.”
So when it comes to dynamic range, as long as we have a decent DAC we’re okay from this end – especially if we use the 16-bit range that CD provides as a baseline.
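The arithmetic behind TJ’s reality check can be sketched in a few lines of Python. This uses the usual rule of thumb that each bit is worth about 6.02 dB (20·log10 of 2) and ignores dither, converter noise and everything else that happens in real hardware; the function names are hypothetical.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB for every extra bit

def dynamic_range_db(bits):
    """Theoretical ratio between full scale and one quantization step, in dB."""
    return bits * DB_PER_BIT

def bits_remaining(total_bits, attenuation_db):
    """Roughly how many bits still describe a signal after digital attenuation."""
    return total_bits - attenuation_db / DB_PER_BIT

for bits in (10, 16, 24, 32):
    print(f"{bits:2d} bits: {2 ** bits:>13,} levels, about {dynamic_range_db(bits):.1f} dB")

# Turning a 32-bit fixed point signal down by the full range of a CD
# still leaves about 16 bits of resolution, i.e. CD quality.
print(bits_remaining(32, dynamic_range_db(16)))
```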


What is it that makes things loud, though? Power. The thing is, there isn’t really a standard for how loud things are actually supposed to be – in the DJ world, at least. Consider that levels have forever been in flux, with not only different outputs from turntable cartridges but also a different level coming off every record, and you begin to realise both how much easier you have it as a digital DJ (at least when it comes to adjusting mix levels) and how important the DJ mixer’s job has traditionally been when it comes to gain.
Dylan Wood clears up the loudness issue here:
“The maximum RMS voltage at the output is what you get if the DAC is outputting a full scale signal; this is basically how “hot” the device is. The signal comes from the DAC and is then amplified to achieve the desired output voltage. There is no widely adhered to standard for this output voltage. Traditional consumer terms and labels on hardware are things like +4dBu or -10dBu. For example if you were to equate these to voltages, +4dBu = 1.22 volts RMS, -10dBu = somewhere in the order of 100s of millivolts.
Most modern DJ CD players output around 2 volts RMS though, so the “standards” that people talk about just really aren’t used. Most modern interfaces and mixers are designed to output somewhere between 2 and 4 volts RMS. Most modern digital mixers handle input voltage of around 4 volts.”
As you can see, there’s no standard output level for audio devices – but that in itself isn’t really a problem. Dylan outlines a side effect, though:
“Companies keep building louder gadgets to outdo each other, so if you build something that is to professional audio standards, DJs think it’s too quiet because they have to adjust the gain on their mixer. It’s not dissimilar to the “loudness wars” that people talk about with audio mastering. In some ways devices being too loud can be worse, because if the output gain is digital then you are actually losing quality when you turn it down.”
And that’s corroborated by TJ, who raises one more point:
“The incremental increases that we’re seeing in the output level of prosumer-level DJ soundcards (as opposed to studio converters worth thousands) is more or less down to marketing. A ‘player’ device – like a CDJ or soundcard – has no need to output a signal level hotter than anyone else, because it’s being plugged into a mixer with a gain control. It gets more complicated if you’re mixing “in the box” since then the soundcard has to somehow pretend to be a mixer and therefore ‘fake’ the kind of headroom you get with a real, 240v-powered box – namely, 15 to 25 dBu – but that’s another kettle of fish. Lack of headroom is simply one of the inherent issues with mixing inside a computer.”
The more power a device can kick out, the stronger its signal-to-noise ratio can be (though more distortion may be introduced along the way). As Dylan says above, amplification is done after the DAC stage, so the fidelity of the audio isn’t intrinsically linked to the number of volts a device pushes out. But since output level is essentially the scale on which the limits of a device’s dynamic range are pinned, power matters when it comes to a mixer. After all, a professional DJ mixer may well have a 240v power supply – a 5 volt USB connection just won’t be able to keep up.
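For anyone who wants to check the voltage figures Dylan quotes, here is a small Python sketch of the standard conversions. The reference values are the conventional ones (0 dBu is about 0.7746 volts RMS, 0 dBV is 1 volt RMS); the function names are just for illustration. One aside: the consumer line level is more commonly specified as -10 dBV than -10 dBu, so both are shown, and either way the result lands in the hundreds of millivolts.

```python
import math

DBU_REF_VOLTS = math.sqrt(0.6)  # 0 dBu: about 0.7746 V RMS (1 mW into 600 ohms)
DBV_REF_VOLTS = 1.0             # 0 dBV: 1 V RMS

def dbu_to_vrms(dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REF_VOLTS * 10 ** (dbu / 20)

def dbv_to_vrms(dbv):
    """Convert a level in dBV to volts RMS."""
    return DBV_REF_VOLTS * 10 ** (dbv / 20)

def vrms_to_dbu(volts):
    """Convert volts RMS to a level in dBu."""
    return 20 * math.log10(volts / DBU_REF_VOLTS)

def level_difference_db(volts_a, volts_b):
    """How much hotter output A is than output B, in dB."""
    return 20 * math.log10(volts_a / volts_b)

print(f"+4 dBu  = {dbu_to_vrms(4):.3f} V RMS")
print(f"-10 dBu = {dbu_to_vrms(-10):.3f} V RMS")
print(f"-10 dBV = {dbv_to_vrms(-10):.3f} V RMS")
print(f"2 V RMS = {vrms_to_dbu(2.0):+.2f} dBu")
print(f"4 V RMS = {vrms_to_dbu(4.0):+.2f} dBu")
print(f"4 V vs 2 V: {level_difference_db(4.0, 2.0):.2f} dB hotter")
```

Seen this way, the gap between a 2 volt and a 4 volt device is about 6 dB, which is well within reach of the gain knob on a mixer channel.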


If we’ve debunked any notion of ‘standards’ when it comes to levels, exactly what is the mysterious 0 on the level meter that we’re told to stick to in DJing 101? Well, if 0dB were to equal full scale, anything above 0 would clip, as the device would be unable to accurately reproduce audio that landed outside its boundaries. In reality, manufacturers build in a safety zone of headroom to allow for normal usage just outside the recommended parameters, and the better the quality of the mixer (and the more power it uses), the bigger this safety zone can be without affecting normal operating limits. TJ explains:
“Analog mixers might start to soft-saturate before they hit the top of the meter, but only quite subtly. Digital mixers will typically not noticeably distort until they hit their rated maximum, which could even be past the end of the meter (actually, in the case of the DJM800 I think there’s something like two more extra red LEDs of headroom, more on the DJM900)”
However, we end up almost coming full circle, as whatever we’re plugging into next, perhaps a club PA system, needs to be able to handle an input that loud. There’s no point pushing individual channels so hard that you can’t properly use the level meters, only to have to squeeze the master output, so it’s best to use that 0 as your guide.
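To see why pushing a channel past full scale and then pulling the master back down doesn’t undo the damage, here is a rough Python sketch. It is a toy model with made-up names and numbers, not a description of any particular mixer: one signal path hard-clips at full scale after every gain stage, while the other only clips once at the very end, as a floating point chain could.

```python
import math

FULL_SCALE = 1.0

def db_to_gain(db):
    """Convert a gain in dB to a linear multiplier."""
    return 10 ** (db / 20)

def hard_clip(x):
    """Anything beyond full scale is simply chopped off."""
    return max(-FULL_SCALE, min(FULL_SCALE, x))

def sine(amplitude, n=1000, cycles=5):
    """A hypothetical test tone peaking at the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * cycles * i / n) for i in range(n)]

signal = sine(0.9)         # a healthy level, just under full scale
boost = db_to_gain(6.0)    # channel gain pushed 6 dB too far
trim = db_to_gain(-6.0)    # master pulled back down to compensate

# Clip after every stage: the peaks are lost at the channel and never come back.
clipped_chain = [hard_clip(hard_clip(s * boost) * trim) for s in signal]

# Clip only at the final output: the overshoot in the middle does no harm.
float_chain = [hard_clip(s * boost * trim) for s in signal]

clipped_error = max(abs(a - b) for a, b in zip(signal, clipped_chain))
float_error = max(abs(a - b) for a, b in zip(signal, float_chain))
print(f"worst-case error, clipping at each stage: {clipped_error:.3f}")
print(f"worst-case error, clipping only at the end: {float_error:.2e}")
```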


The very top flight of mixers, such as the Rane Sixty Eight (actually at this point, maybe only the Sixty Eight), are starting to incorporate floating point audio into their digital innards. This means that not only do you get all the advantages of floating point in the post-ADC stages of the signal chain, including the ability to avoid clipping at no cost at any stage of the mixer’s signal chain, but audio can also potentially be sent in via a digital connection, so that the only conversion that happens at all is the master channel’s DAC stage. As for what’s next? Dylan speculates…