|
If you are reading this, I assume that
you are interested in learning how digital recording works.
This is a good thing, considering where the current
recording technology is going. Analog is slowly becoming the
way of the past, especially where home recording is
concerned. That's not to say that digital recording is
better than analog, but it does have some amazing advantages
over using analog tape.
Things like better
signal-to-noise ratios and editing capabilities make it
easier to create great recordings provided you have decent
songs. Nowadays, you can even get away with having
less-than-decent musicians!
But along with digital recording
comes a whole new lingo that you must understand, such as
sample rate, bit depths, and A/D/A converters. So, the
purpose of this article is to get you familiar with how
digital recording works so you know how to optimize your
recordings. Let's start out by explaining some basic terms.
All audio signals going into a
digital recorder must go through the recorder's A/D
converter, which stands for analog-to-digital converter. It is
this converter that turns the audio signal into a binary
digital language consisting of ones and zeros. Each one or
zero is called a bit, and bits are stored in groups of 8
called bytes, so a 16 bit sample takes up two bytes.
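
To make bits and bytes a little more concrete, here is a minimal Python sketch. The sample value 12345 is just a made-up number for illustration, and the little-endian signed 16 bit layout is an assumption (it is a common one, but your recorder may store things differently).

```python
import struct

# Hypothetical 16 bit sample value, chosen only for illustration.
sample_value = 12345

# The same number written out as 16 ones and zeros.
bits = format(sample_value, "016b")

# Packed as a little-endian signed 16 bit integer (an assumed layout).
raw = struct.pack("<h", sample_value)

print(bits)      # 0011000000111001
print(len(raw))  # 2 bytes, i.e. 16 bits
```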
Imagine a sine wave being run into
the recorder. The converter works by taking measurements of
the voltage of the incoming signal at regular intervals,
and each voltage is the combined level of all the frequencies
in the signal at that instant. It takes these measurements at a
specific rate, such as 44100 samples per second. This is
what is called the sample rate. It is how many samples are
taken of the waveform voltages within any given second.
There are many different sample rates but the most common
ones are 44.1K and 48K, although 96K is also becoming
popular. This value is expressed as frequency, meaning how
often a sample is taken. Of course, the higher the sample
rate, the more accurate the recording of the original
waveform because more samples are used to represent it.
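
If it helps to see sampling as arithmetic, here is a rough Python sketch of a converter measuring a sine wave. The function name `sample_sine` and the 1KHz test tone are my own choices for illustration, not anything a real recorder exposes.

```python
import math

def sample_sine(freq_hz, sample_rate, duration_s, amplitude=1.0):
    """Return the voltages measured at each sample instant of a sine wave."""
    n_samples = int(round(sample_rate * duration_s))
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

# 10 milliseconds of a 1KHz tone at a 44.1K sample rate.
samples = sample_sine(1000, 44100, 0.01)

print(len(samples))   # 441 measurements
print(44100 / 1000)   # 44.1 samples for each cycle of the tone
```

Raising the sample rate simply puts more measurements into each cycle of the wave.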
Another important term is bit depth
(or quantization value). This is a measurement of the volume
(the voltage level) of each sample. Bit depths vary also but
the most common are 16 and 24 bits. Basically, in a 16 bit
system, 2^16 (65536) different values are
used to represent the entire dynamic range of the song and
each sample falls on one of those 65536 values.
At the point the sample is taken, the sample will fall on or
near a certain value and it is that value that will represent
the volume of the sample. It is easier to understand this
whole concept if you imagine an X/Y graph, with sample times
running on the horizontal axis, and bit values running on the
vertical axis. Each time a sample is taken, the waveform has
a specific voltage at that instant, and that is what is being
recorded in the form of bits. So the next time you hear
44.1K/24 bit, you will know what those terms mean. CDs are
encoded digitally at 16 bit, so that is a pretty common bit
depth to record at, although 24 bit is extremely
popular too. Keep in mind though, that higher bit depth and
sample rates result in larger files that take up more space
on your hard drive. This may be an issue to you when you start
recording. Also as a side note, you can figure out the
dynamic range of your recordings by multiplying the bit depth
by 6dB. For example, a 16 bit recording has a 96dB
dynamic range.
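
The numbers above are easy to check for yourself. This is a small Python sketch under a few assumptions of mine: the function names are made up, the file size figure assumes uncompressed stereo audio, and the dynamic range uses 20*log10 of the number of values, which works out to about 6.02dB per bit (the source of the 6dB rule of thumb).

```python
import math

def quantization_levels(bit_depth):
    """Number of different values a sample can land on."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range, roughly 6.02dB for every bit."""
    return 20 * math.log10(2 ** bit_depth)

def bytes_per_minute(sample_rate, bit_depth, channels=2):
    """Storage for one minute of uncompressed audio (assumed stereo)."""
    return sample_rate * (bit_depth // 8) * channels * 60

print(quantization_levels(16))            # 65536
print(round(dynamic_range_db(16), 1))     # 96.3
print(round(dynamic_range_db(24), 1))     # 144.5
print(bytes_per_minute(44100, 16))        # 10584000, about 10MB
print(bytes_per_minute(96000, 24))        # 34560000, about 35MB
```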
You may hear the term quantization
error when reading about bit depth. Quantization error is
the difference between the quantization value that is
recorded and the actual voltage value of the signal. The
point at which the sample is taken may not exactly land
right on a specific quantization value, and so the closest
value will be used. If the bit depth is too small, there will
be a larger distance between the possible values used to
represent the volume. The value used to represent the volume
may be slightly off because it was only the closest one to the
original signal, and this results in added noise in your recording.
Therefore, it is important to use a high bit depth to reduce
this quantization noise and improve the signal-to-noise ratio. In
light of this, no digital system can perfectly represent an
analog signal because, while an analog signal varies
continuously in level, a digital system can only store a
finite number of values, so a little rounding error always
remains. Thankfully, digital systems can take
enough samples over the course of one second to not sound
choppy. Depending on the brand of converters, this can fool
our ears into thinking there isn’t much difference between
a true analog signal vs. a digital one. However, really low
sample rates and bit depths sound very lo-fi, so try to use at
least 44.1K/16bit rates unless you are going for an effect
or don’t really care about the quality.
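
Here is a rough Python sketch of quantization error, assuming a simple converter that rounds each voltage to the nearest available value over a range of -1.0 to 1.0. The names `quantize` and `worst_error` are mine, and real converters are more sophisticated than this.

```python
import math

def quantize(voltage, bit_depth):
    """Snap a voltage in the range -1.0 to 1.0 to the nearest available value."""
    step = 2.0 / (2 ** bit_depth)       # distance between neighboring values
    return round(voltage / step) * step

def worst_error(bit_depth, n=1000):
    """Largest gap between the real and stored voltage over one sine cycle."""
    worst = 0.0
    for i in range(n):
        v = 0.9 * math.sin(2 * math.pi * i / n)
        worst = max(worst, abs(v - quantize(v, bit_depth)))
    return worst

# The error can never be more than half a step, so more bits mean less noise.
for bits in (4, 8, 16):
    print(bits, worst_error(bits))
```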
Another good piece of information
to know is the Nyquist Theorem. This is a rule that states
your sampling rate should be at least 2 times the highest
frequency you are trying to record. Therefore, if you are
playing frequencies up to 16KHz, then your sample rate must
be at least 32,000 samples per second. Since our hearing
range only reaches up to 20KHz, using a sample rate of 44.1K
is acceptable because it allows us to record signals up to a
22.05KHz frequency. The reason this is important is because
the highest frequencies have such short cycles that
they need a fast sample rate to catch them. Recording at
44.1K will allow for just over two samples to be taken for
every cycle of a 20KHz wave.
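
The Nyquist arithmetic is simple enough to sketch in a few lines of Python. The function names are just ones I picked for illustration.

```python
def minimum_sample_rate(highest_freq_hz):
    """Lowest sample rate the rule allows for a given top frequency."""
    return 2 * highest_freq_hz

def nyquist_frequency(sample_rate):
    """Highest frequency a given sample rate can capture."""
    return sample_rate / 2

print(minimum_sample_rate(16000))  # 32000
print(nyquist_frequency(44100))    # 22050.0
print(44100 / 20000)               # 2.205 samples per 20KHz cycle
```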
The result of having a frequency
higher than the sample rate can catch is called Aliasing.
Because a high frequency has such a short cycle, its peaks
may fall between the samples, making the samples trace out a
lower frequency than is actually present. To prevent this,
digital recorders have an anti-aliasing filter to get rid of
frequencies too high for the sample rate to accurately
catch. If the sample rate is 44.1K, then any frequencies
above 22.05KHz will be taken out.
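
To see where an aliased frequency ends up, here is a small Python sketch. It assumes there is no anti-aliasing filter in the way, and `alias_frequency` is a name I made up. The loop at the end checks that a 30KHz tone sampled at 44.1K produces the same sample values as a 14.1KHz tone (just flipped upside down).

```python
import math

def alias_frequency(freq_hz, sample_rate):
    """Frequency a tone appears to have once it has been sampled."""
    folded = freq_hz % sample_rate
    return min(folded, sample_rate - folded)

print(alias_frequency(10000, 44100))  # 10000, low enough to be captured
print(alias_frequency(30000, 44100))  # 14100, too high, so it folds down

# Sample by sample, the 30KHz tone matches an inverted 14.1KHz tone.
matches = all(
    abs(math.sin(2 * math.pi * 30000 * n / 44100)
        + math.sin(2 * math.pi * 14100 * n / 44100)) < 1e-9
    for n in range(100)
)
print(matches)  # True
```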
At this point, I should explain the
signal flow of digital recorders so you know what is going
on in there. The audio signal first enters the A/D converter
and goes through a low pass filter (anti-aliasing filter) to
take out the high frequencies. Then, it goes to the
converter, which samples the signal. Actually, it
oversamples the signal to make sure the whole thing is
accurately represented. This is why you sometimes hear the
term "128X Oversampling". It just means that it is
taking 128 times more samples than the rate of 44.1K, or
whatever the actual sample rate is. Then these extra samples
get filtered out after conversion to only store 44100
samples per second. When these recordings are played back,
they have to go through a digital-to-analog conversion to be
heard. This begins with data going to the converter, which
again oversamples and creates samples to represent missing
values between the actual stored ones. At this point, we now
have samples numbering 128X the sample rate. Next, these
extra samples need to be filtered out at the anti-imaging
filter. That way, only the actual voltages that are stored
are created and played back.
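
As a very rough Python sketch of the oversampling idea: the converter runs at a multiple of the sample rate and then reduces the result to the stored rate. Averaging each block of samples, as `decimate` does here, is only a crude stand-in I chose for the real filtering inside a converter, and both function names are hypothetical.

```python
def oversampled_rate(sample_rate, factor=128):
    """Internal rate of a converter that oversamples by `factor`."""
    return sample_rate * factor

def decimate(samples, factor):
    """Average each block of `factor` samples down to one stored sample."""
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, len(samples) - factor + 1, factor)]

print(oversampled_rate(44100))         # 5644800 samples per second
print(decimate([1, 1, 3, 3], 2))       # [1.0, 3.0]
```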
Now I want to cover the concept of
dither. You may hear this a lot and many people don’t know
what it really means. When a signal is decreasing in volume,
the number of bits representing the signal will get smaller
until it isn’t represented anymore. With a 16 bit system,
this occurs at the top of the noise floor and can often
cause an abrupt level jump from audible to nothing
instead of a smooth fade. This is because the volume
difference between bits may not be enough to capture
low-level signals. Dither adds a bit of white noise to the
signal to allow it to be represented down to silence, so
that we don’t hear the jump from noise to silence. If the
original signal falls below the lowest bit that can
represent its volume, then the white noise bumps it up
enough so it is represented again. You generally don’t need
dither in a 24 bit system since more bits are located within
the noise floor, but if you are converting a file to 16 bit from a
bit depth that was higher, then you need to have some sort
of dither included in the conversion. Hopefully your program
will do that for you.
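
Here is a small Python sketch of why dither helps. It is only an illustration under my own assumptions: levels are measured in units of one quantization step, the quiet signal sits at a made-up level of 0.3 of a step, and the noise is triangular dither (two random values added together), which is one common choice rather than what any particular program does.

```python
import math
import random

def quantize_lsb(value_in_steps):
    """Round a level, measured in quantization steps, to the nearest whole step."""
    return math.floor(value_in_steps + 0.5)

def triangular_dither(rng):
    """Low-level noise made by adding two random values together."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

rng = random.Random(0)            # fixed seed so the run is repeatable
quiet_signal = [0.3] * 20000      # too quiet to reach the first step

plain = [quantize_lsb(v) for v in quiet_signal]
dithered = [quantize_lsb(v + triangular_dither(rng)) for v in quiet_signal]

print(sum(plain) / len(plain))        # 0.0, the signal has vanished
print(sum(dithered) / len(dithered))  # close to 0.3 on average
```

Without the noise, every sample rounds to zero. With it, the samples bounce between neighboring values in a way that preserves the quiet signal on average.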
The last thing that we should cover
is Word Clock. You probably won't use this at home as much as
professional studios do but it is still nice to be familiar
with it. Word clock is a synchronization signal sent out
from a master digital unit to another digital unit that
aligns the second machine's samples with the first. This
means that the spacing between samples during the whole
recording process will remain constant between the two
machines and won’t drift. One example where you would use
this is with a Pro Tools system having two or more 888
digital converters. Each has 8 inputs and outputs so if you need
to record more than 8 tracks at a time, you would want both
units sampling their signals at exactly the same time.
Another example is if you don’t like the sound of your
converters, you can buy a clocking unit that is used only to
be a master clock for all your digital gear. It outputs a
steady clock rhythm and can really clean up the sound of
your recordings. These units may also output superclock,
which runs 256 times faster than word clock and is used to
keep the recording software on your computer sampling at
the right rate.
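
To get a feel for why two unsynchronized machines drift, here is a back-of-the-envelope Python sketch. The 50 parts-per-million clock mismatch and the five minute song are numbers I made up for illustration, and the function names are hypothetical too.

```python
def drift_in_samples(sample_rate, ppm_offset, seconds):
    """How far apart two free-running clocks end up, measured in samples."""
    return sample_rate * (ppm_offset / 1_000_000) * seconds

def superclock_hz(sample_rate):
    """Superclock runs at 256 times the word clock rate."""
    return 256 * sample_rate

# Assumed 50 ppm mismatch over a 5 minute (300 second) recording at 44.1K:
# the two machines end up several hundred samples apart.
print(drift_in_samples(44100, 50, 300))

print(superclock_hz(44100))  # 11289600
```

With a shared word clock, both units take their timing from the same source, so that mismatch never builds up.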
Hopefully, I have explained all
this in a way you can understand. It’s a lot to take in
and worth a second or third read. But I think once you know
how it all works in theory, you can understand how to use
these concepts to your advantage, like the importance of
using good converters and what sample rates and bit depths to use
for your projects. Take care and good luck in the new year! |