Definition
NOTE: Make sure you are familiar with basic sound theory before reading this article.
Analogue to digital converters work by repeatedly measuring the amplitude (volume) of an incoming soundwave (represented as an electrical voltage), and outputting these measurements as a long list of binary numbers. In this way, a mathematical "picture" of the shape of the wave is created.
Remember join-the-dot pictures? To produce a good image you must have sufficient dots to capture the detail of the shape AND the dots must be positioned accurately.
Quality in a join-the-dots picture depends on ...
Number of dots
Accuracy of the positioning of the dots
Image designers know that the quality of an image also depends on 2 factors ...
Number of pixels per inch (ppi resolution)
Number of colours in the image's palette (determined by bit depth)
Creating a good quality digital audio signal depends on 2 similar parameters, which control the quality of the audio conversion process. These are ...
Sample rate (number of measurements of amplitude per second). Sample rate is to audio what ppi or dpi is to images.
Bit depth (accuracy of each measurement of amplitude). Bit depth in images determines the number of possible colours a pixel can be.
What is a sample?
Although it is common to use the word "sample" to refer to a complete sound (perhaps a piano note or drum break/loop), in digital theory ...
a "sample" is a single measurement of amplitude.
A sample may also be referred to as a ...
Snapshot
Sample measurement
Sample rate is simply the number of samples (or measurements of amplitude) taken per second.
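As a rough illustration of what "samples per second" means, the sketch below measures a sine wave at regular intervals. The function name `sample_sine` and its parameters are made up for this example, and a real converter measures a voltage rather than evaluating a formula.

```python
import math

def sample_sine(freq_hz, sample_rate, duration_s):
    """Return amplitude measurements of a sine wave, taken sample_rate times per second."""
    count = int(sample_rate * duration_s)
    # Each list entry is one "sample": a single measurement of amplitude.
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

samples = sample_sine(freq_hz=440, sample_rate=44100, duration_s=1)
print(len(samples))  # one second at 44.1KHz gives 44100 samples
```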
Sample rate is also known as ...
Sample frequency. CD quality sample rate is expressed as "44.1KHz", meaning simply that the converter takes 44,100 measurements of amplitude per second.
Sample bandwidth.
IMPORTANT NOTE: DO NOT confuse "Sample Frequency" with "Audio Frequency". Sample frequency is independent of the frequency of the soundwaves being converted.
Once set, sample rate does not vary during a recording, although different audio files recorded at different sample rates may be used together in a multitrack system if the software permits it.
Higher sample rates produce better quality recordings but also bigger file sizes which demand greater space on storage devices (such as hard drives), and faster processors (CPUs) to manipulate.
Lower sample rates produce poorer quality but also smaller file sizes which demand less of storage systems, CPUs and will transfer over networks (internet) faster.
Example sample rates
The CD sample rate is 44,100 samples per second. This means that a converter will produce 44,100 measurements of amplitude for every second of sound.
DVD can use sample rates of up to 96,000 samples per second.
MP3 files may be encoded at a variety of sample rates. The trade-off is always between quality and file size.
During his research into signal transmission in the first half of the 20th century, Harry Nyquist (a scientist) produced a simple rule that should be followed to determine appropriate sample rates for differing sounds.
"The sample rate should be a little over twice the amount of the highest audio frequency (harmonic) to be recorded if poor sound quality is to be avoided".
Because humans can hear audio frequencies as high as 20KHz (20,000cps/Hz), a minimum sample rate of 44.1KHz (or 44,100 sample measurements a second) was decided upon ...
Human audio spectrum = 20Hz to 20,000Hz (20KHz) ... therefore ...
Highest audio frequency = 20,000Hz ... therefore ...
20,000 x 2 = 40,000 ... plus "a little bit more" ... gives 44,100 samples per second
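The arithmetic above can be written as a small sketch. The function names are hypothetical, and the first function returns only the bare "twice the highest frequency" figure; the extra margin is left to whoever chooses the actual rate.

```python
def minimum_sample_rate(highest_audio_hz):
    """Nyquist's rule of thumb: the sample rate must exceed this value."""
    return 2 * highest_audio_hz

def highest_recordable_frequency(sample_rate):
    """The highest audio frequency a given sample rate can accommodate."""
    return sample_rate / 2

print(minimum_sample_rate(20000))           # 40000
print(highest_recordable_frequency(44100))  # 22050.0
```

So 44,100 samples per second sits a little above the 40,000 needed for a 20KHz upper limit.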
However, it is possible to use lower sample rates and maintain good sound quality when recording sound without high frequency harmonics such as basses and kick drums.
NOTE: Increasing the sample rate above 44.1KHz does not dramatically improve the sound. Increasing bit-depth has more impact.
If the sample rate is set too low, a type of distortion called "aliasing" will be audible in the signal when it is converted back to analogue by a DAC (digital to analogue converter).
Analogue to digital converters therefore use a filter to remove any harmonics from the sound wave which are above the highest frequency that the sample rate can accommodate, before it is measured. Thus, an anti-aliasing filter in a CD recorder will remove any harmonics from the sound wave above 20KHz.
Aliasing is discussed in greater depth in the digital theory pdf
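To get a feel for where aliased harmonics end up, here is a minimal sketch. It assumes an ideal sampler with no anti-aliasing filter and uses the standard frequency "folding" calculation; `alias_frequency` is a name invented for this example.

```python
def alias_frequency(tone_hz, sample_rate):
    """Frequency at which a tone appears after sampling (always 0 .. sample_rate / 2)."""
    nearest_multiple = round(tone_hz / sample_rate) * sample_rate
    return abs(tone_hz - nearest_multiple)

print(alias_frequency(10000, 44100))  # 10000 (below half the sample rate, unchanged)
print(alias_frequency(30000, 44100))  # 14100 (too high, so it folds back into the audible range)
```

A 30KHz harmonic that nobody could hear would therefore reappear as an audible 14.1KHz tone, which is why it has to be filtered out before measurement.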
Jitter refers to irregularities in the time intervals between samples. Jitter can occur when ...
Therefore the accuracy of the digital clock, which governs when samples occur, is paramount. If the clock is not accurate, jitter will occur and the audio quality will suffer.
As an example, consider a digital signal that has been created by a (theoretically) perfect A to D converter where each sample is taken at precisely consistent intervals. If that signal is later sent through a digital system whose clock is less accurate (causing the sample intervals to fluctuate), then the correct sample amplitudes may not occur at the right places, causing audible distortion.
If the clock irregularities/timing errors are random, then so are the jitter and the resulting amplitude inaccuracies/distortion. Random distortion is noise. Because these timing errors are small and fast, they produce more amplitude distortion in the higher frequencies. The audible result is hiss.
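The link between jitter and high frequencies can be sketched with a small simulation. Everything here is an assumption made for illustration: the function name, the 10 nanosecond random timing error, and the use of a pure sine wave as the signal.

```python
import math
import random

def jitter_rms_error(freq_hz, sample_rate=44100, jitter_s=1e-8, count=2000, seed=0):
    """RMS amplitude error caused by sampling a sine wave at slightly wrong times."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    total = 0.0
    for n in range(count):
        ideal_time = n / sample_rate
        actual_time = ideal_time + rng.gauss(0.0, jitter_s)  # clock ticks early or late
        ideal = math.sin(2 * math.pi * freq_hz * ideal_time)
        actual = math.sin(2 * math.pi * freq_hz * actual_time)
        total += (actual - ideal) ** 2
    return math.sqrt(total / count)

print(jitter_rms_error(100))    # low frequency: small error
print(jitter_rms_error(10000))  # high frequency: much larger error
```

The same timing errors cause a far bigger amplitude error on the 10KHz wave, because its amplitude changes much faster between one instant and the next.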
Increasing sample rate will not always significantly improve sound quality. Increasing bit depth will result in a more obvious improvement in sound quality for most listeners.
Simply put ...
... in digital audio, bit depth determines the accuracy of each sample measurement. Better accuracy means less distortion which results in better sound!
In audio files, higher bit depth means better sound quality. In short, higher bit depths provide a converter with a more accurate "ruler" (higher bit resolution) to measure amplitude with, thereby producing more accurate measurements. In audio quality terms, more accurate measurements mean less distortion of the true shape of the soundwave.
4 bits = 16 levels
8 bits = 256 levels
16 bits = 65,536 levels
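These figures follow from the fact that each extra bit doubles the number of available values. A minimal sketch (the function name is made up here, and 24 bits is included for comparison):

```python
def quantisation_levels(bit_depth):
    """Number of distinct amplitude values an n-bit measurement can take."""
    return 2 ** bit_depth

for bits in (4, 8, 16, 24):
    print(bits, "bits =", format(quantisation_levels(bits), ","), "levels")
```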
Example bit-depths ...
8-bit sampling system
In an 8-bit A-to-D converter, each measurement is recorded as an 8-bit binary byte. Between 00000000 and 11111111 there are 256 possible values. (See computer counting systems). This means that each sample measurement of amplitude will be recorded as one of these numbers.
A "ruler" with 256 divisions, or "points of resolution", is NOT very accurate. If, when a measurement is taken, the amplitude of the wave does not fall exactly on one of these points, then the measurement must be rounded up or down to the nearest point. This process is called Quantisation and results in a distorted recording of the true shape of the wave.
The difference between the true amplitude and the rounded measurement is known as a quantisation error, and it produces quantisation distortion. At loud signal levels quantisation errors manifest themselves as noise (similar to analogue noise), but at low signal levels they can manifest themselves as unwanted audible distortion.
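The rounding step can be sketched as follows. This assumes amplitudes are expressed in the range -1.0 to 1.0 and that the levels are evenly spaced; `quantise` is a hypothetical helper, not how any particular converter is built.

```python
def quantise(amplitude, bit_depth):
    """Round an amplitude in the range -1.0..1.0 to the nearest of 2**bit_depth levels."""
    top = 2 ** bit_depth - 1                      # number of the highest level
    level = round((amplitude + 1.0) / 2.0 * top)  # nearest point on the "ruler"
    return level / top * 2.0 - 1.0                # convert back to the -1.0..1.0 range

for bits in (8, 16):
    error = abs(quantise(0.3, bits) - 0.3)
    print(bits, "bit quantisation error:", error)
```

The error printed for 16 bits is far smaller than the one for 8 bits, because the points on the ruler are much closer together.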
The effects of quantisation errors are most apparent at lower bit depths. Higher bit depths increase the quality of sound but also the quantity of data and therefore file sizes. 16-bit samples are of course twice as big as 8-bit samples. CD quality sound requires roughly 5MB of storage space for 1 mono minute (10MB for a stereo minute).
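The storage figures can be checked with a short calculation. This sketch counts raw, uncompressed audio data only and assumes the bit depth is a whole number of bytes; the function name is invented for the example.

```python
def audio_data_bytes(sample_rate, bit_depth, channels, seconds):
    """Size of uncompressed audio data, ignoring any file header."""
    bytes_per_sample = bit_depth // 8  # assumes bit_depth is a multiple of 8
    return sample_rate * bytes_per_sample * channels * seconds

mono = audio_data_bytes(44100, 16, 1, 60)
stereo = audio_data_bytes(44100, 16, 2, 60)
print(mono)    # 5292000 bytes, roughly 5 megabytes
print(stereo)  # 10584000 bytes, roughly 10 megabytes
```

Changing the bit depth to 24 or the sample rate to 96000 shows how quickly higher quality settings increase the amount of data.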
Here are some simple rules ...
The higher the bit depth, the larger the file size, the smaller the quantisation errors, the less the distortion, the better quality the sound
The lower the bit depth, the smaller the file size, the larger the quantisation errors, the more the distortion, the poorer quality the sound

CD quality is 44.1KHz / 16-bit. This means that every second a converter will produce 44,100 16 bit numbers.
44.1KHz / 24-bit recordings are higher quality than 44.1KHz / 16-bit recordings. Of course the sound files are bigger, but in general, current CPU power, installed RAM and hard disk capacity can handle them. Many sound engineers are now using 24-bit as standard even though the finished mixes must be converted to 16-bit prior to audio CD duplication.
It is generally agreed that 44.1KHz / 24-bit recordings sound superior to those made at 96KHz / 16-bit.
It is important to recognise that even if a converter has a high bit-depth, setting the record level too low will result in a smaller range of bits being used and effectively reduce the bit-depth of the recording.
Setting the record level too high however, will risk digital clipping, an unpleasant distortion that is the result of all sample measurements that exceed the upper limit of the bit-depth range of the system being quantised down to the highest available value. For example, in a 16-bit system this might be 1111111111111111. Picture a mountain with its peak sliced off.
It is therefore important that recordings are made at the highest possible level without clipping, which explains why the signal is often passed through an audio limiter before it enters the A to D converter.
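Both problems can be sketched in a few lines. The helpers are hypothetical: `effective_bits` uses the approximation that each halving of the peak level leaves one bit unused, and `clip` assumes measurements are stored as unsigned values from 0 up to the all-ones maximum.

```python
import math

def effective_bits(bit_depth, peak_fraction):
    """Approximate bits in use when the loudest peak reaches peak_fraction of full scale."""
    return bit_depth + math.log2(peak_fraction)  # peak_fraction must be above 0

def clip(sample, bit_depth):
    """Force a measurement into the range 0 .. 2**bit_depth - 1."""
    highest = 2 ** bit_depth - 1  # 1111111111111111 in a 16-bit system
    return max(0, min(sample, highest))

print(effective_bits(16, 0.25))  # 14.0 (recording at a quarter of full level wastes 2 bits)
print(clip(70000, 16))           # 65535 (anything above the top value is flattened to it)
```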
The process of converting a high bit depth audio signal to a lower one is most commonly referred to as truncating. Essentially some of the bits in each byte/sample are thrown away (the least significant bits to be precise).
| 24 bit byte/sample before truncating | 16 bit byte/sample after truncating |
|---|---|
| 100101110100010111100001 | 1001011101000101 (11100001 has been removed) |
The effect of this process on an audio signal is to "magnify" quantisation errors which can result in audible distortion, especially in the quieter parts of an audio signal.
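In code, truncation amounts to shifting the unwanted bits off the end of each sample. A minimal sketch using the value from the table, assuming unsigned samples (`truncate` is a name made up for this example):

```python
def truncate(sample, from_bits, to_bits):
    """Discard the least significant bits of a sample."""
    return sample >> (from_bits - to_bits)

before = 0b100101110100010111100001  # 24-bit sample
after = truncate(before, 24, 16)
print(format(after, "016b"))         # 1001011101000101
```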
Audio dithering is a process whereby low level white noise (random sound) is introduced into the signal to help randomise quantisation errors. The effect of this is to turn the audible effects of quantisation errors from unpleasant distortion into the more acceptable sound of analogue-style noise.
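The idea can be sketched by adding a small random value to each sample before the extra bits are discarded. This is a deliberately simple form of dither chosen for illustration, with unsigned samples assumed and a made-up function name; real dithering algorithms typically shape the noise more carefully.

```python
import random

def dither_and_truncate(sample, from_bits, to_bits, rng=random):
    """Add random noise smaller than one new-size step, then discard the low bits."""
    shift = from_bits - to_bits
    noise = rng.randrange(2 ** shift)        # 0 .. 255 when going from 24 to 16 bits
    highest = 2 ** from_bits - 1
    return min(sample + noise, highest) >> shift  # clamp so the noise cannot overflow

rng = random.Random(1)                       # fixed seed so the run is repeatable
sample = 0b100101110100010111100001
results = [dither_and_truncate(sample, 24, 16, rng) for _ in range(10000)]
print(sum(results) / len(results))
```

Plain truncation would give 38725 every time. With dither the result varies randomly between 38725 and 38726, and its average lands close to the original 24-bit value divided by 256, so the detail in the discarded bits survives as noise rather than being lost as distortion.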
Dithering is most commonly used at the CD mastering stage of music production, but dither can be used for other reasons too. The following are some of the processes that involve dithering ...
You have created a 24-bit / 44.1KHz audio mix master of a recording in your DAW which needs to be converted to 16-bit in order to conform to the red book audio CD standard. During the conversion process you use a dithering algorithm to minimise the increase in distortion that will result from the "enlargement" of existing quantisation errors.
When passing a signal digitally between two devices, such as a DAW and a digital mixer, the signal may be converted and dithered if the bit-depths of the two systems don't match (the sample rate must match, otherwise a sample rate converter will need to be used).
Some effect processors allow you to set parameters for dither, which will automatically be introduced into the signal if it drops below a certain level.
Many A to D converters automatically dither as part of the sampling process, and applications and software which allow downsampling or bit-depth conversion often give the user the option to introduce dither and to control the amount of dither.