Let’s get into the nitty-gritty and break down the differences between 16-bit and 24-bit sound quality.
Do you still remember 12-bit sampling on the Akai MPC60? It was once everyone’s go-to production trick — the magic sauce that had your monitors punching out that gritty drum sound. Every hip-hop producer loved it to bits!
But gone are the days when samplers were actual hardware units sitting in your studio rack or on your desk. Today, 24-bit audio is available even in streaming. So what exactly is bit depth, and why should you care about it?
Bit depth refers to the number of bits you have to capture audio content in a single moment of time. Learn more about it, as well as the differences between 16-bit and 24-bit, right below.
Bit Depth Guide for Dummies: Breaking Down the Bits
Fun fact: an analog source signal (a continuous curve) is converted into a digital representation that can then be processed in the digital domain.
Your audio interface (e.g., portable recorder, sound card, hardware sampler) uses an analog-to-digital converter (ADC) to convert an analog electrical signal, such as voltage or current, into a digital bitstream.
This bitstream can then be processed and stored as binary numbers.
When we want to play back the captured audio data, we send it to a digital-to-analog converter (DAC), which converts the bitstream back into voltage or current.
It’s important to understand now that sampling (or digital capturing) is essentially a process that creates a digital ‘snapshot’ of an analog signal (that’s constantly varying). It is based on two main settings:
- Speed of sampling (sample rate which is expressed as a number of samples taken during 1 second)
- Range of sampling (bit depth)
What this means in practice is that a digital recording can never be an exact representation of the original signal, but rather a “sample” of it. And that’s why we have something called “sampling” in music.
Let’s talk music geek for a bit, shall we?
All About Bits
In the computing world, strings of binary digits or bits are used to describe anything a computer does. Computers are able to manage entire strings of these at a time.
16 bits means there are 16 binary digits in a word, each digit representing either a 0 or a 1.
24 bits means there are 24 digits in a word, and the same idea extends to higher bit depths.
A sample recorded at 16 bits can, therefore, contain over 65,000 (65,536) levels. Whereas 24 bits can contain over 16 million (16,777,216) unique levels.
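Those level counts follow directly from the number of bits: each extra bit doubles the number of values a word can hold. Here is a quick sanity check in Python (the helper name is just for illustration):

```python
# Number of discrete amplitude levels at a given bit depth:
# each extra bit doubles the count, so levels = 2 ** bits.
def levels(bits: int) -> int:
    return 2 ** bits

print(levels(16))  # 65536
print(levels(24))  # 16777216
```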
The difference between 16-bit and 24-bit sounds huge — at least when you look at the number of available levels.
Dynamic Range: Key Differences Between Bits
But look at this difference in terms of dynamics:
Imagine the quietest whisper and the loudest bang in a concert. That’s essentially your dynamic range.
But although bigger bit depth is technically better (24-bit adding more ‘resolution’ compared to 16-bit), this added resolution doesn’t necessarily mean higher quality. It just means we can encode a larger dynamic range.
The term resolution might be a little misleading here since many think of it as being similar to adjusting the screen resolution of your computer monitor.
But when you turn down the bit depth of a file, you get an increasing amount of low-level noise, a bit like tape hiss.
And that’s why dynamic range is sometimes referred to as signal to noise ratio (SNR or S/N). Though there are some differences between them.
Here is what Head-Fi said about a common misunderstanding regarding bit depth:
The only difference between 16-bit and 24-bit is 48 dB of dynamic range (8 bits x 6 dB = 48 dB) and nothing else.
This is not a question for interpretation or opinion. It is the provable, undisputed logical mathematics that underpins the very existence of digital audio.
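The arithmetic in that quote uses the common rule of thumb of roughly 6 dB of dynamic range per bit; the exact figure is 20·log10(2) ≈ 6.02 dB. A short Python check:

```python
import math

# Rule of thumb: each bit contributes 20 * log10(2) ≈ 6.02 dB of dynamic range.
DB_PER_BIT = 20 * math.log10(2)

def dynamic_range_db(bits: int) -> float:
    return bits * DB_PER_BIT

print(round(dynamic_range_db(16), 1))  # 96.3
print(round(dynamic_range_db(24), 1))  # 144.5
print(round(dynamic_range_db(24) - dynamic_range_db(16), 1))  # 48.2
```

Rounding each bit to a flat 6 dB gives the 48 dB figure quoted above.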
Try watching this video to understand more about the difference in bit depth:
Hearing the Decibels
Now, the important question here is: Could you actually hear that 48 dB difference in music?
Researchers conducted a small study which revealed that even the most experienced subgroups (musicians, sound engineers, and hardware reviewers) couldn’t reliably tell the difference between original 24-bit music and the same files dithered down to 16 bits (then fed to the DAC in a 24-bit container) in A-B listening tests.
And this was even with access to equipment costing more than $6,000.
This reveals an interesting point: even symphony orchestra recordings (which can have a dynamic range greater than 60 dB) don’t really benefit from this technical advantage.
That additional 48 dB starts to sound almost laughable when we realize that some types of music today can have a dynamic range of just 12 dB.
Maybe the younger generation of music producers shared the meme ‘do you even compress bro?’ too much so it became the rule of the day?
More Complex Data
Joking aside, that’s not the whole story about bit depth. So far, we have only talked about simple data, such as single mono and stereo files.
But working with sound in the digital domain means we’re usually mixing multiple digital audio bitstreams to generate a single stereo or multichannel output.
And that’s where this additional resolution becomes handy.
Files that were captured with bigger dynamic range will have better signal to noise ratio and higher level of detail (smaller steps between points of amplitude).
Additionally, there is this funky thing called floating point. This term can be even more confusing, so take a deep breath first. Ready now?
As mentioned, the converters inside modern digital audio interfaces can theoretically handle 24-bit ADC/DAC resolution.
But that is only in theory: in practice, there is probably no audio system in the world delivering more than about 20 clean bits of signal, due to resistance and semiconductor noise characteristics.
Room for Enhancement
But that can still be enhanced through arithmetic calculation:
Using floating-point binary arithmetic instead of fixed-point allows a far greater range of numbers to be represented using the same number of bits.
At first glance, the theoretical dynamic range of roughly 1,680 dB offered by 32-bit floating point (which carries a 24-bit mantissa) does not make a lot of sense, as human hearing ranges from 0 dB (the threshold of hearing) to about 150 dB (the threshold of pain).
Instead, you should think about it as added resolution. That is the accuracy with which analog amplitude can be represented in the digital domain.
Now, focus! Floating-point arithmetic will not protect you from clipping during recording, since the converter itself still works at a fixed 24-bit depth.
A 32-bit floating-point calculation is, however, significantly more accurate than a 24-bit fixed-point one.
Data loss through rounding (which affects audio quality) is reduced, and signals pass through plug-in chains more cleanly if the audio files themselves are encoded as 32-bit float.
This advantage is not limited to just encoding audio files during recording though.
The mixing engine inside a DAW can also be set to use floating-point calculation, and that makes a big difference. Floating-point binary arithmetic helps you avoid problems like:
- clipping during rendering using plugins
- unnecessary noise introduced by plugin dithering
- rounding errors during signal processing
Want fool-proof exports of your mix?
Always export your files using floating-point calculation, since these files can recover from clipping disasters (assuming the plug-ins you used also work in floating point).
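To make this concrete, here is a minimal pure-Python sketch (the function and variable names are illustrative, not from any real DAW API) of why fixed-point clipping destroys information while floating-point overs can be recovered with a simple gain change:

```python
# Minimal sketch: fixed-point audio clamps at full scale, float does not.
# Samples are normalized so that 1.0 = full scale.

def to_fixed16(x: float) -> int:
    """Quantize to a 16-bit integer, clamping at full scale (information lost)."""
    return max(-32768, min(32767, round(x * 32767)))

hot_mix = [0.5, 1.4, -1.2, 0.9]           # peaks exceed full scale

fixed = [to_fixed16(s) for s in hot_mix]   # 1.4 and -1.2 are flattened forever
print(fixed)                               # [16384, 32767, -32768, 29490]

# A float file keeps the overs; pulling the gain down afterwards
# brings every sample back inside full scale with its shape intact.
recovered = [s * 0.5 for s in hot_mix]
print(all(abs(s) <= 1.0 for s in recovered))  # True
```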
Still with me? Now that bit depth is demystified, we really should have a look at sample rate and the speed of capturing audio signal next.
Sample Rate in 16-Bit vs. 24-Bit Audio
Sample rate is given in hertz (Hz) which is a derived unit of frequency.
One hertz corresponds to one cycle per second: within each cycle, the amplitude rises from 0 to +1, falls back through 0 to -1, and returns to 0.
We humans perceive frequency of sound waves as pitch. But in sampling and sample rate, this essentially refers to the speed of capture.
A reel-to-reel tape recorder is a perfect analogy to understand this: The faster the recording speed, the better the reproduction quality.
Technically speaking, a machine could take a sample rate from the source signal at any speed that the ADC converter design allows.
But there is a certain minimum requirement that is directly connected to the frequency area that we humans are able to hear when it comes to sample rate.
This ranges roughly from 20 hertz to 20 kilohertz. It also varies from person to person due to individual physical properties.
This means that by design, our ears can only comprehend the frequencies that fall in between this area.
No upgrades available, folks.
Nyquist-Shannon Sampling Theorem
But to capture those frequencies convincingly in the digital domain, we also need to follow a certain law called the Nyquist-Shannon sampling theorem.
It states that the sampling frequency must be greater than twice the maximum frequency one wishes to reproduce.
Otherwise, our ears will tell the difference between the analog source signal and its digital representation. In theory this sets the minimum sample rate at 40,000 hertz (twice the 20 kHz limit of hearing); allowing headroom for real-world filters pushes the practical minimum to roughly 44,000 hertz.
Additionally, to avoid a sampling problem called aliasing (more about this later), the source signal must also be low-pass filtered before sampling.
Mathematically speaking, 44.1 kilohertz is ideal as it is the product of the squares of the first four prime numbers.
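That bit of number trivia is easy to check:

```python
# 44,100 is the square of the product of the first four primes: (2*3*5*7)^2.
product = 1
for p in [2, 3, 5, 7]:
    product *= p        # 2 * 3 * 5 * 7 = 210
print(product ** 2)     # 44100
```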
But the real reason why it was adopted is actually historical.
Back in the days of the VCR, the limitations of the circuitry used in now-obsolete video recording technology capped this number at a maximum of 44.1 kHz, and it was widely adopted by the major players in the industry.
In modern technology, sampling can run significantly faster, and sample rates such as 48, 88.2, 96, or even 192 kilohertz are now commonly available.
Now, do you have that DJ friend who always claims LPs sound superior to digital recordings?
Although not obvious, there is one theory out there claiming he might be onto something, especially with sample rate.
It’s that analog recording and playback equipment can resolve frequencies equivalent to sample rates of up to 50 kilohertz, comfortably above the 44.1 kHz standard known as CD quality.
The theory is that, although we cannot hear it, audio energy still exists at those higher frequencies.
Therefore, it may affect the listening experience positively in an all-analog playback system.
All About Vinyl
Whoa, hold your horses! Can vinyl performance shine in areas other than sample rate?
Here we need to understand the technical limitation of a direct-cut vinyl record, which may only reach a dynamic range of roughly 70 dB, whereas a compact disc can reach 98 dB.
The Thing About Digital Sampling
Digital sampling is not completely problem-free, though.
We mentioned earlier the noise produced by semiconductor components, which will always be present.
But sometimes noise is good. Almost any kind of signal processing causes a reduction of bits, and prompts the need to use dithering, which essentially adds noise to the signal.
This has the effect of spreading the many short-term errors across the audio spectrum as broadband noise.
A typical usage example would be reducing bit depth of an audio file from 24 bits to 16 bits. Dithering in this case would be done by adding noise of a level less than the least-significant bit before rounding to 16 bits.
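Here is a minimal sketch of that 24-to-16-bit reduction, assuming triangular (TPDF) dither; the function name and sample values are purely illustrative:

```python
import random

random.seed(1)  # deterministic noise for this example

def dither_to_16bit(sample_24: int) -> int:
    """Reduce a 24-bit sample to 16 bits with TPDF dither.

    Triangular noise of roughly one 16-bit step is added before the
    bottom 8 bits are discarded, so the rounding error is spread out
    as broadband noise instead of correlated distortion.
    """
    lsb = 1 << 8                                  # one 16-bit step, in 24-bit units
    noise = random.uniform(-lsb, 0) + random.uniform(0, lsb)  # triangular PDF
    q = round(sample_24 + noise) >> 8             # round, then drop the lowest 8 bits
    return max(-32768, min(32767, q))             # clamp to the 16-bit range

samples_24 = [0, 1000, -250000, 8388607]          # 24-bit range: -8388608..8388607
samples_16 = [dither_to_16bit(s) for s in samples_24]
print(all(-32768 <= s <= 32767 for s in samples_16))  # True
```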
There is also aliasing, which occurs when the audio signal contains frequencies above the Nyquist frequency, that is, above half the sample rate.
For example, a 32,000 Hz sample rate cannot correctly capture frequency components above 16,000 Hz (the Nyquist frequency for that rate).
Any such components will cause audible aliasing when the music is reproduced by a digital-to-analog converter (DAC).
To prevent this, an anti-aliasing filter is used to remove these components prior to sampling.
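Where an unfiltered out-of-band tone ends up can be computed directly; this small sketch (the function name is illustrative) shows the folding:

```python
def alias_frequency(f_hz: float, sample_rate_hz: float) -> float:
    """Frequency heard after sampling a pure tone with no anti-aliasing
    filter: the spectrum folds around multiples of the sample rate,
    landing back below the Nyquist frequency (sample_rate / 2)."""
    f = f_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

# A 20,000 Hz component sampled at 32,000 Hz (Nyquist = 16,000 Hz)
# folds down into the clearly audible range:
print(alias_frequency(20_000, 32_000))  # 12000.0

# A tone already below Nyquist is unaffected:
print(alias_frequency(12_000, 44_100))  # 12000.0
```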
And last but not least, digital capturing is a process where timing accuracy is extremely important.
This timing is handled by a word clock signal, which acts like a conductor that provides a periodic timing signal to all the parts of a digital audio system in order to have each process triggered at a precise moment.
When samples are taken 44,100 times every second, there will be small deviations from the ideal timing, no matter how high-end your gear is.
This timing error is called jitter, and it creates minute but audible deficiencies in sound quality.
Check out this video for more information:
Audio Bit Depth and Sample Rate
Audio bit depth and sample rate determine how much bandwidth (or data per second) is required to transmit the file and also how much storage space it will reserve when stored in digital form.
By looking at the figures below, we can easily adapt this information to real-life situations, such as streaming audio content from the internet or exporting a mix to a storage device.
| Settings | Number of tracks (mono) | Bitrate | Session size per minute | Session size per hour |
|---|---|---|---|---|
| 16-bit, 44.1 kHz | 24 | 16.93 Mbit/s | 127.01 MB | 7620.48 MB |
| 16-bit, 48 kHz | 24 | 18.43 Mbit/s | 138.24 MB | 8294.4 MB |
| 16-bit, 96 kHz | 24 | 36.86 Mbit/s | 276.48 MB | 16588.8 MB |
| 24-bit, 44.1 kHz | 24 | 25.40 Mbit/s | 190.51 MB | 11430.72 MB |
| 24-bit, 48 kHz | 24 | 27.65 Mbit/s | 207.36 MB | 12441.6 MB |
| 24-bit, 96 kHz | 24 | 55.30 Mbit/s | 414.72 MB | 24883.2 MB |
Bear in mind that these numbers describe a full 24-track session, not a single stereo file.
Remember that 3G/4G internet connection on your mobile?
While it can be relatively fast, it will most likely still transfer data at under 100 Mbit/s, and more realistically somewhere between 10-40 Mbit/s.
This means that transferring an uncompressed CD-quality (44.1 kHz, 16-bit) stereo audio stream over a mobile internet connection would already take a noticeable share of the available bandwidth.
And then there is the necessity of buffering.
Compressing audio content for streaming suddenly makes a lot of sense, doesn’t it? But the file generated during 1 minute of recording with these settings is still small by today’s standards.
Using a high sound quality recording device and 96 kilohertz 24-bit audio quality would result in a file size that is over three times bigger.
This happens, because the speed of capture is faster and also because the converter is capturing a larger dynamic range. 192-kilohertz setting would of course grow the files even further.
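That “over three times bigger” claim is easy to verify, since uncompressed PCM data scales linearly with both sample rate and bit depth:

```python
# Bits per second for one mono channel = sample rate * bit depth.
cd_quality = 44_100 * 16   # 705,600 bit/s
hi_res     = 96_000 * 24   # 2,304,000 bit/s
print(round(hi_res / cd_quality, 2))  # 3.27
```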
Mono and Stereo
Picture a typical live tracking session where you have 24 mono tracks armed and recording simultaneously.
When we look at recording a live band or a similar tracking situation, we can clearly see how the session size grows and why you need to allocate more space.
Remembering that a typical recording session includes multiple takes and possible further processing (which will consume even more space on your storage device), it’s not difficult to predict at least a minimum session size.
The tracks in the example are mono, but the data translates directly into other channel configurations (2 mono channels = 1 stereo track, and so on).
Need to know the exact file size for your selected settings? You can use this audio file size calculator: https://toolstud.io/video/audiosize.php
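If you prefer to do the arithmetic yourself, the figures in the table above can be reproduced with a few lines of Python (the function name is hypothetical):

```python
def session_size_mb(sample_rate_hz: int, bit_depth: int,
                    tracks: int, seconds: float) -> float:
    """Uncompressed PCM size in megabytes (1 MB = 10**6 bytes)."""
    bits = sample_rate_hz * bit_depth * tracks * seconds
    return bits / 8 / 1_000_000

# 24 mono tracks, one minute of recording:
print(round(session_size_mb(44_100, 16, 24, 60), 2))  # 127.01
print(round(session_size_mb(48_000, 24, 24, 60), 2))  # 207.36
```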
In today’s world of fast computers and seemingly unlimited disk space, it is wise to ask: what is the ideal quality to use for your recordings and your mixing process?
This question doesn’t have a single right or wrong answer, though.
Firstly, we should point out the biggest limitation, one that might still come as a surprise to some: your computer or recording device needs a really fast storage device to manage the high bandwidth of recording or playing back a multitrack project at a 24-bit, 96 kHz setting.
Just think about 20 stereo tracks at 24-bit/96 kHz, which adds up to roughly 92 Mbit/s of bandwidth, and compare that with a typical 7200 RPM computer hard drive, whose sustained write/read speeds will usually fall between 80-160 Mbit/s.
It’s no big surprise that writing 20 stereo channels could prove impossible.
The storage interface (SATA, USB 2/3, microSD, etc.) also makes a big difference.
Remember: what the audio converter chip essentially creates is a stream of data that must be written to the storage medium in almost real time!
The trick professionals use is to split the load onto separate physical drives.
This makes a lot of sense as computers typically need to handle reading and writing operations of the operating system too.
Portable recording devices are not usually prone to this, since they typically handle a much smaller number of audio tracks compared to DAW computers.
Also, their operating system and storage design are usually quite different.
Audio files can hog a lot of storage space.
Recording various takes, bringing in multichannel data and exporting mixes will always consume space.
Sample libraries and sample-based virtual instruments usually require a lot of space, and investing in a separate high-speed drive for the sound library makes a lot of sense.
Also, while the file size grows quickly at higher sound quality settings, the perceived quality might not feel worth the extra bandwidth and space requirements.
Always remember to look at these settings in the context of what you are trying to accomplish and what is an acceptable quality to work with.
Downfalls of High Quality Mixing
A simple, failsafe approach here would be to always record at higher quality, mix using a high quality setting, and then convert the final mix to whatever quality is necessary. But even that is not always practical.
Every time you are using a high-quality setting in your DAW you will also generate a higher CPU load to your computer.
Mixing a simple radio jingle using a 32-bit float 192 kHz engine would seem quite overkill for obvious reasons.
Take Advantage of Floating Point
Floating-point recording and mixing are great features to have, as they potentially give you more freedom through that added resolution.
But these features are not available everywhere: they are common in digital audio workstations but rare in portable recorders, unless you’re willing to part with a serious lump of cash.
Nowadays, it seems many hobbyist and semi-professional producers out there are quite happy going with so-called ‘budget’ solutions.
While that is perfectly fine (especially if you are not doing any recording from external sources), there are still obvious reasons why anyone even remotely serious about sound should consider investing into good quality hardware.
Still, even the simplest DAC chip integrated to your modern computer’s motherboard would prove sufficient for many.
When you export a mix (create a file) from a modern DAW, the file is written directly to the storage medium and doesn’t need to travel through a DAC.
Akai MPC 60
Why did the Akai MPC60’s 12 bit sound turn out so popular?
Well, Akai used 16 bit ADC/DAC converter chips on board, but the software was capable of writing the sampled material in a special non-linear 12-bit audio format.
The maximum sample rate was 40 kHz, a competent figure for its time.
The Burr-Brown PCM54HP (DAC) and PCM77P (ADC) converter chips used also provided a certain character (together with other less significant design features) that many later MPC revisions or other hardware samplers couldn’t quite match.
You could quite reasonably claim that the sound character it produced was a perfect match for styles of music that benefited from that extra ‘grit’.
April 19, 2022 – updated links, minor content edits