96 kHz: From Theory to the Final Master
96. Two digits surfacing in almost every serious conversation about professional audio, dividing opinion right down the middle. Marketing hype or genuine advancement? Real-world benefit or a well-polished sales pitch?
Physics, algorithms and hands-on experience all point to answers far more nuanced and interesting than the usual debate suggests.
96 kHz is a decision touching every stage of the workflow: analogue-to-digital conversion, plugin behaviour, the way transients are captured, and how your master is ultimately delivered to hi-res streaming platforms.
What 96 kHz actually changes in day-to-day practice is exactly what we'll explore here, with physics as our guide and real-world experience as our reference.
Nyquist, Anti-Aliasing and Converters
The Nyquist-Shannon Theorem
To capture an audio signal accurately, the sample rate must be more than twice the highest frequency present. This is the Nyquist-Shannon theorem, and everything else follows from it.
This principle defines the Nyquist frequency, the upper limit of what the system can reproduce cleanly:
44,100 Hz → Nyquist: 22,050 Hz
96,000 Hz → Nyquist: 48,000 Hz
Any frequency exceeding this limit doesn't simply vanish. It folds back into the audible spectrum as parasitic, inharmonic artefacts. That's aliasing.
The Anti-Aliasing Filter
To prevent this fold-back, a low-pass filter is applied before conversion, tasked with removing anything approaching the Nyquist frequency. This is the anti-aliasing filter.
At 44,100 Hz, that filter must cut very steeply just above 20 kHz, right at the edge of the audible spectrum. This sharp roll-off introduces phase distortion and a subtle colouration in the high frequencies.
At 96,000 Hz, there's 28,000 Hz of room between the 20 kHz edge of the audible band and the Nyquist frequency. The filter can roll off gradually, with a gentle slope, leaving the audible spectrum entirely untouched.
44.1 kHz → steep filter at ~22 kHz → possible colouration from 18–20 kHz
96 kHz → gentle filter at ~48 kHz → audible spectrum fully preserved
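The difference in filter steepness can be put into numbers. The sketch below is a minimal illustration, assuming a 20 kHz upper limit for the audible band; the `AUDIBLE_LIMIT_HZ` constant and the `transition_band` helper are names chosen here for the example, not part of any converter's design. It measures how much room the filter has between the passband edge and the Nyquist frequency, in hertz and in octaves.

```python
import math

AUDIBLE_LIMIT_HZ = 20_000  # assumption: upper edge of the audible band


def transition_band(sample_rate_hz, passband_edge_hz=AUDIBLE_LIMIT_HZ):
    """Return the room between the passband edge and Nyquist (Hz, octaves)."""
    nyquist = sample_rate_hz / 2
    width_hz = nyquist - passband_edge_hz
    width_octaves = math.log2(nyquist / passband_edge_hz)
    return width_hz, width_octaves


for rate in (44_100, 96_000):
    width_hz, width_oct = transition_band(rate)
    print(f"{rate} Hz: {width_hz:.0f} Hz of room ({width_oct:.2f} octaves)")
```

Under these assumptions the filter at 44.1 kHz has about a seventh of an octave to do its work, against well over an octave at 96 kHz.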
Delta-Sigma Converters and Noise Shaping
Modern converters, such as the Prism Sound ADA-8XR, the Lavry Engineering SAVITR, the Antelope Pure 2, and the Metric Halo ULN-8 mkIV, almost universally employ a Delta-Sigma (ΔΣ) architecture. They operate internally at very high frequencies, sometimes 64x or 128x the target sample rate, before stepping back down. This is internal oversampling.
Then there's noise shaping: quantisation inherently generates noise. Noise shaping pushes it up into the higher frequencies, well clear of the sensitive audible range.
At 96 kHz, the converter has twice the spectral space to distribute that noise. Noise shaping can work far more aggressively, leaving the audible spectrum cleaner and the noise floor lower on quiet signals.
This is precisely where confusion between sample rate and dynamic range tends to arise. The effect is real but indirect: it's the converter design working in combination with sample rate that produces this result, not sample rate alone.
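To make the idea of noise shaping concrete, here is a deliberately simplified sketch: a first-order error-feedback quantiser written in plain Python. It is an assumption-laden toy, not how any of the converters named above are actually built (real Delta-Sigma modulators are higher order and run at the oversampled rate). Each sample's quantisation error is subtracted from the next input, which moves the error energy away from low frequencies and towards high ones.

```python
def quantise_plain(samples):
    """Round every sample to the nearest integer step, with no feedback."""
    return [round(x) for x in samples]


def quantise_noise_shaped(samples):
    """First-order error feedback: output = input + (e[n] - e[n-1])."""
    out = []
    error = 0.0
    for x in samples:
        target = x - error      # subtract the previous quantisation error
        q = round(target)
        error = q - target      # error made on this sample
        out.append(q)
    return out


# A constant level of 0.3 steps, far below one quantisation step.
signal = [0.3] * 1000
plain = quantise_plain(signal)
shaped = quantise_noise_shaped(signal)
print(sum(plain) / len(plain))    # average level after plain rounding
print(sum(shaped) / len(shaped))  # average level after noise shaping
```

Plain rounding loses the 0.3 level entirely, since every sample rounds to 0. The shaped version toggles between 0 and 1 so that its average stays very close to 0.3: the low-frequency content survives, and the error ends up as fast, high-frequency fluctuation.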
Clipping, Harmonics and Aliasing
In practice, the difference becomes immediately apparent. Clipping an AD converter at 44.1 kHz is simply not the same experience as at 96 kHz. At 96 kHz, you can push the converter harder before clipping generates audible artefacts. That is not a matter of perception, but of physics.
When a signal clips, it generates harmonics, multiples of the fundamental frequency. That's the very nature of distortion:
Fundamental : 10 kHz
Harmonics : 20 kHz, 30 kHz, 40 kHz, 50 kHz...
At 44.1 kHz, those harmonics exceed the Nyquist frequency quickly, folding back into the audible spectrum as inharmonic, dissonant artefacts:
30 kHz → alias at ~14 kHz
40 kHz → alias at ~4 kHz
50 kHz → alias at ~5.9 kHz
Artefacts appear quickly, making the saturation sound inharmonic and aggressive to the ear.
At 96 kHz, those same harmonics have far more spectral room before hitting Nyquist, and can be contained or removed by the anti-aliasing filter well above the audible range:
30 kHz → stays at 30 kHz
40 kHz → stays at 40 kHz
50 kHz → alias at ~46 kHz
Push the converter harder, and saturation stays clean, transparent, musical.
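The fold-back arithmetic used in the two lists above can be checked with a few lines of Python. This is a minimal sketch of ideal sampling with no anti-aliasing filter in place; `alias_frequency` is a helper name introduced here for illustration.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears after ideal sampling (no filter)."""
    folded = freq_hz % sample_rate_hz
    # Anything above Nyquist mirrors back around half the sample rate.
    return min(folded, sample_rate_hz - folded)


harmonics_hz = [20_000, 30_000, 40_000, 50_000]  # harmonics of a 10 kHz tone

for rate in (44_100, 96_000):
    print(f"Sample rate {rate} Hz")
    for harmonic in harmonics_hz:
        print(f"  {harmonic} Hz appears at {alias_frequency(harmonic, rate)} Hz")
```

At 44.1 kHz the 30, 40 and 50 kHz harmonics land at 14,100 Hz, 4,100 Hz and 5,900 Hz, none of which is harmonically related to the 10 kHz fundamental. At 96 kHz the only harmonic in this set that folds is 50 kHz, and it lands at 46,000 Hz, far above the audible range.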
This is also exactly what drives oversampling in saturation and limiting plugins: developers artificially replicate this behaviour by running their algorithms at a higher internal frequency. Working natively at 96 kHz delivers this advantage directly at the converter level.
Microphones, Preamps and Signal Chain Transparency
Microphones and preamps operate in the analogue domain. They have no awareness of session sample rate. Yet 96 kHz captures every nuance of their character far more accurately, and that's where the conversation gets genuinely interesting.
Microphone Bandwidth
Some high-quality microphones extend well beyond 20 kHz:
Neumann U87 → captures up to ~20 kHz
Schoeps MK4 → captures up to ~40 kHz
DPA 4006 → captures up to ~40 kHz
At 44.1 kHz, everything above roughly 20 kHz is removed during conversion by the internal decimation filter. At 96 kHz, content up to nearly 48 kHz is preserved.
Transient Time Resolution
This is arguably the most tangible and immediately audible argument. A snare drum, an acoustic guitar, a percussive consonant in a vocal, all generate ultra-fast transients whose energy extends high up the frequency spectrum.
At 96 kHz, temporal precision is roughly double that of 44.1 kHz:
44.1 kHz → 1 sample = ~22.7 µs
96 kHz → 1 sample = ~10.4 µs
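The sample spacing is simply the reciprocal of the sample rate. A short check in Python, where `sample_period_us` is a helper name introduced for this example:

```python
def sample_period_us(sample_rate_hz):
    """Time between two consecutive samples, in microseconds."""
    return 1_000_000 / sample_rate_hz


for rate in (44_100, 96_000):
    print(f"{rate} Hz -> {sample_period_us(rate):.1f} µs per sample")

# Ratio between the two spacings.
print(f"ratio: {sample_period_us(44_100) / sample_period_us(96_000):.2f}")
```

The ratio between the two periods is 96,000 / 44,100, or about 2.18.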
That precision translates directly into improved definition, punch, and air on acoustic sources. What many engineers perceive intuitively on first listen finds its physical explanation right here.
The Converter as the Revealing Link
Working at 96 kHz with a high-quality converter means the true character of every microphone and preamp in the chain arrives in the digital domain without compromise. Every nuance, every subtlety, every detail of the analogue signal is captured with maximum precision.
It's worth stressing: the converter remains the critical link in this chain. An exceptional microphone and an exceptional preamp deserve a converter that can do them justice. A well-chosen converter brings the entire chain together: coherent, consistent, uncompromised.
Plugins, Oversampling and Linear Phase EQ
Aliasing in Plugins
Plugins that generate harmonic distortion (saturators, non-linear compressors, limiters) produce harmonics governed by the same physics as an AD converter. Here though, the process takes place entirely within the digital domain, inside the plugin itself.
Oversampling
The solution developers reached for is oversampling: the plugin runs its processing internally at 2x, 4x, 8x, sometimes 16x the session sample rate, then steps back down to the native rate. Unwanted harmonics are generated within a spectral space wide enough to be cleanly filtered before returning to the session spectrum.
It's the same principle at work in Delta-Sigma converters, replicated in software and carrying a CPU overhead to match.
Working natively at 96 kHz reduces the problem at source. A plugin running at 96 kHz without oversampling enabled will often perform better than the same plugin running at 44.1 kHz with 2x oversampling. For plugins that support oversampling, enabling it at 96 kHz pushes artefacts even further from the audible range.
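One way to compare these configurations is to look at the Nyquist frequency of the plugin's internal processing. The sketch below only measures spectral room; it says nothing about the quality of a given plugin's up- and down-sampling filters, which also matters. `internal_nyquist_hz` is a hypothetical helper written for this comparison.

```python
def internal_nyquist_hz(session_rate_hz, oversampling_factor=1):
    """Nyquist frequency of the rate at which the plugin actually processes."""
    return session_rate_hz * oversampling_factor / 2


configurations = [
    ("44.1 kHz, no oversampling", 44_100, 1),
    ("44.1 kHz, 2x oversampling", 44_100, 2),
    ("96 kHz, no oversampling", 96_000, 1),
    ("96 kHz, 4x oversampling", 96_000, 4),
]

for label, rate, factor in configurations:
    print(f"{label}: internal Nyquist = {internal_nyquist_hz(rate, factor):.0f} Hz")
```

A 44.1 kHz session with 2x oversampling processes with a 44,100 Hz internal Nyquist, slightly below the 48,000 Hz available natively at 96 kHz, which is consistent with the comparison made above.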
Linear Phase EQ
Linear phase equalisation is particularly sensitive to sample rate. A linear phase EQ processes the signal symmetrically in time, generating pre-ringing: a faint ringing that appears just ahead of each transient. Its duration is directly influenced by temporal precision.
At 44.1 kHz, it runs longer and more perceptibly, smearing transient attacks and reducing mix clarity.
At 96 kHz, finer temporal precision shortens it considerably:
44.1 kHz → longer pre-ringing → less precise attacks
96 kHz → shorter pre-ringing → attacks preserved
The difference is particularly noticeable on percussive sources and in the low frequencies, where linear phase EQ sees the most use in mastering. Curve precision improves too: at 96 kHz, the EQ has twice as many calculation points across the spectrum, yielding smoother, more natural corrections.
Sample Rate Conversion: A Critical Step
96 kHz is a session reality, but it doesn't always match delivery specifications. Streaming platforms, which now account for the vast majority of distribution, accept files ranging from 44.1 kHz to 96 kHz; the CD format is limited to 44.1 kHz by the Red Book standard, and most broadcast platforms work at 48 kHz. SRC (Sample Rate Conversion) is therefore an unavoidable step in most workflows.
The 96 → 44.1 kHz Ratio
Converting from 96 kHz to 44.1 kHz is one of the most complex conversions in common use. The ratio is rational, but far from an integer:
96,000 / 44,100 = 320 / 147 = 2.176870...
To compute it accurately, the algorithm must work across a long cycle: the input and output sample grids only line up again every 320 input samples (147 output samples), and every output sample in between has to be interpolated. That is why algorithm quality is absolutely critical on this particular ratio.
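The exact relationship between the two rates can be worked out with Python's standard `fractions` module. This is a small sketch of the arithmetic only, not a resampler; the variable names are chosen here for illustration.

```python
from fractions import Fraction
from math import gcd

SOURCE_RATE = 96_000
TARGET_RATE = 44_100

# Output samples produced per input sample, reduced to lowest terms.
ratio = Fraction(TARGET_RATE, SOURCE_RATE)
up, down = ratio.numerator, ratio.denominator

# Lowest rate that both sample grids divide into evenly.
common_rate = SOURCE_RATE * TARGET_RATE // gcd(SOURCE_RATE, TARGET_RATE)

print(f"upsample by {up}, then downsample by {down}")
print(f"common intermediate rate: {common_rate} Hz")
```

The reduced fraction is 147/320, so a classic rational resampler would conceptually upsample by 147 and downsample by 320, passing through a common rate of 14,112,000 Hz. Practical converters avoid computing that intermediate signal explicitly, but the size of these factors shows why the interpolation filter has so much work to do.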
A poor algorithm on this ratio introduces artefacts that are subtle but audible: lost definition in the highs, and sometimes a colouration of the low end.
The 96 → 48 kHz Ratio
Converting from 96 kHz to 48 kHz, by contrast, rests on a perfectly integer ratio:
96,000 / 48,000 = 2.000000
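The algorithm low-pass filters the signal below the new 24 kHz Nyquist frequency, then keeps every other sample. No fractional interpolation is needed, which makes this the simplest and cleanest conversion in common use.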
For broadcast or 48 kHz deliverables, working natively at 96 kHz offers a clear advantage: sample rate conversion with zero compromise.
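A 2:1 conversion still needs a low-pass filter before samples are discarded, otherwise anything between 24 kHz and 48 kHz folds back into the 48 kHz file. The following sketch shows the idea in plain Python with a Hann-windowed sinc filter. The tap count, cutoff and function names are illustrative assumptions; production converters use longer, carefully optimised filters.

```python
import math


def lowpass_taps(num_taps=63, cutoff=0.25):
    """Windowed-sinc low-pass FIR. cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hann
        taps.append(sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC


def decimate_by_2(samples, taps):
    """Filter, then keep every other sample (only kept samples are computed)."""
    out = []
    for i in range(0, len(samples), 2):
        acc = 0.0
        for j, tap in enumerate(taps):
            if i - j >= 0:
                acc += tap * samples[i - j]
        out.append(acc)
    return out


# A 30 kHz tone in a 96 kHz stream: inaudible, but above the new Nyquist.
rate = 96_000
tone = [math.sin(2 * math.pi * 30_000 * n / rate) for n in range(960)]

naive = tone[::2]                              # dropping samples only
clean = decimate_by_2(tone, lowpass_taps())    # filter, then drop

print(max(abs(v) for v in naive[50:]))   # level of the 18 kHz alias
print(max(abs(v) for v in clean[50:]))   # level after proper filtering
```

With naive sample dropping, the 30 kHz tone reappears at full level as an 18 kHz alias in the 48 kHz output. With the filter in place it is strongly attenuated, while content in the audible band passes through essentially unchanged.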
Operation Order
A frequently overlooked but fundamental point: in any mastering chain, SRC must always come before dithering, never after.
Dithering is a low-level noise added to the signal to mask quantisation artefacts during bit depth reduction. Running SRC after dithering causes the algorithm to treat dither noise as audio signal, redistributing it unpredictably across the spectrum. The result: dithering that is ineffective and potentially damaging.
Correct order : SRC → Dithering → final file
Incorrect order : Dithering → SRC → final file
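The ordering can be expressed as a tiny pipeline. The sketch below is an illustration only: it assumes floating-point samples in the range -1.0 to 1.0, uses plain TPDF (triangular) dither without noise shaping, and takes the sample rate converter as a caller-supplied function, since SRC itself is out of scope here. All function names are hypothetical.

```python
import random


def tpdf_dither_to_16_bit(samples, rng=None):
    """Add triangular dither of +/-1 LSB, then round to 16-bit integers."""
    rng = rng or random.Random(0)   # fixed seed so the sketch is repeatable
    scale = 2 ** 15                 # 16-bit full scale
    out = []
    for x in samples:
        noise = rng.random() - rng.random()   # triangular distribution
        value = round(x * scale + noise)
        out.append(max(-scale, min(scale - 1, value)))  # clamp to 16-bit range
    return out


def render_master(samples, resample):
    """Correct order: sample rate conversion first, dither and rounding last."""
    converted = resample(samples)          # still high-resolution floats
    return tpdf_dither_to_16_bit(converted)


# Usage with a placeholder converter that leaves the audio untouched.
quiet = [0.25 / 2 ** 15] * 20_000          # a level of one quarter of an LSB
master = render_master(quiet, resample=lambda s: s)
print(sum(master) / len(master))           # average level, in LSBs
```

Because the dither is applied at the very last step, a level of a quarter of one 16-bit step still survives on average in the output instead of being rounded away. Reversing the order would hand already-rounded, noise-laden samples to the converter, whose high-precision output would then need dithering a second time.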
Hi-Res Streaming, FLAC and ALAC
The Hi-Res Streaming Landscape
Hi-res streaming platforms have expanded considerably in recent years. Tidal, Qobuz, Apple Music, and Amazon Music HD now offer content at 24-bit / 96 kHz, and in some cases 24-bit / 192 kHz for select catalogues. A development that directly affects sound engineers and fundamentally changes the conversation around master delivery.
A master delivered at 96 kHz can now reach the listener at its native resolution, with no degradation and no sample rate conversion. End-to-end continuity that simply didn't exist before.
How FLAC and ALAC Work
FLAC (Free Lossless Audio Codec) and ALAC (Apple Lossless Audio Codec) are both lossless audio compression formats. Unlike MP3 or AAC, they sacrifice no audio information. The decompressed file is absolutely identical to the source, sample for sample.
Both rely on two complementary mechanisms. The first is linear predictive coding (LPC): the algorithm analyses recurring patterns in the signal, predicting subsequent samples from previous ones. Only the difference between prediction and reality is stored, significantly reducing the amount of data to encode.
The second is Rice coding: residuals are compressed using an algorithm optimised for the value distributions typical of audio. The result: file sizes reduced by 40 to 60% compared to the original WAV, with no loss of information whatsoever.
WAV 24-bit / 96 kHz, stereo → ~173 MB for 5 minutes
FLAC 24-bit / 96 kHz, stereo → ~70–105 MB for 5 minutes
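The two mechanisms described above can be illustrated with a toy codec. This is a heavily simplified sketch and not the FLAC or ALAC bitstream: it assumes a fixed second-order predictor, a single Rice parameter `k` of at least 1, and a bit string instead of packed bytes. The real formats choose predictors and parameters per block and add framing and checksums. All function names are introduced here for the example.

```python
import math


def residuals(samples):
    """Predict x[n] as 2*x[n-1] - x[n-2] and keep only the prediction error."""
    res = list(samples[:2])  # first two samples are stored as-is
    for n in range(2, len(samples)):
        res.append(samples[n] - (2 * samples[n - 1] - samples[n - 2]))
    return res


def restore(res):
    """Invert residuals(): rebuild the samples from the prediction errors."""
    out = list(res[:2])
    for n in range(2, len(res)):
        out.append(res[n] + 2 * out[n - 1] - out[n - 2])
    return out


def rice_encode(values, k):
    """Rice-code signed integers with parameter k >= 1; returns a bit string."""
    chunks = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1          # map signed to unsigned
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        chunks.append("1" * quotient + "0" + format(remainder, f"0{k}b"))
    return "".join(chunks)


def rice_decode(bits, k, count):
    """Decode `count` Rice-coded integers from a bit string."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":                      # unary-coded quotient
            quotient += 1
            pos += 1
        pos += 1                                     # skip the terminating 0
        remainder = int(bits[pos:pos + k], 2)
        pos += k
        u = (quotient << k) | remainder
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
    return values


# A smooth test signal: its prediction errors are small, so they code compactly.
samples = [round(1000 * math.sin(2 * math.pi * n / 64)) for n in range(256)]
res = residuals(samples)
bits = rice_encode(res, k=4)
decoded = restore(rice_decode(bits, k=4, count=len(res)))
print(decoded == samples, len(bits), "bits vs", 16 * len(samples), "bits raw")
```

The round trip returns the original samples exactly, which is the lossless property. The saving comes from the fact that the residuals are much smaller than the samples themselves, so each one needs only a handful of bits.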
From Session to Listener
Perhaps the strongest argument in favour of 96 kHz today. The complete chain now exists, end to end:
96 kHz Session → 24-bit / 96 kHz Master → FLAC / ALAC → Hi-res Platform → Listener's DAC
A listener with a quality DAC and a decent listening system receives exactly what the engineer captured and shaped in the studio. Every decision made at the source, from microphone to preamp, converter to sample rate, carries through entirely to the final listening experience.
A genuine opportunity for sound engineers: to deliver work without compromise, from the recording to the listener's playback system.
Julien Courtois