Published on 08/06/2026

96 kHz: From Theory to the Final Master

96. Two digits surfacing in almost every serious conversation about professional audio, dividing opinion right down the middle. Marketing hype or genuine advancement? Real-world benefit or a well-polished sales pitch?

Physics, algorithms and hands-on experience all point to answers far more nuanced and interesting than the usual debate suggests.

96 kHz is a decision touching every stage of the workflow: analogue-to-digital conversion, plugin behaviour, the way transients are captured, and how your master is ultimately delivered to hi-res streaming platforms.

What 96 kHz actually changes in day-to-day practice is exactly what we'll explore here, with physics as our guide and real-world experience as our reference.

Nyquist, Anti-Aliasing and Converters

The Nyquist-Shannon Theorem

To capture an audio signal accurately, the sample rate must be more than twice the highest frequency present. This is the Nyquist-Shannon theorem, and everything else follows from it.

This principle defines the Nyquist frequency, the upper limit of what the system can reproduce cleanly:

44,100 Hz → Nyquist: 22,050 Hz
96,000 Hz → Nyquist: 48,000 Hz

Any frequency exceeding this limit doesn't simply vanish. It folds back into the audible spectrum as parasitic, inharmonic artefacts. That's aliasing.

The Anti-Aliasing Filter

To prevent this fold-back, a low-pass filter is applied before conversion, tasked with removing anything approaching the Nyquist frequency. This is the anti-aliasing filter.

At 44,100 Hz, that filter must cut very steeply just above 20 kHz, right at the edge of the audible spectrum. Depending on the filter design, this sharp roll-off can introduce phase distortion or ringing, and a subtle colouration in the high frequencies.

At 96,000 Hz, there's 28,000 Hz of headroom between the 20 kHz edge of the audible spectrum and the Nyquist frequency. The filter can roll off gradually, with a gentle slope, leaving the audible spectrum entirely untouched.

44.1 kHz → steep filter at ~22 kHz → possible colouration from 18–20 kHz
96 kHz → gentle filter at ~48 kHz → audible spectrum fully preserved
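
To put a number on that difference, the room available to the filter can be expressed in octaves between the top of the audible range and the Nyquist frequency. A minimal sketch, assuming a 20 kHz audible limit; the function name is purely illustrative.

```python
import math

AUDIBLE_LIMIT_HZ = 20_000  # assumption: upper edge of the audible range


def transition_band_octaves(sample_rate_hz, passband_edge_hz=AUDIBLE_LIMIT_HZ):
    """Width, in octaves, between the passband edge and the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2
    return math.log2(nyquist_hz / passband_edge_hz)


for rate in (44_100, 96_000):
    width = transition_band_octaves(rate)
    print(f"{rate} Hz: {width:.2f} octaves for the filter to roll off")
```

That works out to roughly 0.14 octaves at 44.1 kHz against about 1.26 octaves at 96 kHz, which is why the slope can be so much gentler.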

Delta-Sigma Converters and Noise Shaping

Modern converters, such as the Prism Sound ADA-8XR, the Lavry Engineering SAVITR, the Antelope Pure 2, and the Metric Halo ULN-8 mkIV, almost universally employ a Delta-Sigma (ΔΣ) architecture. They operate internally at very high frequencies, sometimes 64x or 128x the target sample rate, before stepping back down. This is internal oversampling.

Then there's noise shaping: quantisation inherently generates noise. Noise shaping pushes it up into the higher frequencies, well clear of the sensitive audible range.
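
The principle can be sketched with a first-order error-feedback quantiser. This is a deliberately simplified illustration, not the modulator used in any of the converters named above: the quantiser simply rounds to whole numbers, and the function name is hypothetical.

```python
def first_order_noise_shaper(samples):
    """Quantise to integers while feeding each quantisation error into the next sample."""
    shaped = []
    previous_error = 0.0
    for sample in samples:
        target = sample - previous_error   # compensate for the last error
        output = round(target)             # coarse quantiser: nearest integer
        previous_error = output - target   # error made on this sample
        shaped.append(output)
    return shaped


constant_input = [0.3] * 1000
output = first_order_noise_shaper(constant_input)
print(sum(output) / len(output))  # long-term average tracks the input level
```

Each output error is the difference between two consecutive quantisation errors, so the errors cancel out over time (low frequencies) and concentrate in sample-to-sample jitter (high frequencies).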

At 96 kHz, the converter has twice the spectral space to distribute that noise. Noise shaping can work far more aggressively, leaving the audible spectrum cleaner and the noise floor lower on quiet signals.

This is precisely where confusion between sample rate and dynamic range tends to arise. The effect is real but indirect: it's the converter design working in combination with sample rate that produces this result, not sample rate alone.

Clipping, Harmonics and Aliasing

In practice, the difference becomes immediately apparent. Clipping an AD converter at 44.1 kHz is simply not the same experience as at 96 kHz. At 96 kHz, you can push the converter harder before clipping generates audible artefacts. That's not a matter of perception, but of physics.

When a signal clips, it generates harmonics, multiples of the fundamental frequency. That's the very nature of distortion:

Fundamental : 10 kHz
Harmonics : 20 kHz, 30 kHz, 40 kHz, 50 kHz...

At 44.1 kHz, those harmonics exceed the Nyquist frequency quickly, folding back into the audible spectrum as inharmonic, dissonant artefacts:

30 kHz → alias at ~14 kHz
40 kHz → alias at ~4 kHz
50 kHz → alias at ~5.9 kHz

Artefacts appear quickly, making the saturation sound inharmonic and aggressive to the ear.

At 96 kHz, those same harmonics have far more spectral room before hitting the Nyquist frequency. They either stay where they are or fold back well above the audible range, where the anti-aliasing filter can contain or eliminate them:

30 kHz → stays at 30 kHz
40 kHz → stays at 40 kHz
50 kHz → alias at ~46 kHz
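
The fold-back figures for both sample rates can be reproduced with a few lines of arithmetic. A minimal sketch, assuming an idealised sampler with no anti-aliasing filter; the function name is illustrative.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone appears once sampled without any filtering."""
    folded = freq_hz % sample_rate_hz
    nyquist_hz = sample_rate_hz / 2
    # Anything above Nyquist mirrors back around it.
    return sample_rate_hz - folded if folded > nyquist_hz else folded


fundamental_hz = 10_000
for rate in (44_100, 96_000):
    for order in range(2, 6):
        harmonic = order * fundamental_hz
        print(f"{rate} Hz: {harmonic} Hz lands at {alias_frequency(harmonic, rate)} Hz")
```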

Push the converter harder, and saturation stays clean, transparent, musical.

This is also exactly what drives oversampling in saturation and limiting plugins: developers artificially replicate this behaviour by running their algorithms at a higher internal frequency. Working natively at 96 kHz delivers this advantage directly at the converter level.

Microphones, Preamps and Signal Chain Transparency

Microphones and preamps operate in the analogue domain. They have no awareness of session sample rate. Yet 96 kHz captures every nuance of their character far more accurately, and that's where the conversation gets genuinely interesting.

Microphone Bandwidth

Some high-quality microphones extend well beyond 20 kHz:

Neumann U87 → captures up to ~20 kHz
Schoeps MK4 → captures up to ~40 kHz
DPA 4006 → captures up to ~40 kHz

At 44.1 kHz, everything above 20 kHz is eliminated during conversion by the internal decimation filter. At 96 kHz, it is preserved in full.

Transient Time Resolution

This is arguably the most tangible and immediately audible argument. A snare drum, an acoustic guitar, a percussive consonant in a vocal, all generate ultra-fast transients whose energy extends high up the frequency spectrum.

At 96 kHz, temporal precision is roughly double that of 44.1 kHz (a factor of about 2.18):

44.1 kHz → 1 sample = ~22.7 µs
96 kHz → 1 sample = ~10.4 µs
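
The sample period is simply the reciprocal of the sample rate. A short sketch of that arithmetic, with an illustrative function name:

```python
def sample_period_us(sample_rate_hz):
    """Duration of one sample, in microseconds."""
    return 1_000_000 / sample_rate_hz


for rate in (44_100, 96_000):
    print(f"{rate} Hz: {sample_period_us(rate):.2f} µs per sample")
```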

That precision translates directly into improved definition, punch, and air on acoustic sources. What many engineers perceive intuitively on first listen finds its physical explanation right here.

The Converter as the Revealing Link

Working at 96 kHz with a high-quality converter means the true character of every microphone and preamp in the chain arrives in the digital domain without compromise. Every nuance, every subtlety, every detail of the analogue signal is captured with maximum precision.

It's worth stressing: the converter remains the critical link in this chain. An exceptional microphone and an exceptional preamp deserve a converter that can do them justice. A well-chosen converter brings the entire chain together: coherent, consistent, uncompromised.

Plugins, Oversampling and Linear Phase EQ

Aliasing in Plugins

Plugins that generate harmonic distortion (saturators, non-linear compressors, limiters) produce harmonics governed by the same physics as an AD converter. Here though, the process takes place entirely within the digital domain, inside the plugin itself.

Oversampling

The solution developers reached for is oversampling: the plugin runs its processing internally at 2x, 4x, 8x, sometimes 16x the session sample rate, then steps back down to the native rate. Unwanted harmonics are generated within a spectral space wide enough to be cleanly filtered before returning to the session spectrum.

It's the same principle at work in Delta-Sigma converters, replicated in software and carrying a CPU overhead to match.

Working natively at 96 kHz reduces the problem at source. A plugin running at 96 kHz without oversampling enabled will often perform better than the same plugin running at 44.1 kHz with 2x oversampling. For plugins that support oversampling, enabling it at 96 kHz pushes artefacts even further from the audible range.
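
One way to compare these scenarios is to ask which harmonic of a given tone is the first to fold back below 20 kHz at each internal processing rate. The sketch below is a simplification under stated assumptions: a 10 kHz fundamental, a 20 kHz audible limit, and no filtering inside the plugin. Both function names are hypothetical.

```python
def alias_frequency(freq_hz, rate_hz):
    """Frequency at which a tone appears when processed at the given rate."""
    folded = freq_hz % rate_hz
    return rate_hz - folded if folded > rate_hz / 2 else folded


def first_audible_alias_order(fundamental_hz, internal_rate_hz,
                              audible_limit_hz=20_000, max_order=64):
    """Lowest harmonic order that exceeds Nyquist and folds back below the audible limit."""
    for order in range(2, max_order + 1):
        harmonic = order * fundamental_hz
        aliased = harmonic > internal_rate_hz / 2
        if aliased and alias_frequency(harmonic, internal_rate_hz) < audible_limit_hz:
            return order
    return None  # nothing audible within max_order harmonics


scenarios = {
    "44.1 kHz, no oversampling": 44_100,
    "44.1 kHz, 2x oversampling": 88_200,
    "96 kHz, no oversampling": 96_000,
}
for label, rate in scenarios.items():
    print(label, "-> harmonic", first_audible_alias_order(10_000, rate))
```

For this tone the first audible alias comes from the 3rd harmonic at 44.1 kHz, the 7th at 44.1 kHz with 2x oversampling, and the 8th at native 96 kHz. Higher harmonics are generally weaker, so the further up that first offender sits, the quieter the artefact.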

Linear Phase EQ

Linear phase equalisation is particularly sensitive to sample rate. A linear phase EQ processes the signal symmetrically in time, generating pre-ringing: an audible impulse just ahead of each transient. Its duration is directly influenced by temporal precision.

At 44.1 kHz, it runs longer and more perceptibly, smearing transient attacks and reducing mix clarity.

At 96 kHz, finer temporal precision shortens it considerably:

44.1 kHz → longer pre-ringing → less precise attacks
96 kHz → shorter pre-ringing → attacks preserved

The difference is particularly noticeable on percussive sources and in the low frequencies, where linear phase EQ sees the most use in mastering. Curve precision improves too: at 96 kHz, the EQ has twice as many calculation points across the spectrum, yielding smoother, more natural corrections.

Sample Rate Conversion: A Critical Step

96 kHz is a session reality, but it doesn't always match delivery specifications. Streaming platforms, which now account for the vast majority of distribution, accept files ranging from 44.1 kHz to 96 kHz; the CD format is limited to 44.1 kHz by the Red Book standard, and most broadcast platforms work at 48 kHz. SRC (Sample Rate Conversion) is therefore an unavoidable step in most workflows.

The 96 → 44.1 kHz Ratio

Converting from 96 kHz to 44.1 kHz is one of the most demanding conversions in common use. The ratio is rational, but far from an integer:

96,000 / 44,100 = 320 / 147 = 2.176870...

The input and output sample grids only realign once every 320 input samples (147 output samples). Within that cycle, every output sample falls at a different fractional position between input samples and has to be interpolated through a long, carefully designed low-pass filter. The longer and more accurate that filter, the higher the quality of the output, which is why algorithm quality is absolutely critical on this particular ratio.
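
The exact ratio is easy to check with Python's standard `fractions` module. A small sketch; the helper name is illustrative.

```python
from fractions import Fraction


def resampling_ratio(source_rate_hz, target_rate_hz):
    """Exact source/target ratio, reduced to lowest terms."""
    return Fraction(source_rate_hz, target_rate_hz)


ratio = resampling_ratio(96_000, 44_100)
print(ratio)         # 320/147
print(float(ratio))  # decimal value of the same ratio
```

Read as a fraction, 320 input samples span exactly the same time as 147 output samples, which is the length of one full conversion cycle.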

A poor algorithm on this ratio introduces artefacts that are subtle but audible: lost definition in the highs, and sometimes a colouration of the low end.

The 96 → 48 kHz Ratio

Converting from 96 kHz to 48 kHz, by contrast, rests on a perfectly integer ratio:

96,000 / 48,000 = 2.000000

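The algorithm low-pass filters the signal below the new Nyquist frequency of 24 kHz, then keeps every other sample. No fractional interpolation is needed, since every output sample lines up exactly with an input sample. It's the simplest conversion possible, and the easiest to get right.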
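
The sketch below shows why a low-pass filter still matters when halving the sample rate: a 30 kHz tone in a 96 kHz file survives naive sample dropping as an 18 kHz alias, but is removed when the signal is filtered first. It is an illustration in plain Python, assuming a Hann-windowed sinc filter with 101 taps and a 22 kHz cutoff; real converters use more refined designs, and all function names here are hypothetical.

```python
import math


def windowed_sinc_lowpass(cutoff, num_taps=101):
    """Hann-windowed sinc low-pass; cutoff is a fraction of the sample rate."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC


def fir_filter(signal, taps):
    """Direct convolution, keeping only fully overlapped output samples."""
    count = len(taps)
    return [sum(taps[k] * signal[i + count - 1 - k] for k in range(count))
            for i in range(len(signal) - count + 1)]


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


rate_hz = 96_000
tone = [math.sin(2 * math.pi * 30_000 * n / rate_hz) for n in range(2000)]

naive = tone[::2]  # no filter: the tone folds to 18 kHz
filtered = fir_filter(tone, windowed_sinc_lowpass(22_000 / rate_hz))[::2]

print(f"naive RMS:    {rms(naive):.4f}")
print(f"filtered RMS: {rms(filtered):.6f}")
```

The naive version keeps the tone at full level, now at the wrong frequency, while the filtered version reduces it to a tiny residue.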

For broadcast or 48 kHz deliverables, working natively at 96 kHz offers a clear advantage: sample rate conversion with minimal compromise.

Operation Order

A frequently overlooked but fundamental point: in any mastering chain, SRC must always come before dithering, never after.

Dither is low-level noise added to the signal to mask quantisation artefacts during bit depth reduction. Running SRC after dithering causes the algorithm to treat dither noise as audio signal, redistributing it unpredictably across the spectrum. The result: dithering that is ineffective and potentially damaging.

Correct order : SRC → Dithering → final file
Incorrect order : Dithering → SRC → final file
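
As an illustration of the final stage in the correct order, here is a minimal sketch of bit depth reduction with dither. It assumes TPDF (triangular) dither and a 16-bit target, neither of which is specified above, and the function name is hypothetical. Its input is meant to be the output of the SRC stage.

```python
import random


def quantise_with_tpdf_dither(samples, bits=16, seed=0):
    """Reduce float samples in [-1.0, 1.0] to signed integers of the given bit depth."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    full_scale = 2 ** (bits - 1)
    quantised = []
    for sample in samples:
        # The difference of two uniform values gives triangular noise within +/-1 LSB.
        dither = rng.random() - rng.random()
        value = round(sample * full_scale + dither)
        # Clamp to the legal integer range of the target bit depth.
        quantised.append(max(-full_scale, min(full_scale - 1, value)))
    return quantised


resampled = [0.0, 0.25, -0.25, 0.5]  # stand-in for the output of the SRC stage
print(quantise_with_tpdf_dither(resampled))
```

Nothing should touch the sample values after this point: any further processing would treat the dither as signal.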

Hi-Res Streaming, FLAC and ALAC

The Hi-Res Streaming Landscape

Hi-res streaming platforms have expanded considerably in recent years. Tidal, Qobuz, Apple Music, and Amazon Music HD now offer content at 24-bit / 96 kHz, and in some cases 24-bit / 192 kHz for select catalogues. A development that directly affects sound engineers and fundamentally changes the conversation around master delivery.

A master delivered at 96 kHz can now reach the listener at its native resolution, with no degradation and no sample rate conversion. End-to-end continuity that simply didn't exist before.

How FLAC and ALAC Work

FLAC (Free Lossless Audio Codec) and ALAC (Apple Lossless Audio Codec) are both lossless audio compression formats. Unlike MP3 or AAC, they sacrifice no audio information. The decompressed file is absolutely identical to the source, sample for sample.

Both rely on two complementary mechanisms. The first is linear predictive coding (LPC): the algorithm analyses recurring patterns in the signal, predicting subsequent samples from previous ones. Only the difference between prediction and reality is stored, significantly reducing the amount of data to encode.
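
The idea of storing only the prediction error can be sketched with the simplest kind of predictor. The example assumes a fixed second-order predictor (each sample is guessed by extending the line through the two before it) and a 440 Hz test tone; real encoders choose and tune their predictors per block, and the function names are hypothetical.

```python
import math


def fixed_order2_residual(samples):
    """Keep the first two samples verbatim, then store only prediction errors."""
    residual = list(samples[:2])
    for n in range(2, len(samples)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        residual.append(samples[n] - prediction)
    return residual


def reconstruct(residual):
    """Rebuild the original samples exactly from the residual."""
    samples = list(residual[:2])
    for n in range(2, len(residual)):
        prediction = 2 * samples[n - 1] - samples[n - 2]
        samples.append(residual[n] + prediction)
    return samples


signal = [round(10_000 * math.sin(2 * math.pi * 440 * n / 96_000)) for n in range(200)]
residual = fixed_order2_residual(signal)

print("largest sample:  ", max(abs(s) for s in signal))
print("largest residual:", max(abs(r) for r in residual[2:]))
print("lossless:", reconstruct(residual) == signal)
```

The residual values are far smaller than the samples themselves, so they need far fewer bits, yet the original is rebuilt exactly.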

The second is Rice coding: residuals are compressed using an algorithm optimised for the value distributions typical of audio. The result: file sizes reduced by 40 to 60% compared to the original WAV, with no loss of information whatsoever.
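
A Rice code spends few bits on small values and more on large ones. The sketch below shows the principle only: the bit layout is illustrative and is not claimed to match the actual FLAC or ALAC bitstreams, and the function names are hypothetical.

```python
def zigzag(value):
    """Map signed integers to unsigned: 0, -1, 1, -2, 2 become 0, 1, 2, 3, 4."""
    return 2 * value if value >= 0 else -2 * value - 1


def rice_encode(value, k):
    """Rice code of a signed integer, returned as a string of '0' and '1'."""
    unsigned = zigzag(value)
    quotient = unsigned >> k               # sent in unary: that many ones, then a zero
    remainder = unsigned & ((1 << k) - 1)  # sent as k plain binary digits
    remainder_bits = format(remainder, "b").zfill(k) if k else ""
    return "1" * quotient + "0" + remainder_bits


def rice_decode(bits, k):
    """Inverse of rice_encode for a single code word."""
    quotient = bits.index("0")  # number of leading ones
    remainder = int(bits[quotient + 1:quotient + 1 + k], 2) if k else 0
    unsigned = (quotient << k) | remainder
    return unsigned // 2 if unsigned % 2 == 0 else -(unsigned + 1) // 2


for value in (0, -3, 7):
    code = rice_encode(value, k=2)
    print(value, "->", code, "->", rice_decode(code, k=2))
```

With k = 2, a residual of 0 costs three bits instead of the 24 a raw sample would take, which is where most of the saving comes from.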

WAV 24-bit / 96 kHz → ~86 MB per channel for 5 minutes (~173 MB in stereo)
FLAC 24-bit / 96 kHz → ~35–52 MB per channel for 5 minutes (~70–105 MB in stereo)
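
The uncompressed size follows from simple arithmetic: sample rate × bytes per sample × channels × duration. A quick sketch that ignores the small file header; the function name is illustrative, and the FLAC estimate simply applies the 40 to 60% reduction quoted above.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of raw PCM audio data, header excluded."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


five_minutes = 5 * 60
for channels in (1, 2):
    wav_mb = pcm_size_bytes(96_000, 24, channels, five_minutes) / 1_000_000
    flac_low, flac_high = wav_mb * 0.4, wav_mb * 0.6
    print(f"{channels} ch: WAV {wav_mb:.1f} MB, FLAC roughly {flac_low:.0f} to {flac_high:.0f} MB")
```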

From Session to Listener

Perhaps the strongest argument in favour of 96 kHz today. The complete chain now exists, end to end:

96 kHz Session → 24-bit / 96 kHz Master → FLAC / ALAC → Hi-res Platform → Listener's DAC

A listener with a quality DAC and a decent listening system receives exactly what the engineer captured and shaped in the studio. Every decision made at the source, from microphone to preamp, converter to sample rate, carries through entirely to the final listening experience.

A genuine opportunity for sound engineers: to deliver work without compromise, from the recording to the listener's playback system.

Julien Courtois
