Thanks for posting. Here is a summary someone wrote on the article:
“Timing is of the essence.
Digitization produces a mirror image of the signal's spectrum, an inverted sideband, above the Nyquist point.
This 'negative' image must be filtered out of the reconstructed analog signal, even though it lies well beyond the range of human hearing.
A sampling rate of 44.1 kHz limits the highest frequency that can be recorded to half the sampling rate, i.e. 22.05 kHz.
The ultrasonic sidebands must therefore be filtered out within the narrow gap between the top of the audio band (about 20 kHz) and that point.
Such a steep 'brickwall' filter introduces aberrations into the signal. These are termed ringing, or echoes: oscillations near the cutoff frequency that appear before and after a transient.
Natural echoes occur after a signal. However, when the signal is converted back from digital to analog by the digital-to-analog converter (DAC), these ringing echoes also occur symmetrically before the moment the signal pulse happens. This colors the reconstructed sound.
This phenomenon can be mitigated by sampling at higher frequencies, which allows a much gentler, 'better sounding' filter for the ultrasonic frequency components.
Thus timing is everything, in the Universe and also audio.
Takeaway: higher sampling rates sound better because they shift the Nyquist point far above the bandwidth of human hearing, so a filter with a gentler slope can replace the brickwall filter and achieve the same result with fewer (or less perceptible) audible ringing artifacts.
Excerpt:
In natural sound, echoes always occur after a sound—never before. This pre-echo is therefore unnatural; and while a continuous waveform will be reconstructed correctly, it is possible that the pre-echo might well be heard as a degradation with a discontinuous waveform, such as musical transients (see later).”
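For anyone who wants to check the numbers in that summary, here is a rough sketch in plain Python. It is not from the article. The 20 kHz top of the audio band, the 101-tap length, the 21 kHz cutoff and the Hann window are all assumptions I picked for illustration, and the function names are made up. It prints the gap the filter has to work in at a few sampling rates, then builds a simple symmetric (linear-phase) low-pass filter and shows that its impulse response has ringing before the main peak as well as after it.

```python
import math

def nyquist(fs_hz):
    """Highest frequency that can be represented at sampling rate fs_hz."""
    return fs_hz / 2.0

def transition_band(fs_hz, audio_top_hz=20_000.0):
    """Gap between the top of the audio band (assumed 20 kHz) and Nyquist."""
    return nyquist(fs_hz) - audio_top_hz

def windowed_sinc(num_taps, cutoff_hz, fs_hz):
    """Impulse response of a linear-phase low-pass FIR (sinc times Hann window).

    An illustrative textbook design, not the filter of any particular DAC.
    """
    mid = (num_taps - 1) / 2.0
    fc = cutoff_hz / fs_hz  # cutoff as a fraction of the sampling rate
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2.0 * fc
        else:
            ideal = math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

# Room available for the filter slope at common sampling rates.
for fs in (44_100, 96_000, 192_000):
    print(fs, "Hz -> Nyquist", nyquist(fs), "Hz, gap", transition_band(fs), "Hz")

# Ringing on both sides of the main peak (the centre tap, index 50).
taps = windowed_sinc(101, 21_000.0, 44_100.0)
for offset in (-3, -2, -1, 0, 1, 2, 3):
    print(offset, round(taps[50 + offset], 5))
```

At 44.1 kHz the gap comes out at 2050 Hz, against 28000 Hz at 96 kHz, which is why the filter can be so much gentler at the higher rate. The taps at negative offsets mirror the ones at positive offsets, which is the "echo before the sound" the excerpt describes.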