Noise reduction is typically the first step in the audio cleanup process. At issue is how well the software can distinguish between speech and non-speech elements. To give an idea of the challenge involved, the consonant and sibilant portions of speech are essentially noise themselves, interspersed between vowel sounds. iZotope's RX Advanced series, the industry standard in audio repair software, offers a number of noise reduction modules, each tailored to specific tasks.
A relatively new advancement in the iZotope audio repair series is 'spectral recovery,' which makes it possible to restore high- and low-frequency bands lost in the transmission of speech over poor internet connections. This is especially useful when working with dialogue sent over conferencing applications like Zoom.
Plotting the audio in what is called spectrogram format is a unique and innovative way of viewing an audio file. Here, time is plotted along the X axis and frequency along the Y axis, while amplitude is indicated by differences in color and intensity, much the way an image of the Mandelbrot set is colored. This manner of representing the audio allows the technician to visually identify what to preserve and what can be removed, further aiding in the suppression of obtrusive noises. It is of particular value in certain post-production applications where an otherwise perfect on-location take was ruined by, for example, a police siren or a dog barking. It is now possible to identify the unwanted element visually and simply delete it, leaving only pristine audio to resync to the video and/or deliver back to the client.
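To make the spectrogram layout concrete, here is a minimal sketch in Python with NumPy. It is not how RX builds its display; the function name `spectrogram_db`, the frame and hop sizes, and the synthetic test tone are all assumptions chosen for illustration. Each column of the result is one moment in time, each row is one frequency, and the stored level is the value a display would map to color.

```python
import numpy as np

def spectrogram_db(samples, sample_rate, frame_len=1024, hop=256):
    """Return (times, freqs, levels_db) for a mono signal.

    levels_db[i, j] is the level in dB at freqs[i] and times[j].
    frame_len and hop are illustrative defaults, not values from any product.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(samples) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        samples[k * hop : k * hop + frame_len] * window
        for k in range(n_frames)
    ])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))   # shape (n_frames, n_bins)
    # Small offset avoids log(0); transpose so rows are frequencies (Y axis).
    levels_db = 20.0 * np.log10(magnitudes + 1e-12).T
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, levels_db

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

times, freqs, levels_db = spectrogram_db(tone, sample_rate)
peak_bin = int(np.argmax(levels_db[:, 0]))   # loudest frequency in the first frame
print(f"loudest frequency in first frame: {freqs[peak_bin]:.1f} Hz")
```

In a real recording, a siren or a dog bark would show up as a bright region confined to particular rows and columns of `levels_db`, which is what lets a technician single it out by eye.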
After completing whatever noise reduction and repair can be done on the audio file, a comparison is made to the original recording to confirm that the steps taken have actually improved the audibility of the speech content. The importance of referring back to the original cannot be overemphasized, as the application of noise reduction, if not done carefully, can introduce unwanted artifacts into the voice portion of the audio, actually worsening intelligibility. Once the technician is satisfied with the work up to this point, the next stop is enhancement.
For the enhancement part of the process, at Clarion Labs we generally move the file into Avid Pro Tools software, the long-standing industry standard for digital audio recording and editing. Here, more classical techniques are employed: frequencies above and below the voice spectrum may be filtered out, provided there is noticeable improvement in intelligibility. Often reverberation or room ambience, which tend to obscure clear speech, can be suppressed. Finally, equalization targeting the voice spectrum is applied in conjunction with compression to boost low-level or quiet passages. Pro Tools offers an almost unlimited range of plugin choices and parameter adjustments for enhancing the speech component of the audio.
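The first of these techniques, filtering out frequencies above and below the voice spectrum, can be sketched outside of Pro Tools with a simple windowed-sinc band-pass filter in Python with NumPy. This is an illustration only, not the plugin chain used in practice. The helper names `bandpass_fir` and `apply_filter`, the 100 Hz and 4000 Hz band edges, the filter length, and the synthetic test signal are all assumptions; real band edges are chosen by ear for each recording.

```python
import numpy as np

def bandpass_fir(low_hz, high_hz, sample_rate, num_taps=1025):
    """Design a linear-phase windowed-sinc band-pass filter (hypothetical helper)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2

    def lowpass(cutoff_hz):
        fc = cutoff_hz / sample_rate          # cutoff in cycles per sample
        return 2 * fc * np.sinc(2 * fc * n)   # ideal low-pass impulse response

    # Band-pass = low-pass at the upper edge minus low-pass at the lower edge.
    taps = lowpass(high_hz) - lowpass(low_hz)
    return taps * np.hamming(num_taps)        # window to tame ripple

def apply_filter(samples, taps):
    """Filter the signal, keeping the output the same length as the input."""
    return np.convolve(samples, taps, mode="same")

# Synthetic stand-in for a recording: low hum, a mid-band tone, and a high tone.
sample_rate = 16000
t = np.arange(2 * sample_rate) / sample_rate
hum = 0.5 * np.sin(2 * np.pi * 50 * t)          # below the assumed voice band
voice_like = np.sin(2 * np.pi * 1000 * t)       # inside the assumed voice band
hiss_like = 0.3 * np.sin(2 * np.pi * 7000 * t)  # above the assumed voice band

taps = bandpass_fir(100, 4000, sample_rate)     # assumed band edges
filtered = apply_filter(hum + voice_like + hiss_like, taps)
```

Away from the first and last half second or so, where the filter runs off the ends of the signal, `filtered` keeps the 1 kHz component almost unchanged while the 50 Hz and 7 kHz components are strongly attenuated. As the text notes, such filtering is only worth keeping if it audibly improves intelligibility.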
Audio cleanup and enhancement is not a 'paint-by-numbers' affair. The incorrect or excessive application of processing at any step can actually worsen the fidelity of the voice portion of the recording. It takes the discerning ear of a trained audio professional to carefully apply the various processes in the right order and in the proper manner to coax the recording along toward maximum intelligibility.