AI audio - Present Communications

The old truth has always been true and (probably) always will be: if you want great audio, get the microphone as close to the source as possible. Not only that, but always record in a space that has as little noise as possible. In this particular post, I’m talking about recording the voice.

Noise is a distraction and it’s annoying. It also makes speech less intelligible.

Over the last few years, especially since lockdown, we’ve seen a rise in automated audio processing in platforms like Teams, Zoom and Webex, and they all work a little differently.

First, let’s talk about compression. This is the process whereby the loudest audio is made quieter. That means all ‘speech’ is similar in level, so there is less dynamic range. If that audio is then normalised, that is, increased to the ‘standard’ level, then all speech is similar in volume and as loud as all other speech.
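As a rough illustration of compression followed by normalisation, here is a minimal Python sketch. It works sample by sample, which is a simplification (real compressors follow a smoothed level with attack and release times), and the function names, the 0.5 threshold and the 4:1 ratio are all assumptions made up for the example.

```python
def compress(samples, threshold=0.5, ratio=4.0):
    """Turn down only the part of each sample that exceeds the threshold."""
    out = []
    for s in samples:
        level = abs(s)
        if level > threshold:
            # The overshoot above the threshold is divided by the ratio.
            level = threshold + (level - threshold) / ratio
        out.append(level if s >= 0 else -level)
    return out


def normalise(samples, target_peak=1.0):
    """Scale everything so the loudest sample sits at target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silence stays silence
    gain = target_peak / peak
    return [s * gain for s in samples]


# Two quiet samples and two loud ones (hypothetical values).
quiet_and_loud = [0.1, -0.2, 0.9, -1.0]
processed = normalise(compress(quiet_and_loud))
```

After processing, the loudest sample is still at full level, but the quietest one has come up from 0.1 to about 0.16, so the gap between quiet and loud is smaller.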

But what happens if there isn’t only speech in the audio? What happens if there is noise? If the level of the speech is increased, then so is the noise, and we end up with noisy audio.

Introducing the noise gate. This kind of does what it says on the tin. The gate is set to a threshold: anything quieter than the threshold (the noise) is muted, and anything louder (the voice) is let through.
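A noise gate can be sketched in a few lines of Python. This version measures the level of short frames of audio and mutes any frame that falls below the threshold. The frame size, the threshold value and the function names are assumptions for illustration, and a real gate would also fade in and out rather than switching abruptly.

```python
import math


def rms(frame):
    """Root-mean-square level of a frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def noise_gate(samples, threshold=0.05, frame_size=4):
    """Mute every frame whose level is below the threshold."""
    out = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        if rms(frame) < threshold:
            out.extend([0.0] * len(frame))  # gate closed: silence
        else:
            out.extend(frame)               # gate open: pass through
    return out


# Hypothetical values: one frame of quiet hiss, then one frame of speech.
hiss = [0.01, -0.02, 0.01, -0.01]
speech = [0.3, -0.4, 0.5, -0.2]
gated = noise_gate(hiss + speech)
```

Here the hiss frame comes out as silence and the speech frame passes through untouched.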

Problem solved…

But how do we know what the threshold should be? Automated systems will look for a constant level, the ‘noise floor’, and will set the threshold around there. Now… we play in music which is so heavily compressed that the platform thinks it’s noise (it often is). And now the processor mutes EVERYTHING.
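To see how that can go wrong, here is a small Python sketch of one possible way to pick the threshold automatically. It assumes the noise floor is the median frame level and puts the threshold a little above it. That heuristic, the margin of 1.5 and the level values are all invented for the example, and the platforms mentioned above don't necessarily work this way.

```python
import statistics


def auto_threshold(frame_levels, margin=1.5):
    """Guess the noise floor as the median level, then sit just above it."""
    noise_floor = statistics.median(frame_levels)
    return noise_floor * margin


def gate_decisions(frame_levels, margin=1.5):
    """True where the gate would open, False where it would mute."""
    threshold = auto_threshold(frame_levels, margin)
    return [level >= threshold for level in frame_levels]


# Hypothetical frame levels: mostly quiet room noise with two bursts of speech.
speech_with_pauses = [0.02, 0.02, 0.40, 0.02, 0.35, 0.02, 0.02]
# Heavily compressed music: almost exactly the same level all the time.
compressed_music = [0.50, 0.51, 0.50, 0.49, 0.50, 0.50, 0.51]

speech_open = gate_decisions(speech_with_pauses)
music_open = gate_decisions(compressed_music)
```

With the speech example the gate opens only for the two loud frames. With the compressed music, the constant level is mistaken for the noise floor, the threshold lands above all of it, and every frame is muted.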

Where are we going with this? AI.

New AI technology actually listens to the audio and detects the voices. More than that, it listens to the words, so it knows what is speech and what isn’t. That way, it can eliminate real noise but keep speech, even where they’re at a similar level. In fact, it can then isolate the speech and manipulate it to make it even more intelligible.

We could take it one step further and change the voice, from male to female or vice versa, for example. But let’s leave that one for another post…