Google Duo is Using Machine Learning to Improve Call Quality

Alvin Wanjala

4 years ago

Google wants to improve audio quality on Duo using machine learning technology from DeepMind, its artificial intelligence division. Audio jitter is a near-universal problem: if you’ve made a call over the internet, you know how badly it can affect a conversation, sometimes to the point where you can’t make out anything at all.

Data from a sender has to be broken down into packets to travel over the internet, and these packets must be reassembled on the other end so the receiver gets a continuous stream of video or audio.

Sometimes packets are lost, and that’s where packet loss concealment (PLC) comes in: it fills in the missing audio packets. PLC is not perfect and can fail at times due to excessive jitter or temporary network glitches, says Google.
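To make the idea concrete, here is a minimal sketch of reassembly followed by a very naive form of concealment, in which each lost frame is replaced by a faded copy of the last frame that was played. This is not how WaveNetEQ or Duo works; the function names, the four-sample frame size and the fade factor are all assumptions chosen purely for illustration.

```python
from typing import Dict, List, Optional

FRAME_SAMPLES = 4  # assumed toy frame size; real audio frames hold far more samples


def reassemble(packets: Dict[int, List[float]], total: int) -> List[Optional[List[float]]]:
    """Order received packets by sequence number; None marks a lost packet."""
    return [packets.get(seq) for seq in range(total)]


def conceal(frames: List[Optional[List[float]]], fade: float = 0.5) -> List[List[float]]:
    """Naive PLC: fill each gap with a faded copy of the previously played frame."""
    output: List[List[float]] = []
    last = [0.0] * FRAME_SAMPLES  # play silence if the very first packet is lost
    for frame in frames:
        if frame is None:
            frame = [sample * fade for sample in last]
        output.append(frame)
        last = frame  # consecutive losses keep fading toward silence
    return output


# Packets arrive out of order, and packets 1 and 3 never arrive.
received = {2: [0.2] * FRAME_SAMPLES, 0: [0.8] * FRAME_SAMPLES}
frames = reassemble(received, total=4)
audio = conceal(frames)
```

Repeating and fading the previous frame is cheap, but it tends to sound robotic once the gap grows beyond a few frames, which is the kind of shortcoming a generative approach aims to address.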

That is where Google’s new WaveNetEQ PLC system comes in, which is touted to improve call quality. WaveNetEQ is based on technology from Google’s DeepMind division and uses machine learning to replace lost audio packets with artificially generated sound that resembles natural human speech.

“The WaveNetEQ model is fast enough to run on a phone, while still providing state-of-the-art audio quality and more natural sounding PLC than other systems currently in use.”

To keep the model reliable when calls are made in noisy places, Google augmented the training data by mixing it with a variety of background noises.

Currently, “99% of all Google Duo calls need to deal with packet losses, excessive jitter or network delays” – that’s almost all the calls made via Duo.

Among these, “20% lose more than 3% of the total audio duration due to network issues, and 10% of calls lose more than 8%.”
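To put those percentages in perspective, the short calculation below converts them into seconds of missing audio. The 10-minute call length is an assumed example, not a figure from Google.

```python
def lost_seconds(call_minutes: float, loss_percent: float) -> float:
    """Seconds of audio lost for a call of the given length and loss percentage."""
    return call_minutes * 60 * loss_percent / 100


# Assumed example: a 10-minute call at the two loss levels Google quotes.
for pct in (3, 8):
    print(f"{pct}% of a 10-minute call is {lost_seconds(10, pct):.0f} seconds of audio")
```

On a call of that length, the two thresholds work out to 18 and 48 seconds of lost audio respectively.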

To train WaveNetEQ, the company says it used over 100 speakers in 48 different languages, so the model is well versed in general human speech.