Today Google announced the open source beta of Lyra, an audio codec that uses machine learning to enable high-quality voice calls. Lyra's code and a demo are currently available on GitHub; the codec can compress raw audio down to 3 kilobits per second while remaining comparable in quality to mainstream codecs.
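To put the 3 kilobits per second figure in perspective, a back-of-the-envelope calculation (not from the announcement, just arithmetic on the stated bitrate) shows how little data a minute of Lyra-encoded voice would consume:

```python
# Back-of-the-envelope: data used by one minute of voice at Lyra's
# stated bitrate of 3 kilobits per second.
BITRATE_BPS = 3_000          # 3 kilobits per second
seconds = 60

bits = BITRATE_BPS * seconds # total bits for one minute
kilobytes = bits / 8 / 1000  # convert bits -> bytes -> kilobytes

print(kilobytes)             # 22.5 KB per minute of voice
```

At that rate, an hour-long call fits in well under 1.5 MB of audio payload, which is why the codec is attractive for congested or low-bandwidth networks.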
Mobile connectivity has grown steadily over the past decade, but the explosive growth of on-device computing power has outpaced reliable, fast Internet access. Even in areas with dependable connections, the rise of "anytime, anywhere" work and telecommuting is straining data capacity. According to BroadbandNow, nearly 90 of the top 200 U.S. cities experienced broadband slowdowns during the COVID-19 pandemic because of bandwidth constraints.
Google believes Lyra could have a wide range of applications, from archiving large amounts of voice, to saving battery, to easing network congestion in emergencies. Lyra's architecture is divided into two parts: an encoder and a decoder. When someone speaks into their phone, the encoder captures distinctive attributes of their speech, called features. Lyra extracts these features in 40-millisecond units, compresses them, and sends them over the network. The decoder's job is to convert those features back into an audio waveform that can be played back on the listener's phone.
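The encode/transmit/decode flow described above can be sketched as a toy pipeline. This is purely illustrative and is not Lyra's actual API or algorithm: the 16 kHz sample rate is an assumption, and the mean-amplitude "feature" is a stand-in for the learned speech features Lyra really extracts.

```python
# Toy sketch of a frame-based codec flow: chop audio into 40 ms frames,
# "compress" each frame into a tiny payload, then reconstruct frames on
# the receiving side. Illustrative only -- not Lyra's real implementation.

SAMPLE_RATE = 16_000                             # assumed sample rate
FRAME_MS = 40                                    # Lyra's stated frame unit
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000   # 640 samples per frame

def encode(frame):
    # Stand-in "feature": the frame's mean amplitude (real codecs extract
    # far richer speech features and quantize them).
    return sum(frame) / len(frame)

def decode(feature):
    # Stand-in decoder: rebuild a flat frame from the single feature
    # (Lyra's decoder instead regenerates a realistic waveform).
    return [feature] * FRAME_SAMPLES

def codec_roundtrip(audio):
    frames = [audio[i:i + FRAME_SAMPLES]
              for i in range(0, len(audio), FRAME_SAMPLES)]
    payloads = [encode(f) for f in frames]   # what would cross the network
    return [s for p in payloads for s in decode(p)]

# One second of audio round-trips through 25 frames of 40 ms each.
out = codec_roundtrip([0.0] * SAMPLE_RATE)
print(len(out))  # 16000 samples: 25 frames * 640 samples
```

The key point the sketch captures is the division of labor: only the compact per-frame payloads travel over the network, and the decoder alone is responsible for turning them back into playable audio.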
According to Google, Lyra's architecture is similar to that of traditional audio codecs, which form the backbone of Internet communications. But while those traditional codecs are built on digital signal processing techniques, Lyra's key advantage comes from its decoder's ability to reconstruct a high-quality signal from the compressed features.
In a blog post, Google Chrome engineers Andrew Storus and Michael Chinen wrote: "We are excited to see the creativity the open source community is known for applied to Lyra in order to come up with even more unique and impactful applications. We [want] to be able to get feedback as quickly as possible."