Microsoft Using AI For Noise Suppression in Teams

Written by Matt Milano
  • Microsoft is working on using artificial intelligence (AI) to improve the sound quality of meetings in Teams.

    Microsoft Teams has been experiencing significant growth, both before and during the pandemic, as it takes on its chief rival Slack. As millions of people shelter in place and work from home, chat and videoconferencing software has become their lifeline to the outside world for work, socializing, family time and more.

    Unfortunately, one of the biggest irritations with videoconferencing is often the background noise—the cat meowing, dog barking, child playing or significant other watching TV. Now Microsoft is planning on using AI and machine learning to tackle the problem.

    As Robert Aichner, Microsoft Teams group program manager, told VentureBeat, the challenge lies in the difference between suppressing stationary and non-stationary noise. Stationary noise is constant, such as a computer’s fan. As such, it is relatively easy to suppress, and Microsoft’s products, such as Teams and Skype, already do that. The harder problem is non-stationary noise, such as a dog barking, a car horn blowing, or someone else in the room suddenly making noise.

    “That is not stationary,” Aichner explained. “You cannot estimate that in speech pauses. What machine learning now allows you to do is to create this big training set, with a lot of representative noises.”
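
    To make that distinction concrete, here is a minimal, hypothetical sketch of the kind of classical spectral-subtraction approach that handles stationary noise: the noise spectrum is estimated during speech pauses and subtracted from every frame. The function and parameter names are illustrative assumptions, not Microsoft's implementation; a sudden, non-stationary sound never appears in the pause estimate, which is exactly why it slips through.

        import numpy as np

        def suppress_stationary_noise(noisy_frames, pause_frames):
            # noisy_frames, pause_frames: 2-D arrays of STFT magnitudes (frames x frequency bins).
            # Average the spectrum observed during speech pauses to estimate the constant noise floor.
            noise_estimate = pause_frames.mean(axis=0)
            # Subtract that floor from every frame and clamp any negative magnitudes left over.
            return np.maximum(noisy_frames - noise_estimate, 0.0)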

    This is where machine learning comes in: training the system on good and bad data examples helps it better understand what needs to be filtered out.

    “We train a model to understand the difference between noise and speech, and then the model is trying to just keep the speech,” Aichner continued. “We have training data sets. We took thousands of diverse speakers and more than 100 noise types. And then what we do is we mix the clean speech without noise with the noise. So we simulate a microphone signal. And then you also give the model the clean speech as the ground truth. So you’re asking the model, ‘From this noisy data, please extract this clean signal, and this is how it should look like.’ That’s how you train neural networks [in] supervised learning, where you basically have some ground truth.”
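
    In code terms, the data pipeline Aichner describes might look something like the hypothetical sketch below: a clean speech clip and a noise clip are mixed at a chosen signal-to-noise ratio to simulate a microphone signal, while the untouched clean clip is kept as the ground-truth target the model is trained to recover. The function name and SNR parameter are illustrative assumptions, not details from Microsoft.

        import numpy as np

        def make_training_pair(clean_speech, noise, snr_db=5.0):
            # clean_speech, noise: 1-D float arrays of equal length (waveform samples).
            speech_power = np.mean(clean_speech ** 2)
            noise_power = np.mean(noise ** 2) + 1e-12
            # Scale the noise so the mixture lands at the requested signal-to-noise ratio.
            scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
            noisy_input = clean_speech + scale * noise   # simulated microphone signal
            target = clean_speech                        # ground truth the model must recover
            return noisy_input, target

    Repeating this pairing step across thousands of speakers and more than 100 noise types would yield a training set like the one the article describes.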

    The in-depth report at VentureBeat is a fascinating read, and shows what is possible as companies continue to use AI and machine learning across applications.
