“I envision a future where everything will be captioned, so the more than 300 million people who are deaf or hard of hearing like me will be able to enjoy videos like everyone else,” said Liat Kaver, a YouTube Product Manager focusing on captions and accessibility. “When I was growing up in Costa Rica, there were no closed captions in my first language, and only English movies had Spanish subtitles. I felt I was missing out because I often had to guess at what was happening on the screen or make up my own version of the story in my head. That was where the dream of a system that could just automatically generate high quality captions for any video was born.”
As YouTube grew, so did the number of videos with captions, which now stands at over 1 billion. Kaver says that more than 15 million videos are watched each day with captions enabled.
“One of the ways that we were able to scale the availability of captions was by combining Google’s automatic speech recognition (ASR) technology with the YouTube caption system to offer automatic captions for videos,” says Kaver. “There were limitations with the technology that underscored the need to improve the captions themselves. Results were sometimes less than perfect, prompting some creators to have a little fun at our expense!”
Kaver says that one of her team’s major goals has been to improve automatic caption accuracy through advances in speech recognition and machine learning, along with increases in training data. “All together, those technological efforts have resulted in a 50 percent leap in accuracy for automatic captions in English, which is getting us closer and closer to human transcription error rates,” she says. “I know from firsthand experience that if you build with accessibility as a guiding force, you make technology work for everyone.”