This morning we brought you the story of Andrew Ng and the 16,000-core neural network that was able to decipher on its own what a cat is from millions of stills taken from YouTube videos. This afternoon, Google has chimed in and provided some background on why it is funding machine learning research.
Over on the Google Official Blog, Google Fellow Jeff Dean and Andrew Ng, director of Stanford’s artificial intelligence lab, have outlined how they believe their research will benefit both Google and the world. From the blog post:
You probably use machine learning technology dozens of times a day without knowing it—it’s a way of training computers on real-world data, and it enables high-quality speech recognition, practical computer vision, email spam blocking and even self-driving cars. But it’s far from perfect—you’ve probably chuckled at poorly transcribed text, a bad translation or a misidentified image. We believe machine learning could be far more accurate, and that smarter computers could make everyday tasks much easier. So our research team has been working on some new approaches to large-scale machine learning.
It’s easy to see what applications better machine learning could have for Google. Better voice recognition, image search, or even regular search would certainly help the company’s core business. A more “intelligent” search engine could also help move Google past the treadmill of its algorithm updates. Perhaps an “intelligent” info overlay using Google Glass is even in the future.
What Ng and the engineers at Google X are doing is creating neural networks that can teach themselves to recognize objects using unlabeled data. Typically, machines learn from examples that humans have already labeled, such as images tagged “cat.” For this experiment, the Google X team built a large neural network loosely modeled on the architecture of the human brain, though at a far smaller scale, and set out to see what it would learn to recognize after being shown stills from YouTube videos for a week.
Our hypothesis was that it would learn to recognize common objects in those videos. Indeed, to our amusement, one of our artificial neurons learned to respond strongly to pictures of… cats. Remember that this network had never been told what a cat was, nor was it given even a single image labeled as a cat. Instead, it “discovered” what a cat looked like by itself from only unlabeled YouTube stills. That’s what we mean by self-taught learning.
The picture above represents what the computers consider a cat to be. Dean and Ng stated that the real goal of the project is to develop machine learning systems that are scalable so that “vast sets of unlabeled training data” (such as the internet, perhaps?) can be accurately classified. The results of the cat experiment will be presented this week at the International Conference on Machine Learning in Edinburgh, Scotland.
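For the curious, here is a toy sketch in Python of the idea behind self-taught learning: an autoencoder, a network trained only to reconstruct its unlabeled inputs, which ends up learning feature detectors along the way without ever seeing a label. Everything in this sketch (the sizes, the random data, the names) is invented for illustration; the actual Google X system was a vastly larger sparse autoencoder trained in a distributed fashion across 16,000 cores.

```python
import numpy as np

# Toy illustration of unsupervised ("self-taught") feature learning.
# The network is trained only to reconstruct its unlabeled inputs;
# no label like "cat" is ever provided. All sizes and data here are
# invented for illustration.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Stand-in for unlabeled stills: 500 samples of 64 "pixels" each.
X = rng.random((500, 64))

n_hidden = 16   # number of learned feature detectors ("neurons")
lr = 0.1        # learning rate

W1 = rng.normal(0, 0.1, (64, n_hidden))   # encoder weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, 64))   # decoder weights
b2 = np.zeros(64)

for epoch in range(200):
    # Forward pass: encode the inputs, then try to reconstruct them.
    H = sigmoid(X @ W1 + b1)   # hidden features
    Y = sigmoid(H @ W2 + b2)   # reconstruction
    err = Y - X                # reconstruction error is the only signal

    # Backpropagate the squared reconstruction error.
    dY = err * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * (H.T @ dY) / len(X)
    b2 -= lr * dY.mean(axis=0)
    W1 -= lr * (X.T @ dH) / len(X)
    b1 -= lr * dH.mean(axis=0)

# Each column of W1 is now a learned feature detector. In the Google X
# experiment, one such unit ended up responding strongly to cat faces.
print("reconstruction MSE:", float((err ** 2).mean()))
```

On real image data, the columns of the encoder weights come to resemble visual features such as edges and, in deeper stacked networks, whole objects, which is how a "cat neuron" can emerge from unlabeled video stills.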