Pinterest is on the cutting edge of visual recognition technology that matches images with visually similar ones accurately enough to show related items next to images that have no text at all. Coming soon is a feature that will let Pinterest users take a photo of an object, such as a purse, and instantly compare that real-world object with similar ones.
Today, Pinterest introduced Automatic Object Detection for its most popular categories, allowing visual searches for the products shown in a Pin's image.
"As we look to the future of visual search, we’re also starting to preview new camera search technology that’ll give Pinners recommendations for the products they find in the real world," stated Dmitry Kislyuk, a software engineer on Pinterest's Visual Search Team. "Pinners will soon be able to snap a photo of a single object like sneakers - and get recommendations on Pinterest, or even take a photo of an entire room and get results for multiple items."
Many people don't realize how much Pinterest has invested in R&D, or how intensely it focuses on visual recognition and visual search research. "Visual search is one of the many fields transformed in recent years by the advances in deep learning," Kislyuk said. "Convolutional neural networks represent images and videos as feature vectors which preserve both semantic concepts and visual information, and allows for fast retrieval when using optimized nearest neighbor techniques."
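The retrieval idea Kislyuk describes can be illustrated with a minimal sketch. It assumes each image has already been turned into a feature vector by a neural network; the tiny four-number vectors, the image names, and the helper functions `cosine_similarity` and `nearest_neighbors` are made up for illustration and are not Pinterest's code. It also uses a brute-force scan, whereas a production system would rely on an optimized nearest neighbor index.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Return the k image ids whose vectors are most similar to the query."""
    scored = [(cosine_similarity(query, vec), image_id)
              for image_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Hypothetical toy "embeddings"; real CNN features have thousands of dimensions.
index = {
    "red_sneaker": [0.9, 0.1, 0.0, 0.2],
    "blue_sneaker": [0.8, 0.2, 0.1, 0.3],
    "leather_bag": [0.1, 0.9, 0.3, 0.0],
    "wool_scarf": [0.0, 0.2, 0.9, 0.1],
}

# Feature vector of a photo the user just took (also hypothetical).
query = [0.85, 0.15, 0.05, 0.25]
results = nearest_neighbors(query, index, k=2)
print(results)
```

With these toy numbers, the two sneaker images rank above the bag and the scarf, because their vectors point in nearly the same direction as the query.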
Pinterest's Kislyuk elaborated in a blog post:
"We leveraged this idea, along with our richly annotated image dataset, last November when we released a visual search product that makes searching inside a Pin’s image as simple as dragging a cropper. For our initial launch, we extracted the fully-connected-6 layer of a fine tuned VGG model over a billion Pinterest images and indexed them into a distributed service, as described in our KDD paper."
The goal is to use automatic object detection to make visual search a seamless experience on Pinterest. Detecting objects in an image allows Pinterest to do object-to-object matching. Then, if you see a chair you like at a store or at someone's house, or you find that perfect chair on Pinterest, you will be able to view it in various decorative home settings.
Building automatic object detection
Our first challenge in building automatic object detection was collecting labeled bounding boxes for regions of interest in images as our training data. Since launch, we’ve processed nearly 1 billion image crops (visual searches). By aggregating this activity across the millions of images with the highest engagement, we learn which objects Pinners are interested in. We aggregate annotations of visually similar results to each crop and assign a weak label across hundreds of object categories. An example of how this looks is shown in the heatmap visualization below, where two clusters of user crops are formed, one around the “scarf” annotation, and another around the “bag” annotation.
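The aggregation step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Pinterest's pipeline: the crop records, the majority-vote rule in `weak_label`, and the median box in `aggregate_crops` are all hypothetical choices made to show the general idea of turning many user crops into a few weakly labeled regions.

```python
from collections import Counter, defaultdict
from statistics import median


def weak_label(neighbor_annotations):
    """Pick the most common annotation among a crop's visually similar results."""
    label, _ = Counter(neighbor_annotations).most_common(1)[0]
    return label


def aggregate_crops(crops):
    """Group crops by weak label and summarize each group with a median box."""
    groups = defaultdict(list)
    for crop in crops:
        groups[weak_label(crop["neighbor_annotations"])].append(crop["box"])
    # zip(*boxes) regroups the boxes coordinate by coordinate: x1, y1, x2, y2.
    return {label: tuple(median(coord) for coord in zip(*boxes))
            for label, boxes in groups.items()}


# Hypothetical user crops on one image; boxes are (x1, y1, x2, y2) in pixels.
crops = [
    {"box": (10, 10, 50, 60), "neighbor_annotations": ["scarf", "scarf", "shawl"]},
    {"box": (12, 8, 54, 62), "neighbor_annotations": ["scarf", "wrap", "scarf"]},
    {"box": (14, 12, 52, 58), "neighbor_annotations": ["scarf"]},
    {"box": (100, 120, 160, 200), "neighbor_annotations": ["bag", "bag", "purse"]},
    {"box": (104, 118, 158, 204), "neighbor_annotations": ["bag", "tote", "bag"]},
]

labeled_regions = aggregate_crops(crops)
print(labeled_regions)
```

In this toy data the crops fall into two groups, one labeled "scarf" and one labeled "bag", mirroring the two clusters described in the text. The median is used here only because it is less sensitive to a stray crop than the mean.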