New MIT Algorithm Predicts Twitter Trends Hours in Advance
Researchers at the Massachusetts Institute of Technology (MIT) have announced a new algorithm they say is capable of predicting Twitter trends far in advance.
The researchers claim the algorithm predicts, with 95% accuracy, which topics will show up on Twitter’s trending topics list. On average it makes these predictions an hour and a half before Twitter lists the topic as a trend, and it can sometimes predict trends as much as four or five hours in advance.
Devavrat Shah, associate professor in the electrical engineering and computer science department at MIT, and MIT graduate student Stanislav Nikolov will present the algorithm at the Interdisciplinary Workshop on Information and Decision in Social Networks in November.
Shah stated that the algorithm is a nonparametric machine-learning algorithm, meaning it makes no assumptions about the shape of the patterns. It compares the changes over time in the number of tweets about a new topic with the changes over time seen in every sample in the training set. Training-set samples with statistics similar to those of the new topic are weighted more heavily when determining a prediction. Shah compared it to voting, where each sample gets a vote, but some votes count more than others.
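The voting idea Shah describes can be illustrated with a minimal sketch. This is not the researchers' implementation: the squared-difference distance, the exponential weighting, the `gamma` parameter, and all function names here are assumptions chosen for illustration. Each training sample casts a vote whose weight shrinks as its pattern of changes diverges from that of the new topic, and the side with the larger total wins.

```python
import math


def changes(series):
    """Step-to-step changes in tweet counts for one topic."""
    return [b - a for a, b in zip(series, series[1:])]


def distance(a, b):
    """Squared Euclidean distance between two equal-length sequences (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_trend(observed, trend_examples, non_trend_examples, gamma=1.0):
    """Return True if the weighted votes favor 'will trend'.

    Every training sample votes; samples whose changes over time resemble
    the observed topic get exponentially more weight (assumed weighting).
    If both totals underflow to zero, the function returns False.
    """
    obs = changes(observed)

    def total_vote(examples):
        return sum(math.exp(-gamma * distance(obs, changes(ex))) for ex in examples)

    return total_vote(trend_examples) > total_vote(non_trend_examples)


# Hypothetical tweet counts per time step, invented for this example.
trending = [[1, 2, 4, 8, 16], [2, 3, 6, 12, 20]]
not_trending = [[5, 5, 6, 5, 5], [3, 4, 3, 4, 3]]

print(predict_trend([1, 2, 5, 9, 15], trending, not_trending, gamma=0.1))
print(predict_trend([4, 5, 4, 5, 4], trending, not_trending, gamma=0.1))
```

No model of what a trend "should" look like appears anywhere in the sketch; the training samples themselves define the patterns, which is the sense in which the approach is nonparametric.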
This method is different from the standard approach to machine learning, where researchers create a model of the pattern whose specifics need to be inferred. In theory, the new approach could apply to any quantity that varies over time (including the stock market), given the right subset of training data.
For their initial experiments, Shah and Nikolov used data from 200 Twitter topics that were listed as trends and 200 that were not. “The training sets are very small, but we still get strong results,” said Shah. Alongside its 95% prediction rate, the algorithm had a false-positive rate of only 4%.
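For readers unfamiliar with these two figures, the short sketch below shows how a detection rate and a false-positive rate are commonly computed from labeled outcomes. It assumes the prediction rate is the share of real trends that were flagged, which the researchers' description does not spell out. The function name and the toy labels are hypothetical and are not the researchers' data.

```python
def detection_rates(actual, predicted):
    """Return (true-positive rate, false-positive rate).

    actual[i] is True if topic i really became a trend;
    predicted[i] is True if the algorithm flagged it as one.
    Assumes at least one trending and one non-trending topic.
    """
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    return true_pos / positives, false_pos / negatives


# Toy labels invented for illustration: 4 real trends, 4 non-trends.
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]

tpr, fpr = detection_rates(actual, predicted)
print(f"true-positive rate: {tpr:.0%}, false-positive rate: {fpr:.0%}")
```

The false-positive rate is measured only against topics that never trended, so it is not simply the complement of the detection rate.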
The system’s accuracy can increase with additional training sets, but the computing costs will increase as well. Shah said, however, that the algorithm has been designed so that its execution can be distributed across separate machines, such as web servers. “It is perfectly suited to the modern computational framework,” said Shah.
“It’s very creative to use the data itself to find out what trends look like,” said Ashish Goel, associate professor of management science at Stanford University and a member of Twitter’s technical advisory board. “It’s quite creative and quite timely and hopefully quite useful.
“People go to social-media sites to find out what’s happening now. So in that sense, speeding up the process is something that is very useful.”