CDS Guest Editorial: Information Tracer, a proactive framework to fight the COVID-19 infodemic

This entry is a part of the NYU Center for Data Science blog’s recurring guest editorial series. Zhouhan Chen is a CDS PhD student.


Infodemic is a term that describes the spread of misinformation on a massive scale. The phenomenon is old: rumors and conspiracy theories circulated as far back as the Middle Ages. It is also new, because today’s infodemic is more sophisticated, coordinated, and contagious with the help of the Internet.

In fact, not long after the outbreak of COVID-19 in 2020, the World Health Organization (WHO) expressed concerns about a COVID-19 infodemic, including the proliferation of misinformation (or fake news stories) that is constantly shared on social media platforms.

Sharing fake news links from one platform to another takes only a click. In the past few years, researchers have been drawing analogies between the spread of viruses and the cross-platform spread of misinformation. Key questions include: Which social media account is the “patient zero” that first shares a piece of misinformation? Which platform acts as a “super-spreader”? What is the “virality” of a piece of misinformation? And how do we even define “virality”?

To answer those questions, CDS PhD student Zhouhan Chen, in collaboration with the NYU Center for Social Media and Politics (CSMaP), created Information Tracer, a system to collect, quantify, and discover information operations across major social media platforms including Twitter, Facebook, Reddit, and YouTube.

To trace the spread of information, the system takes a URL as input and uses various APIs (including CrowdTangle and the Twitter Academic API) and crawlers to collect posts that match the input URL. To quantify information spread, Zhouhan designed and operationalized multiple numerical metrics. One metric is “Breakout Scale”, which measures the number of platforms a URL percolates through. A high breakout scale indicates a more widely shared URL. The metric was originally proposed by Ben Nimmo to understand information operations. Using “Breakout Scale” and other metrics that detect Twitter Traffic Manipulation, Zhouhan identified suspicious accounts on Twitter that persuade users to watch YouTube videos falsely claiming that coronavirus is caused by the 5G network. The main account was still actively posting unverified claims as of August 1, 2021.
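To make the idea concrete, here is a minimal Python sketch of a platform-count metric in the spirit of the description above, along with a naive "patient zero" lookup. It is an illustration only, not Information Tracer's actual code: the `Post` record, the function names, and the sample data are all hypothetical, and matching posts by exact URL equality is a simplifying assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Post:
    """Hypothetical record for one collected post that links to a URL."""
    platform: str        # e.g. "twitter", "facebook", "reddit", "youtube"
    account: str
    url: str
    created_at: datetime


def platform_count(posts: list[Post], url: str) -> int:
    """Count the distinct platforms on which `url` was shared."""
    return len({p.platform for p in posts if p.url == url})


def earliest_post(posts: list[Post], url: str) -> Optional[Post]:
    """Return the earliest collected post sharing `url`, or None if there is none."""
    matching = [p for p in posts if p.url == url]
    return min(matching, key=lambda p: p.created_at, default=None)


# Made-up sample data, for illustration only.
sample_posts = [
    Post("twitter", "account_a", "https://example.com/story", datetime(2021, 3, 1, 9, 0)),
    Post("reddit", "account_b", "https://example.com/story", datetime(2021, 3, 1, 8, 30)),
    Post("twitter", "account_c", "https://example.com/story", datetime(2021, 3, 2, 12, 0)),
    Post("facebook", "account_d", "https://example.com/other", datetime(2021, 3, 1, 7, 0)),
]

story = "https://example.com/story"
print(platform_count(sample_posts, story))          # 2 (twitter and reddit)
print(earliest_post(sample_posts, story).account)   # account_b
```

Note that the earliest post in a collected sample is only the earliest one the collectors happened to see, so it is a candidate "patient zero" rather than a confirmed one.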

As a next step, Zhouhan is making Information Tracer more customizable. The goal is to enable researchers to turn raw data into actionable insights in a few minutes. Users can now create their own metrics, set thresholds, and monitor potential information operations. For more detail, please check out his paper “An Automatic Framework to Continuously Monitor Multi-Platform Information Spread”, visit Information Tracer, or contact Zhouhan for specific use cases.
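As a rough picture of what user-defined metrics with thresholds could look like, here is a small Python sketch. Everything in it is an assumption made for illustration: the `Metric` class, the two example metrics, the dictionary-based post format, and the threshold values are hypothetical and do not reflect Information Tracer's real interface.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable

# Hypothetical post format: {"platform": str, "account": str}
PostDict = dict


@dataclass
class Metric:
    """A user-defined metric: a name, a scoring function, and an alert threshold."""
    name: str
    compute: Callable[[list[PostDict]], float]
    threshold: float


def posts_per_account(posts: list[PostDict]) -> float:
    """Average number of posts per distinct account (0.0 if there are no posts)."""
    accounts = {p["account"] for p in posts}
    return len(posts) / len(accounts) if accounts else 0.0


def distinct_platforms(posts: list[PostDict]) -> float:
    """Number of distinct platforms among the posts."""
    return float(len({p["platform"] for p in posts}))


def triggered(posts: list[PostDict], metrics: list[Metric]) -> list[str]:
    """Return the names of metrics whose value meets or exceeds their threshold."""
    return [m.name for m in metrics if m.compute(posts) >= m.threshold]


# Made-up posts for a single URL: four posts from two accounts on two platforms.
url_posts = [
    {"platform": "twitter", "account": "account_a"},
    {"platform": "twitter", "account": "account_a"},
    {"platform": "twitter", "account": "account_a"},
    {"platform": "reddit", "account": "account_b"},
]

my_metrics = [
    Metric("posts_per_account", posts_per_account, threshold=2.0),
    Metric("distinct_platforms", distinct_platforms, threshold=3.0),
]

print(triggered(url_posts, my_metrics))  # ['posts_per_account']
```

Keeping each metric as a plain function paired with a threshold means a new metric can be added without touching the monitoring loop, which is one simple way to support the kind of customization described above.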

By Zhouhan Chen
