Predicting Virality with Network Mapping

From early work with Gephi during my consulting stint, it was clear to me that emerging viral content could be identified from unexpected sources, particularly low-follower accounts that happened to say something that struck a chord with a broad community. The raison d'être of the project was to give brands the ability to capitalize on organic virality by injecting their message into emergent conversations at the right place and time with the right content.

Foundational Insight

Changes in network map values such as PageRank and betweenness centrality could predict the virality of a user or post regardless of follower count, and provide real-time insights into how trending topics were perceived across different communities.
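To make the idea concrete, here is a minimal sketch of scoring accounts by how much their PageRank changes between two snapshots of the map. It is an illustration only, not the production code: the edge direction (retweeter pointing at author), the account names, and the helper names `pagerank` and `rank_deltas` are all assumptions, and a plain-Python power iteration stands in for a real graph library.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed edge list [(src, dst), ...]."""
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src, targets in out_links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def rank_deltas(previous_edges, current_edges):
    """PageRank change per node between two snapshots (new nodes start from 0)."""
    before = pagerank(previous_edges)
    after = pagerank(current_edges)
    return {node: score - before.get(node, 0.0) for node, score in after.items()}


# Hypothetical snapshots: an edge (x, y) means account x amplified account y.
previous = [("a", "brand"), ("b", "brand"), ("c", "brand"), ("d", "small")]
current = previous + [("a", "small"), ("b", "small"), ("c", "small"),
                      ("e", "small"), ("f", "small")]

deltas = rank_deltas(previous, current)
fastest_riser = max(deltas, key=deltas.get)
```

In this toy data the account with the largest positive delta is the one gaining attention fastest, whatever its follower count. The same comparison could be run on betweenness centrality or any other per-node score.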

Here's an early Google Sheets-based prototype that includes the dataset for the accompanying image.

[Image: emerging virality.png]

The Build

The MVP comprised a PHP/JS/Sigma.js front-end and two primary Python classes: Streamer and Grapher. Streamer is the API-facing class that collects posts, then parses them into nodes and edges of users and tweets as well as their associated links, hashtags, and media. Grapher instantiates every 5 minutes, creates an updated network map, and calculates PageRank, centrality measurements, and group number. I contracted a full-stack dev to build out the front-end and help me tie it all together.
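As a rough sketch of the parsing step, the snippet below turns one post into typed nodes and labeled edges. The post fields (`id`, `user`, `hashtags`, `links`, `media`), the edge labels, and the function name `parse_post` are hypothetical; the real Streamer class and the API payload it consumed are not shown here.

```python
def parse_post(post):
    """Turn one post dict into (nodes, edges) for a conversation graph.

    Nodes are (kind, value) tuples; edges are (source, target, label) tuples.
    The field names and labels are assumptions made for illustration.
    """
    user = ("user", post["user"])
    tweet = ("tweet", post["id"])
    nodes = {user, tweet}
    edges = {(user, tweet, "posted")}

    # Attach each link, hashtag, and media item to the tweet that carries it.
    for kind, field in (("link", "links"), ("hashtag", "hashtags"), ("media", "media")):
        for value in post.get(field, []):
            node = (kind, value)
            nodes.add(node)
            edges.add((tweet, node, "contains"))
    return nodes, edges


# Hypothetical post payload.
nodes, edges = parse_post({
    "id": "1",
    "user": "alice",
    "hashtags": ["ai"],
    "links": ["https://example.com"],
})
```

Using sets means a hashtag or link seen in many posts collapses into a single node when the per-post results are merged, which is what lets shared content connect otherwise unrelated accounts.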

The objective was to track topological changes in the network map as a means of tracking and predicting virality. We had to devise several solutions along the way, as maps would get computationally bulkier the longer a Streamer instance was running:

  • Used NetworKit, a C++-based network analysis library, chosen for calculation speed

  • Created graph trimming functions to keep the map at a manageable size

  • Set display limits in the front-end to prevent browser crashes
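The trimming rule itself is not described above, so the following is only one plausible approach, sketched under that assumption: keep the `max_nodes` best-connected nodes and drop every edge that touches anything else. The function name `trim_graph` and the degree-based criterion are illustrative choices, not the original implementation.

```python
from collections import Counter


def trim_graph(edges, max_nodes):
    """Keep only edges between the max_nodes highest-degree nodes."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Sort by degree (descending), then by name so ties resolve deterministically.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    keep = {node for node, _ in ranked[:max_nodes]}
    return [(src, dst) for src, dst in edges if src in keep and dst in keep]


# Hypothetical edge list: "hub" has degree 3, "c" has degree 2, the rest have 1.
edges = [("a", "hub"), ("b", "hub"), ("c", "hub"), ("c", "d")]
trimmed = trim_graph(edges, max_nodes=3)
```

A recency-based rule (dropping nodes not seen in the last few update cycles) would be an equally reasonable alternative. Either way the point is to bound the work each periodic recalculation has to do.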

Sunsetting

Conversation mapping was the first beta MVP of Mainline. We tracked live events, world events, and industries, working with a variety of clients in each. We learned a few important lessons:

  • Virality was detectable but authenticity was not -- a lot of viral content was fueled by coordinated activity. 

  • Brands didn't want to predict virality so much as drive it for their own purposes. 

  • The Maps UI had great curb appeal but was difficult for non-data-natives to use effectively.

  • Twitter's business strategy under Musk led to many API changes, including closing the streaming endpoints we were using.

While we did apply for a patent on this novel type of visualization, these lessons led us to a radical pivot: focusing instead on creating algos that could identify the accounts with the most authentic communities and the most spread.
