Social media influences mainstream media

In April 2022, tech billionaire Elon Musk offered to buy Twitter, saying the social media company needed to be transformed as a private company. Notably, the founder of PayPal, Tesla and SpaceX argued that he wanted to restore freedom of speech on the platform. Many have noted since then, rightly, that Musk’s track record on free speech is problematic to say the least. Nevertheless, there is another reason why Musk’s acquisition of Twitter could jeopardize democracy: by controlling the platform, the self-described “free speech absolutist” would also influence the mainstream media agenda.

Many recent studies have shown that social media has changed society (e.g. Fujiwara et al. 2021, Levy 2021). But the power of Twitter goes beyond its impact on its users. In a new research project, relying on nearly two billion tweets and an innovative empirical approach, we quantify what has long been suspected: that Twitter affects the production and editorial decisions of publishers (Cagé et al. 2022).

To do this, we proceed in three steps. First, we collect a representative sample of all tweets produced in French between August 2018 and July 2019 and combine it with the content published online by all mainstream media outlets (including newspapers, television channels, radio stations, pure online media and news agency dispatches). Our dataset, which contains about 1.8 billion tweets, covers about 70% of all tweets in French (including retweets) during this period. Figure 1 plots the daily distribution of the number of tweets.

Figure 1 Daily distribution of the number of tweets in the sample

Notes: The figure plots the number of daily tweets included in our dataset. The red line plots all tweets, the blue dotted line shows these tweets after we apply the filter, and the green dashed line shows only original tweets. The period is 18 June 2018 – 10 August 2019. Some days have no information because the server crashed due to a rare event, so we could not capture tweets in real time.

For each of these tweets, we collect information about its ‘success’ on Twitter (number of likes, comments, etc.), as well as information about the user’s profile at the time of the tweet (such as the number of followers). To create this unique dataset, we combined Twitter’s Sample and Filter application programming interfaces (APIs) with selected keywords. Figure 2 summarizes our data collection setup.

Figure 2 Diagram of our experimental setup for selecting the best tweet collection method

Second, we develop novel algorithms to identify all the ‘news stories’ covered in both social and traditional media. An event here is a cluster of documents (tweets and media articles) that discuss the same news story. So, for example, all documents discussing the Hokkaido Eastern Iburi earthquake on September 6, 2018 will be classified as part of the same event. Our algorithms group documents into an event when they share sufficient semantic similarity. In short, for Twitter, our method models the event detection problem as a dynamic clustering problem using a ‘first story detection’ (FSD) algorithm (see Mazoyer et al. 2022 for more details). To identify news events in stories published online by traditional media outlets, we follow Cagé et al. (2020): we describe each news article by a semantic vector (using TF-IDF) and use cosine distances to measure semantic similarity. Combining this with temporal constraints, we can cluster articles to form events. Finally, to bridge the gap between social media events and mainstream media events, we rely on the Louvain community detection algorithm (Blondel et al. 2008), as shown in Figure 3.
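The media-side clustering step can be illustrated with a minimal sketch. It assumes whitespace tokenisation, a smoothed IDF weight, a greedy single-pass assignment, and a similarity threshold and one-day window chosen purely for illustration; none of these choices, nor the function names or the toy articles, are taken from Cagé et al. (2020).

```python
import math
from collections import Counter
from datetime import datetime, timedelta


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per tokenised document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            # Smoothed IDF (an assumption): ln((1 + n) / (1 + df)) + 1
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_events(articles, threshold=0.3, window=timedelta(days=1)):
    """Greedy clustering: an article joins the first event that already contains
    a similar-enough article published within `window`; otherwise it starts a new event."""
    ordered = sorted(articles, key=lambda a: a["time"])
    vectors = tfidf_vectors([a["text"].lower().split() for a in ordered])
    events = []  # each event is a list of indices into `ordered`
    for i, article in enumerate(ordered):
        for event in events:
            if any(
                article["time"] - ordered[j]["time"] <= window
                and cosine(vectors[i], vectors[j]) >= threshold
                for j in event
            ):
                event.append(i)
                break
        else:
            events.append([i])
    return [[ordered[j]["id"] for j in event] for event in events]


# Toy articles, invented for illustration only.
ARTICLES = [
    {"id": "a1", "time": datetime(2018, 9, 6, 3), "text": "earthquake strikes hokkaido japan"},
    {"id": "a2", "time": datetime(2018, 9, 6, 9), "text": "hokkaido earthquake causes blackout in japan"},
    {"id": "a3", "time": datetime(2018, 9, 6, 10), "text": "parliament votes on pension reform"},
    {"id": "a4", "time": datetime(2018, 9, 20, 8), "text": "earthquake strikes hokkaido japan"},
]

events = cluster_events(ARTICLES)
print(events)  # [['a1', 'a2'], ['a3'], ['a4']]
```

The last toy article has the same wording as the first but is published two weeks later, so the temporal constraint places it in a separate event even though its cosine similarity to the first article is 1.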

Figure 3 Graphical representation: Building joint events

We identify 3,992 joint events, i.e. events that are covered both on social media and in traditional media, of which 3,904 appeared first on Twitter.

Third, we rely on the structure of the social media network, and in particular on the centrality of its users, to isolate ‘exogenous’ shocks to the popularity of stories on Twitter (measured by the number of tweets, retweets, likes, etc.). In other words, we isolate variation in the popularity of stories on Twitter that is independent of the underlying interest in these stories. To do this, we use the vastness of our dataset to propose a novel instrumental variable strategy: our instrument is the interaction between the centrality in the network of the first Twitter users in an event (measured by PageRank centrality computed just before the event) and news pressure on social media at the time of the first tweets in the event. Our identifying assumption is that once we control for the direct effects of centrality and news pressure, the interaction between user centrality and news pressure affects traditional news production only through its effect on the visibility of tweets on Twitter.
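To make the construction of the instrument concrete, here is a minimal sketch under stated assumptions. The follower graph, the user names, the damping factor and the `news_pressure` value are all hypothetical, and PageRank is computed by plain power iteration; the study's actual network, news-pressure measure and estimation code are not reproduced here.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """PageRank by power iteration.

    `edges` is a list of (follower, followed) pairs: each follower passes
    its score on to the accounts it follows.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes with no outgoing links is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


def instrument(first_user, news_pressure, centrality):
    """Interaction term: centrality of the event's first tweeter times news
    pressure at the time of the first tweet (both measured before the event unfolds)."""
    return centrality[first_user] * news_pressure


# Hypothetical follower graph: bob, carol and dave follow alice; alice follows bob.
FOLLOWS = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("alice", "bob")]
centrality = pagerank(FOLLOWS)

# Two hypothetical events that start under the same news pressure but are
# launched by users of different centrality.
z_central = instrument("alice", news_pressure=0.2, centrality=centrality)
z_peripheral = instrument("dave", news_pressure=0.2, centrality=centrality)
print(z_central > z_peripheral)  # True
```

In a regression, this interaction would serve as the excluded instrument for the number of tweets, while centrality and news pressure enter separately as controls, in line with the identifying assumption stated above.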

Our results are illuminating. All else being equal, and in particular independently of a story’s newsworthiness, a 55% increase in the number of tweets posted before the first media article on a story leads to an average 17% increase in the number of news articles devoted to that story. In other words, Twitter sets the media coverage agenda in a quantitatively meaningful way.

Why is that? First, a growing literature in journalism studies highlights the fact that social media plays an important role as a news source. Consistent with this, we show that the magnitude of the effect is higher for media outlets that have a larger number of journalists with a Twitter account, pointing to the role of journalists monitoring Twitter.

But the use of the platform as a source for journalists is not the only mechanism at play. In particular, we investigate whether the magnitude of the spillover from social to mainstream media depends on the business model of the outlets. For each media outlet in our dataset, we collect information about whether it uses a paywall (at the time of data collection), the characteristics of this paywall (such as soft versus hard) and the date the paywall was introduced. This information is summarized in Figure 4.

Figure 4 Business models of news editors

Notes: The figure reports the share of media outlets in our sample by online business model. 52% of the media outlets in our sample do not have a paywall (“no paywall”), and 4.3% condition the reading of paid articles on viewing an ad (“paid articles can be accessed by viewing an ad”). Among the outlets that have a paywall, we distinguish between three models: hard paywall, metered paywall and soft paywall (“some articles are locked behind the paywall”).

We show that the magnitude of the effect is much higher for media outlets that rely entirely or strongly on advertising revenue than for those whose online content is behind a paywall (and which thus mainly rely on subscriptions). For the former, a 50% increase in popularity leads to an increase in news coverage of 22.0% (no paywall), 20.3% (soft paywall) and 21.1% (paid articles accessible by viewing an ad). The figure falls to 6.2% on average for outlets using a metered paywall, a coefficient that is not statistically significant. In other words, Twitter influences the mainstream media because of short-term considerations driven by clicks that generate advertising revenue.
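As a rough reading aid, the figures quoted above can be converted into implied responsiveness per business model. This is a back-of-the-envelope sketch that assumes the reported effects can be scaled linearly; the variable names are illustrative, and the metered-paywall figure should be read with caution since its coefficient is not statistically significant.

```python
# Reported increase in news coverage (%) for a 50% increase in popularity on Twitter.
TWEET_INCREASE_PCT = 50.0
COVERAGE_INCREASE_PCT = {
    "no paywall": 22.0,
    "soft paywall": 20.3,
    "ad-view access": 21.1,
    "metered paywall": 6.2,  # coefficient not statistically significant
}


def implied_elasticity(coverage_pct, tweet_pct=TWEET_INCREASE_PCT):
    """Percent change in coverage per one percent change in tweets (linear approximation)."""
    return coverage_pct / tweet_pct


elasticities = {
    model: implied_elasticity(pct) for model, pct in COVERAGE_INCREASE_PCT.items()
}
relative_to_metered = {
    model: pct / COVERAGE_INCREASE_PCT["metered paywall"]
    for model, pct in COVERAGE_INCREASE_PCT.items()
}

for model in COVERAGE_INCREASE_PCT:
    print(f"{model:>16}: elasticity {elasticities[model]:.2f}, "
          f"{relative_to_metered[model]:.1f}x the metered-paywall response")
```

Under this approximation, outlets without a paywall respond roughly three and a half times as strongly as outlets with a metered paywall.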

Although there are widespread fears that new technologies are undermining editorial quality, especially because they have led to newsroom cuts that reduce the quality of news and the production of key content (Cagé et al. 2017), our results suggest that the deterioration is uneven: quality deteriorates most for those who cannot afford or are unwilling to pay for news. In other words, since media outlets whose content is freely available online are more influenced by the popularity of stories on Twitter than those using a paywall, the platform increases information inequality, making voters more vulnerable to manipulation (Kennedy and Prat 2019).

Also, our results, which capture the effects of changes in popularity unrelated to the underlying newsworthiness of a story, suggest that social media can provide a biased signal of what readers want. This in turn could explain why, as highlighted by survey data, a significant share of the population is not interested in the news produced by the media (and may thus decide not to consume news). Twitter users are not representative of the general news-reading population. This points to a negative effect of social media, driven by the production side, consistent with recent changes in the social media guidelines of both The Guardian and The New York Times, which highlight the fact that journalists rely heavily on Twitter as both a reporting and a feedback tool,1 and that this can distort their view of who their audience is.

Turning to news demand and using audience data, we finally show that news articles covering events that are more popular on Twitter do not get more views than other articles, further suggesting that journalists’ reliance on Twitter may distort the information they produce relative to what citizens actually prefer.

Whether Elon Musk will actually buy Twitter remains an open question. Whether new European regulations such as the Digital Markets Act (Crémer et al. 2022) and the Digital Services Act (DSA) will be effective in regulating content on social networks remains to be seen, although the DSA is a step in the right direction. In the meantime, it is vital to remember that social media matters for democracy beyond what one might expect. It not only affects users who spend time on the platforms, but also spills over from social to mainstream media. This spillover raises questions about the business model of legacy media as well as the welfare effects of the platforms. In particular, our results raise the question of whether citizens would be better informed in the absence of Twitter and whether social media could be detrimental to both journalism and democracy.


Blondel, V D, J-L Guillaume, R Lambiotte and E Lefebvre (2008), “Fast Unfolding of Communities in Large Networks”, Journal of Statistical Mechanics: Theory and Experiment 2008(10): P10008.

Cagé, J, N Hervé and M-L Viaud (2017), “The Commercial Value of News in the Internet Age”, 19 June.

Cagé, J, N Hervé and M-L Viaud (2020), “The Production of Information in an Online World”, Review of Economic Studies 87(5): 2126–64.

Cagé, J, N Hervé and B Mazoyer (2022), “Social Media Influence Mainstream Media: Evidence from Two Billion Tweets”, CEPR Discussion Paper No. 17358.

Crémer, J, D Dinielli, A Fletcher, P Heidhues, M Schnitzer and F Scott Morton (2022), “The Digital Markets Act: An Economic Perspective on the Final Negotiations”, 11 February.

Fujiwara, T, K Müller and C Schwarz (2021), “The Effect of Social Media on Elections: Evidence from the United States”, NBER Working Paper No. 28849.

Kennedy, P J and A Prat (2019), “Where Do People Get Their News?”, Economic Policy 34(97): 5–47.

Levy, R (2021), “Social Media, News Consumption, and Polarization: Evidence from a Field Experiment”, American Economic Review 111(3): 831–70.

Mazoyer, B, N Hervé, C Hudelot and J Cagé (2022), “Short-Text Embeddings for Unsupervised Event Detection in a Stream of Tweets”, Advances in Knowledge Discovery and Management 10, forthcoming.


1 -Sometime /
