On Becoming Accidental Disinformation Researchers: The Case of the Curious CDC Tweet
Our research team at the University of Colorado Boulder has launched a series of posts that reveal how influential the @realDonaldTrump account was leading up to and during the 45th US presidency. Like no other account online, it had tremendous impact not just on its 88M followers but also on the online information environment at large.
This post is the fifth in a series we are running in winter/spring 2021.
On April 25, 2020, @realDonaldTrump retweeted the CDC 9 times within 6 minutes and 18 seconds.
In March 2020, not knowing how long the pandemic would last, we initiated a first-stage study to capture what risk communication looked like in a public health disaster, thinking that we would compare and contrast it with the traditional weather and geologic hazards we typically study. Recall, for example, how popular the “flatten the curve” campaign had become. Our data collection and analysis methods are well-practiced, so we set some preliminary quantitative and qualitative approaches into action. Crisis informatics research always takes us in directions we don’t expect, as each disaster's conditions are unique. Therefore, we always take a breath, tell ourselves to “begin at the beginning,” and prepare to direct our gaze at emergent phenomena in the information ecosystem as each event unfolds.
In this case, we turned our gaze to the outgoing Twitter communications of the US Centers for Disease Control and Prevention (CDC), expecting this to be the first methodological “lily pad” from which we would leap to pursue emergent and problematic issues as they came into view.
But no. In the routine, careful, even staid communications of the CDC and its affiliated accounts, we were stopped in our tracks. Some unusual retweeting behavior revealed patterns of diffusion we were not expecting to see in late April, even for the most mundane of novel coronavirus risk messages.
In collecting the official @cdcgov communications, we were also following their related accounts, including tweets posted by @cdcdirector, @cdcespanol, @cdcemergency, @cdc_ehealth, and @cdcglobal. We were also collecting tweets whenever another Twitter account mentioned any of the CDC accounts and, most importantly, whenever someone retweeted one of these CDC accounts. Previous work from our research group has used these retweeting rates as signals for the rate of information diffusion.
Our data collection was humming along, receiving a few hundred tweets per hour, until the morning of April 25, when the collection surged to more than 12,500 tweets in one hour (Figure 1). What happened?
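A surge like this one can be flagged automatically by counting tweets per clock hour and comparing each hour against a baseline. The following is a minimal sketch, not our production pipeline: the function names, the median baseline, and the factor of 10 are illustrative assumptions, and the timestamps are synthetic.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def hourly_counts(timestamps):
    """Count tweets per clock hour (timestamps are datetime objects)."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )


def surge_hours(counts, factor=10):
    """Return hours whose volume exceeds `factor` times the median hourly volume.

    The factor of 10 is an arbitrary illustrative threshold.
    """
    baseline = median(counts.values())
    return sorted(hour for hour, n in counts.items() if n > factor * baseline)


# Synthetic data: three quiet hours followed by one burst.
timestamps = (
    [datetime(2020, 4, 25, 6, m) for m in range(0, 60, 20)]      # 3 tweets
    + [datetime(2020, 4, 25, 7, m) for m in range(0, 60, 15)]    # 4 tweets
    + [datetime(2020, 4, 25, 8, m) for m in range(0, 60, 20)]    # 3 tweets
    + [datetime(2020, 4, 25, 9, m, s)                            # 120 tweets
       for m in range(60) for s in range(0, 60, 30)]
)

counts = hourly_counts(timestamps)
flagged = surge_hours(counts)
print(flagged)
```

With real data, the median is a convenient baseline because a single extreme hour barely moves it, whereas a mean would be pulled upward by the very surge one is trying to detect.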
We discovered that @realDonaldTrump retweeted not just one but nine different tweets posted by CDC accounts between 9:01:56 AM and 9:08:14 AM Eastern Time. Each of these tweets had been posted by @CDCgov in the previous days (before April 25), and each had already received a few hundred retweets, which was a typical diffusion pattern for their messages. However, immediately after @realDonaldTrump retweeted these selected posts, thousands of other accounts also retweeted them in the following hours.
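The length of this burst follows directly from the two timestamps above. As a small worked check (the variable names are ours, and the average gap is simply derived from nine retweets having eight gaps between them):

```python
from datetime import datetime

# First and last retweet times reported above (Eastern Time, naive datetimes).
first = datetime(2020, 4, 25, 9, 1, 56)
last = datetime(2020, 4, 25, 9, 8, 14)

span_seconds = int((last - first).total_seconds())
minutes, seconds = divmod(span_seconds, 60)

# Nine retweets leave eight gaps between consecutive retweets.
average_gap = span_seconds / 8

print(f"{minutes} min {seconds} s, average gap {average_gap} s")
```

This reproduces the 6 minutes and 18 seconds quoted at the top of the post.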
Why did @realDonaldTrump suddenly enter the CDC communications scene on April 25 to promote important but rather mundane risk messaging?
On April 23, former President Trump held a press conference during which he suggested that “medical doctors” consider injecting disinfectants or using ultraviolet light to kill the virus in the human body. The next morning, seemingly in reaction to this announcement and the attention that it received in the media, @CDCgov posted the following tweet:
Note that this tweet does not mention the coronavirus, so it would not have been collected in a pandemic-only Twitter search, such as the one the Twitter covid19 endpoint provides. We mention this because behaviors of interest often occur well outside the terms data scientists would know to collect. A further implication is that information diffusion is difficult to study because so many features that make information pertinent to the story can easily be overlooked.
On April 25, two days after the press conference and one day after the CDC warned about the unsafe use of disinfectants, @realDonaldTrump retweeted what we call the CDC “bleach tweet” at 9:01:56 AM Eastern. This tweet was already receiving higher than normal levels of diffusion before @realDonaldTrump retweeted it. But once he retweeted it, diffusion went through the roof. We cannot know the motivations for the former President’s participation in this online conversation (was he trying to show he was “in” on the joke at the same time he was publicly stating he had offered the advice in jest?), but what happened next was a frenetic spurt of retweeting in which @realDonaldTrump retweeted 8 more tweets from @CDCgov (Figure 3).
What were these other 8 @CDCgov tweets that @realDonaldTrump retweeted?
One may speculate about the reason for this new interest in the CDC. Perhaps he was trying to normalize his first Twitter foray into CDC communications (where he had not previously been) with the “bleach tweet” by showing additional presidential interest in their other messages that were immediately available. Or perhaps the rapid retweeting was a method of burying the “bleach tweet” further down the timeline of the @realDonaldTrump Twitter account after he reconsidered whether he should have retweeted it in the first place. Of note, @realDonaldTrump did not retweet @CDCgov again until May 1 (once) and then May 24 (twice), so there was no consistent pattern of presidential amplification of @CDCgov messages each morning.
Regardless of the motivations of @realDonaldTrump to furiously retweet @CDCgov multiple times on April 25, the effect it had on the Twittersphere dramatically altered the course of our research on this pandemic. A snapshot of our data collection visualization shows why (Figure 4):
Note the similarities of the diffusion curves of the @realDonaldTrump-intercepted tweets. The micro valleys and rises are the same. The skips in between retweets are echoed across all curves. When we zoom in to any level of temporal scope, we see highly similar patterns.
That the eye can pick out these similarities is remarkable, but we also verified them with statistical analyses. We could show that even the order of subsequent retweeting accounts was the same across the diffusion curves.
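One way to quantify whether the same accounts retweet in the same order is a rank correlation over the accounts that two tweets have in common. The sketch below uses Spearman's rank correlation as an illustrative choice; this post does not specify which statistical tests we ran, and the function names and account labels are hypothetical. It assumes each account appears at most once per list.

```python
def rank_positions(ordered_accounts, shared):
    """Map each shared account to its rank (0 = earliest) among shared accounts."""
    kept = [account for account in ordered_accounts if account in shared]
    return {account: rank for rank, account in enumerate(kept)}


def spearman_order_similarity(order_a, order_b):
    """Spearman rank correlation of retweet order over accounts in both lists.

    Returns 1.0 for identical ordering and -1.0 for fully reversed ordering.
    Assumes no ties, i.e. one retweet per account per tweet.
    """
    shared = set(order_a) & set(order_b)
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared accounts")
    ranks_a = rank_positions(order_a, shared)
    ranks_b = rank_positions(order_b, shared)
    d_squared = sum((ranks_a[a] - ranks_b[a]) ** 2 for a in shared)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical retweeter sequences for two tweets, earliest first.
tweet_1 = ["u1", "u2", "u3", "u4", "u5"]
tweet_2 = ["u1", "u2", "x9", "u3", "u4", "u5"]   # same order, one extra account
tweet_3 = ["u5", "u4", "u3", "u2", "u1"]         # reversed order

same = spearman_order_similarity(tweet_1, tweet_2)
reversed_order = spearman_order_similarity(tweet_1, tweet_3)
print(same, reversed_order)
```

Restricting the comparison to shared accounts keeps one-off retweeters from distorting the ranks, at the cost of ignoring how many accounts the two tweets do not share.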
What does this mean? We believe we saw a refracted signal of the machine- and human-automated accounts that follow @realDonaldTrump retweeting whatever that account tweeted, no matter the content. We explained this phenomenon to the Washington Post just before the 2020 election. Had we just found an elegant way to identify bots or, at a minimum, @realDonaldTrump’s most ardent followers? Figure 5 below features ~1700 of these identified accounts and their interconnectedness in a network graph.
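A simple way to surface candidate accounts of this kind is to count how many of the amplified tweets each account retweeted and keep those above a threshold. This is a sketch under stated assumptions: the data structure, the function name, and the threshold are hypothetical, and it is not a description of how the ~1700 accounts were actually selected.

```python
from collections import Counter


def recurring_retweeters(retweeters_by_tweet, min_tweets):
    """Return accounts that retweeted at least `min_tweets` of the given tweets.

    `retweeters_by_tweet` maps a tweet ID to the accounts that retweeted it.
    """
    tally = Counter()
    for accounts in retweeters_by_tweet.values():
        tally.update(set(accounts))  # count each account once per tweet
    return {account for account, n in tally.items() if n >= min_tweets}


# Hypothetical retweeter lists for three amplified tweets.
retweeters_by_tweet = {
    "tweet_1": ["a", "b", "c"],
    "tweet_2": ["a", "b"],
    "tweet_3": ["a", "c", "d"],
}

all_three = recurring_retweeters(retweeters_by_tweet, min_tweets=3)
at_least_two = recurring_retweeters(retweeters_by_tweet, min_tweets=2)
print(sorted(all_three), sorted(at_least_two))
```

Raising the threshold trades coverage for confidence: an account that retweeted all nine tweets in the same burst is a stronger candidate than one that retweeted only two.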
The content of the “bleach tweet” does not favor the former President. The retweeting accounts, which upon further inspection regularly amplify @realDonaldTrump and often have “#maga” or similar symbols in their profiles, nevertheless propagated the bleach tweet, suggesting that these accounts do not critically engage with tweet content before retweeting.
In the next post, we discuss the implications of the amplification machine in greater depth. We will begin to paint a picture of the peerlessness of @realDonaldTrump’s online influence and what this means for understanding how processes of information diffusion spawn not from the content of messaging alone but also from the social structure that lies underneath those “amplification machines.”
For now, here’s the network graph visualization of the ~1700 accounts we found by examining the CDC tweets that @realDonaldTrump retweeted, accounts that appear to compose the core of an amplification machine that we will deconstruct in future posts (Figure 5):
Notes
- For those curious about the two popular tweets that @realDonaldTrump did not intercept, depicted in the lower right of Figure 4 with high retweet activity in late April: the tweet in red, sent on April 27, gained over 1,000 retweets by April 29 and contained information suggesting that people could be contagious well before showing symptoms. The cyan tweet, posted on April 25, also gained 1,000 retweets in just two days and discusses how to properly wear a cloth face covering.
- An important detail about Twitter: You cannot retweet a retweet. When an account retweets something that @realDonaldTrump retweeted, it is, in fact, retweeting the original @CDCgov tweet, not the retweet, even though it was likely the @realDonaldTrump retweet that came across its timeline that alerted it to the tweet in the first place. This is how we found these users.
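Because every retweet points back to the original @CDCgov tweet, timing is the main clue for separating ordinary diffusion from diffusion that followed the @realDonaldTrump retweet. A minimal sketch of that split follows. The function name, account labels, and retweet times are hypothetical; only the 9:01:56 AM retweet time comes from this post. Timing is also only a proxy, since an account that retweets afterward may have found the tweet some other way.

```python
from datetime import datetime


def split_by_amplifier(retweets, amplified_at):
    """Split (account, timestamp) retweets of one original tweet into
    those made before and those made at or after the amplifier's retweet."""
    before = [account for account, ts in retweets if ts < amplified_at]
    after = [account for account, ts in retweets if ts >= amplified_at]
    return before, after


# Time of the @realDonaldTrump retweet of the "bleach tweet" (Eastern, naive).
amplified_at = datetime(2020, 4, 25, 9, 1, 56)

# Hypothetical retweets of the original tweet.
retweets = [
    ("early_reader", datetime(2020, 4, 24, 15, 30, 0)),
    ("fast_follower", datetime(2020, 4, 25, 9, 2, 3)),
    ("later_follower", datetime(2020, 4, 25, 11, 45, 10)),
]

before, after = split_by_amplifier(retweets, amplified_at)
print(before, after)
```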