ENSPIRING.ai: MIT Sloan Experts Series Sinan Aral - The Truth About Fake News
The video explores the transformative effects and challenges of fake news, particularly focusing on its proliferation on social media platforms like Twitter. The research began with questions about information veracity during the Boston Marathon bombing; Professor Sinan Aral and his MIT Sloan team have since conducted a comprehensive study covering Twitter data from 2006 to 2017. With input from behavioral economists and political scientists, the study examines the dynamics of false news, comparing it to true news and revealing that false information spreads further, faster, and more broadly.
The influence of social media in amplifying false news is discussed in the context of how new technologies impact the dissemination of information. The spread of false political news is highlighted as particularly pervasive, prompting discussions on the roles of human actors versus bots in this phenomenon. Although bots do play a role in distributing misinformation, human behavior is identified as the predominant factor in the spread of false news.
Main takeaways from the video:
- False news spreads farther, faster, deeper, and more broadly than the truth on Twitter, with false political news the most viral category.
- Novelty helps explain the gap: false news is measurably more novel than true news, and any given individual is about 70% more likely to retweet it.
- Humans, not bots, are primarily responsible for the differential spread; bots spread true and false news at roughly the same rate.
- Potential remedies include labeling news, changing advertising incentives, and adjusting platform algorithms, though all of these require more research.
Please remember to turn on the CC button to view the subtitles.
Key Vocabularies and Common Phrases:
1. salacious [səˈleɪʃəs] - (adjective) - Having an excessive or inappropriate interest in sexual matters. - Synonyms: (lewd, lascivious, obscene)
Today we're talking about fake news, a phenomenon that's evolved from a salacious Internet sideshow to a serious threat to our electoral system.
2. veracity [vəˈræsəti] - (noun) - Conformity to facts; accuracy or truthfulness. - Synonyms: (truthfulness, accuracy, authenticity)
The way that this all started was that my colleagues and I were doing research about veracity and veracity detection in social media.
3. spurt [spɜːrt] - (noun) - A short, sudden burst of activity or increase. - Synonyms: (burst, surge, flurry)
It comes in spurts and fits, it goes up, it goes down, but the trend is generally increasing.
4. novelty [ˈnɒvəlti] - (noun) - The quality of being new, original, or unusual. - Synonyms: (originality, freshness, uniqueness)
So the next hypothesis that we looked at was what we call a novelty hypothesis.
5. cascades [kæˈskeɪdz] - (noun) - Chains of shares through which information flows from person to person, likened to a waterfall. - Synonyms: (sequence, series, flow)
Then we worked backwards and went to Twitter and found instances of people talking about these stories, pointing to them or mentioning them in their tweets. And we recreated one by one the retweet cascades of each diffusion event of a story spreading from person to person.
6. corroborate [kəˈrɒbəreɪt] - (verb) - To confirm or give support to a statement or idea. - Synonyms: (confirm, verify, substantiate)
So the surprise element corroborates the novelty hypothesis, but the disgust element, there's something uniquely human about being disgusted by something that you're reading online.
7. algorithm [ˈælgəˌrɪðəm] - (noun) - A process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer. - Synonyms: (procedure, formula, method)
The thing that makes me super excited is that we have really figured out algorithmically how to amplify human cognitive bias.
8. vetted [ˈvɛtɪd] - (verb, past participle) - Having been thoroughly checked or reviewed for accuracy and reliability. - Synonyms: (examined, scrutinized, verified)
And it's very easy to simulate things that look just as plausible as what used to be considered, you know, vetted information.
9. paralyze [ˈpærəˌlaɪz] - (verb) - To render unable to move or function effectively. - Synonyms: (immobilize, freeze, cripple)
And I do think that if we allow our society to be consumed by falsity, it's going to paralyze us in a number of ways that we really cannot comprehend currently.
10. disincentive [ˌdɪsɪnˈsɛntɪv] - (noun) - A factor, especially a financial disadvantage or negative consequence, that discourages a certain action. - Synonyms: (deterrent, discouragement, counter-incentive)
Well, if it's currently incentivized, then can it be disincentivized? Can we create disincentives for creating and spreading false news? For instance, can we demote accounts or stories that are determined to be false or less than credible? Can we reduce the flow of advertising dollars in that direction? For instance, Procter and Gamble last summer, you know, Mark Pritchard, who's the CMO of Procter and Gamble, had said, look, there's not enough transparency in online advertising
MIT Sloan Experts Series – Sinan Aral - The Truth About Fake News
Welcome to the latest edition of the MIT Sloan Expert Series, which provides an inside look at some of the most exciting new ideas and research coming out of MIT Sloan. I'm your host, Rebecca Knight. Today we're talking about fake news, a phenomenon that's evolved from a salacious Internet sideshow to a serious threat to our electoral system. Joining me to talk about that today is Sinan Aral. He's a professor here at MIT Sloan, and his new research on the topic is published in the latest issue of Science. Thanks so much for joining us. Pleasure to be here. Before we begin, I want to remind you, our viewers, that you too can join the conversation. Please use the hashtag #MITSloanExperts on Twitter to chime in.
Let's get started. So, Sinan, the term fake news has really become an all-purpose smear that's used by pundits and politicians, including our president, to describe journalism they don't like. What got you interested in studying this from a research perspective? Yeah, so in fact, we started studying this phenomenon before it was the media rage, two years ago or even more than two years ago. We worked directly with Twitter to get access to the full Twitter firehose and the historical record of tweets so that we could really comprehensively and for the first time analyze the spread of false news online.
The way that this all started was that my colleagues and I were doing research about veracity and veracity detection in social media. And this was right around the time of the Boston Marathon bombing here at MIT. For those of us who were in and around MIT for the Boston Marathon bombing, when the bombing happened, there was a lockdown at MIT. People were asked to stay inside and were supposed to stay where they were. And it was very difficult to get news about what was happening and where it was safe to go and where the authorities were directing people or investigating and so on. And so, as we do today, we turned to social media to try and understand what was happening. And our PhD student at the time, Soroush Vosoughi, who's the lead author of this study, was interested in trying to understand: what of what I'm hearing on Twitter is actually real, and what of it is misinformation or false information or essentially wrong? And how could I understand what's happening and know where it is safe on campus?
If you remember, tragically, Sean Collier, a police officer here at MIT, was shot and killed during this event. So it wasn't safe necessarily to be on campus, and people wanted to know where to go. When Soroush saw that some of this news was not really accurate, it started all of us thinking about, well, how does this type of news spread, and what are the potential consequences of that? And this happened a long time before fake news was a thing in the media. Fast forward to today, and fake news is a term you can't get through the morning news without hearing. It was named the word of the year by Collins Dictionary this year. Absolutely. So it makes perfect sense that you would see a number of different uses of this term in a number of different settings.
And so, as we describe in the paper, the term fake news has evolved over time. Whereas initially it might have meant stories that are false or faked intentionally, now it is also used as a political strategy. Labeling something a politician may not like as fake news is a strategy for dampening the value or the effect of that information. And so now the term fake news has a lot of different meanings to a lot of different people. So in the paper, we stay away from the term fake news because it's been so popularized and publicized and politicized, frankly. And we focus on the term false news, which refers to news that has been verified to be true or false by some independent fact-checking organization.
So in terms of the false news, I mean, fake news, false news, we actually don't know that much about it. For behavioral economists and political scientists, this really is the topic du jour. And we don't know who's reading what, or whether targeted fact-checking efforts are actually having an impact. I mean, where do you think that we're going with this? Well, frankly, we're at the very beginning. We're at ground zero for understanding how false news operates, how it's produced, how it spreads. It's also a moving target. The way that it's produced, the way it's spreading, is changing day to day. So we have a very difficult task ahead of us.
But you're correct in saying that until now, most of the science has largely been studies of single news stories spreading. For instance, a news story about the Boston Marathon bombing or the discovery of the Higgs boson or the Haitian earthquake, or it's been studies of very small samples of Twitter posts or Facebook posts and so on. So what we set out to do in this study is to comprehensively analyze all of the false news that has ever spread on Twitter from its inception in 2006 through to 2017, including the presidential election cycle of 2016, where a lot of the attention has focused. To our knowledge, it is the first large-scale, and certainly to date the largest, longitudinal study of the spread of false news online.
I want to unpack those findings, but first I want to talk to Tim O'Reilly. The MIT Sloan Expert Series recently sat down with Tim. He is the founder and CEO of O'Reilly Media and the author of several books, including What's the Future and Why It's Up to Us. Here's what Tim had to say about the problem of false news. Well, I think the fundamental problem that we have today is that we have a medium that we don't fully understand. That is, social media has completely changed the landscape of how people receive information, how they pass it on, how they interpret it. And there are, I think, some pretty big challenges that come from the fact that many of the traditional signifiers that we've used to, you know, connote authoritative, vetted information are absent.
And it's very easy to simulate things that look just as plausible as what used to be considered, you know, vetted information. And so I think that the issue of scale and speed with which misinformation spreads is the really key problem. But of course, this is not entirely new. It's the incentives in our media system that have changed. You know, it used to be that we had this base of local subscribers, and you had to serve that base, and you built a reputation with them. And now it's basically almost entirely ad-based media. And ad-based media chases the clicks, chases the attention. And that's always been the case on the ad side: it panders to sensationalism.
We have really come to understand what cognitive buttons to push. And there's a whole science in Silicon Valley of how do you manipulate people, how do you get them to do things that they don't want to do? I mean, we have basically figured out how to hack people's brains for profit. And now people are starting to say, well, you don't just have to do it for profit, you might do it for political ends, for example. And our traditional media needs to recognize the self-destructive behavior. Our politicians need to recognize the self-destructive spiral they're in of algorithmically trying to purvey false stories to get elected, trying to manipulate electoral maps so they'll get elected, rather than going back to ask: what were we really trying to do? We were trying to have a real debate about ideas and to persuade people about the truth.
When did we forget that? That was Tim O'Reilly of O'Reilly Media. We're going to hear from him again a little later in the program. Sinan, you're the author of a forthcoming book, The Hype Machine, about how social media is upending our democracy. What's your biggest concern with regard to fake news? Yeah, so I've been studying how social media is affecting our society for the last 15 years or so. And this book is an attempt to synthesize the scientific findings about how social media is affecting our democracy, our businesses, our politics, and even our public health.
And so the purpose of the book is to explain how social media has the potential to create tremendous promise for our society as well as the potential to create tremendous peril and how we can navigate towards the promise and avoid the peril. It will talk about fake news, Russian meddling, but it will also talk about how we influence each other's dietary habits, how we influence each other's exercise behaviors, how social media advertising works or doesn't work in a comprehensive way to try and understand how social media is affecting society.
So I want to talk now about this study. What exactly were you measuring? What was the data you were looking at? So we studied all of the true and false verified stories that ever spread on Twitter from 2006 to 2017, 11 years of data. We tracked 126,000 stories that were spread by millions of people millions of times over Twitter. And the way we did this was we went to six independent fact-checking organizations like Snopes, PolitiFact, Fact Checker, and so on. And we looked at all of the stories that they had ever investigated as being either true or false. These fact-checking organizations take the time and energy to go through and really find out whether a story is true or false.
And it's humans who are doing the fact checking. Absolutely. This is, in a sense, investigative work by teams of human beings trying to understand what's true and what's false. And luckily, from our perspective, these organizations agreed 95 to 98 percent of the time on their labels of whether something was true or false. So we had a very nice corpus of 126,000 cascades that were labeled as true or false by these independent fact-checking organizations. Then we worked backwards and went to Twitter and found instances of people talking about these stories, pointing to them or mentioning them in their tweets. And we recreated one by one the retweet cascades of each diffusion event of a story spreading from person to person to person on Twitter.
So for a given story, if I was to tweet it and you were to retweet it, and then your friend was to retweet it, we would be able to see that cascade timestamped second by second as that information was moving from one person to the next to the next. So for each of these 126,000 cascades, we created these time-lapse data sets of who they were touching, how they were being spread, and so on. Then we used data analytic tools to measure how far, how fast, how broadly each of these stories was diffusing. And then we began to compare. Well, how does the truth diffuse differently than falsity? How does a false tweet look different in the sociotechnical system of Twitter in terms of who sees it, how often they see it, how quickly the information spreads, how many people see it, and so forth?
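As a rough illustration of what measuring a single cascade can look like (a minimal sketch with made-up records, not the study's actual pipeline), the snippet below rebuilds one hypothetical retweet cascade from timestamped records and computes its size, depth, and duration:

```python
# Illustrative sketch (hypothetical data): given timestamped retweet records,
# rebuild one cascade and measure how many tweets it contains, how deep the
# chain of retweets goes, and how long it took to unfold.
from datetime import datetime

# Hypothetical records: (tweet_id, parent_id, timestamp). parent_id is None
# for the original tweet; each retweet points to the tweet it retweeted.
records = [
    ("t0", None, datetime(2017, 1, 1, 12, 0)),
    ("t1", "t0", datetime(2017, 1, 1, 12, 5)),
    ("t2", "t0", datetime(2017, 1, 1, 12, 9)),
    ("t3", "t1", datetime(2017, 1, 1, 12, 30)),
]

parents = {tid: pid for tid, pid, _ in records}
times = {tid: ts for tid, _, ts in records}

def depth(tid):
    """Number of retweet hops between this tweet and the original tweet."""
    d = 0
    while parents[tid] is not None:
        tid = parents[tid]
        d += 1
    return d

size = len(records)                                    # tweets in the cascade
max_depth = max(depth(tid) for tid in parents)         # deepest retweet chain
duration = max(times.values()) - min(times.values())   # time to unfold

print(f"size={size}, depth={max_depth}, duration={duration}")
```

Repeating this kind of measurement over every cascade, and comparing the distributions for true versus false stories, is the basic shape of the analysis described above.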
And what did you find? Well, frankly, the results were disturbing and at times very surprising. So the primary result is that false news travels farther, faster, deeper, and more broadly than the truth in every category of information on Twitter by an order of magnitude in some cases. So it's not even close in terms of how much faster and farther false information is traveling in social media compared to the truth. And why do lies travel so much more quickly than the truth?
Well, in addition to looking at all of the different information, we also broke this down by category. So for instance, we looked at false and true news, or any kind of news, that was related to terrorism or natural disasters or politics, business, urban legends, science, entertainment, and so on. And what we found, for instance, was that false news is increasing over time. So as you look at the temporal progression of false news, every day, every year, there's progressively more false news. It comes in spurts and fits, it goes up, it goes down, but the trend is generally increasing. We also found that false political news traveled farther, faster, deeper and more broadly than any other type of false news, which gives us some insight into the why, because the topics matter. When we're talking about politics, the false news is traveling much farther and much faster.
So we wanted to understand what was driving these results. And the first thing that came to our mind, and one might suspect, for instance, is that the characteristics of the people involved in spreading the news are predictive of which news is going to travel farther, faster, deeper, and broader. Right. Perhaps these people are verified Twitter users or influential people anyway, they're pundits. They've got a big following. Exactly. So that was the first hypothesis that we had, that maybe the people spreading it are somehow different, and that favors the spread of false news. So we looked at how many followers a Twitter user had, how many people they followed, whether they were verified, how long they'd been on Twitter, and how active they were on Twitter.
And completely to our surprise, and contrary to what one might expect, all of these things favored the spread of true news, not false news. So, in other words, people who spread false news had fewer followers, followed fewer people, were less active on Twitter, were less often verified, and had been on Twitter for less time than people spreading the truth. So the conclusion from that analysis was that false news was spreading farther, faster, deeper, and more broadly despite these characteristics, not because of them. So at that point, we were confused, because if these things don't explain the spread of false news, then what does?
So the next hypothesis that we looked at was what we call a novelty hypothesis, which is. Explain to our viewers what that means. Yeah. So essentially, if you look to information theory or Bayesian decision theory, what you find in that literature is that there's lots of evidence that novelty attracts human attention. It's something that we're more likely to value because it explains more about the world that we don't know. Typically, it's new, it's surprising, it's different. And people who share novelty are thought to be in the know or to have access to inside information. So sharing information that other people don't know about increases their social status; it gives them some personal currency about what they know and what kind of expertise they might have that others don't have.
So we thought, well, maybe novelty is a potential explanation for what's driving the spread of false news. So we built some models that predicted the likelihood of retweeting a piece of information, given its characteristics. And what we found was that, controlling for how many followers and followees you have, whether you're verified, your activity level, and how long you've been on Twitter, false news was 70% more likely to be retweeted by any given individual than true news. And then we measured how novel the false news and the true news were compared to everything you had seen on Twitter in the 60 days prior to receiving one of these true or false news tweets. And what we found was that across three different independent measures, false news was dramatically more novel than true news. Which makes some sense, because when you're unconstrained by reality, you can pretty much make up anything you want. And it's easy to be surprising and novel when you don't have to be factually accurate in what you're saying. So we found that people were more likely to retweet novel information, and false news was overwhelmingly more novel than the truth.
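To make the idea of a novelty measure concrete, here is a minimal, self-contained sketch. The bag-of-words cosine-distance measure and the sample tweets are illustrative assumptions, not the measures used in the paper; the point is simply to show how a new tweet can be scored against everything a user saw in a prior window:

```python
# Sketch of one possible novelty score: cosine distance between the words of
# a candidate tweet and the words of everything the user recently saw.
import math
from collections import Counter

def cosine_distance(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    if na == 0 or nb == 0:
        return 1.0
    return 1.0 - dot / (na * nb)

# Hypothetical 60-day history of tweets the user has seen.
recent_tweets = [
    "election results announced tonight",
    "new poll shows tight race in election",
]
candidate_tweet = "shocking secret aliens rigged the election"

history = Counter(w for t in recent_tweets for w in t.split())
novelty = cosine_distance(Counter(candidate_tweet.split()), history)
print(f"novelty score: {novelty:.2f}")  # closer to 1.0 means more novel
```

A novelty score like this could then sit alongside the user-level covariates (followers, verification, account age, activity) in a model predicting whether a given tweet gets retweeted.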
And that false news was 70% more likely to be retweeted than true news. So false news, as you said, spreads faster, further, and significantly more broadly than true news. And then one of your other key findings is about who or what is actually spreading the news. Bots get a lot of the credit for doing this, but you actually found something different. Yeah. So in fact, when we were looking at the novelty hypothesis, we also wanted to examine replies to the false and true tweets. And what we found there was essentially that in the replies to a true or false tweet, people were expressing very different emotions about that tweet. So when they encountered a false news tweet, they would express surprise and disgust significantly more than they would in response to true news.
Especially after the 2016 election, that does not surprise you. Exactly. And so the surprise element corroborates the novelty hypothesis, but the disgust element, there's something uniquely human about being disgusted by something that you're reading online. Obviously, bots can also fake being disgusted. But that got us thinking about, well, there's so much talk about the role of automated robots in spreading this kind of information online. We have one of the first comprehensive data sets of this type of information spreading. Maybe we could analyze how responsible bots actually were for spreading this kind of information. So we used two different bot detection algorithms, state-of-the-art algorithms, to identify what was bot activity and what wasn't.
And then we simply took the bot activity out and reanalyzed the data, put it back in and reanalyzed the data. And we did this in a number of different ways to try and make sure that our results were robust. And what we found was that bots were spreading true and false news at approximately the same rate. So bots could not explain the great difference in how far and fast falsity was spreading compared to the truth. They were responsible for spreading both false news and true news, but they were not responsible for the difference in the speed with which falsity was overtaking social media compared to the truth. So we can't be blaming the bots. Bots are a problem for false news, but they do not explain why false news is so much quicker and more overwhelming on social media compared to true news.
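A toy version of that robustness check might look like the following. The cascades, bot scores, and threshold are all hypothetical; a real analysis would use trained bot-detection models like the ones mentioned above. The idea is simply to recompute the spread statistics with and without tweets from suspected bot accounts and see whether the true/false gap persists:

```python
# Toy robustness check (hypothetical data, not the study's code): remove
# retweets from accounts a bot detector flags as likely bots, then recompute
# average cascade size for false vs. true stories.
cascades = [
    {"label": "false", "retweeters": ["a", "b", "bot1", "c"]},
    {"label": "false", "retweeters": ["d", "bot2"]},
    {"label": "true",  "retweeters": ["e", "bot1"]},
]
# Hypothetical bot-likelihood scores from some detection algorithm (0 to 1).
bot_scores = {"a": 0.1, "b": 0.2, "bot1": 0.9, "c": 0.3,
              "d": 0.2, "bot2": 0.95, "e": 0.1}

def mean_cascade_size(label, drop_bots, threshold=0.5):
    sizes = []
    for c in cascades:
        if c["label"] != label:
            continue
        users = [u for u in c["retweeters"]
                 if not (drop_bots and bot_scores[u] >= threshold)]
        sizes.append(len(users))
    return sum(sizes) / len(sizes)

for drop in (False, True):
    print("bots removed:" if drop else "bots included:",
          "false =", mean_cascade_size("false", drop),
          "true =", mean_cascade_size("true", drop))
```

If the false/true difference survives after dropping suspected bot activity, as the study reports, the differential spread cannot be attributed to bots.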
What that means is that it's human beings who are largely responsible for the spread of false news, not automated robots. So now I want to talk about some potential solutions. We're going to hear again from Tim O'Reilly, the CEO and founder of O'Reilly Media. Here's what he had to say. I believe that there are many, many layers of solution to any human problem. There's no magic bullet. I think that there are business model changes that news organizations need to make to reinforce their incentive to provide valuable information. I think there's a lot of training that we should be giving our children and ourselves in how to detect ways that we are being manipulated. But I think it's going to be a real challenge because so much of our society is based on manipulation.
Now, the thing that makes me super excited is that we have really figured out algorithmically how to amplify human cognitive bias. Could we use the algorithm instead to go the other way? What you have to do in order to build an effective algorithm is to understand the way the system you're trying to model works. When we understand better how human psychology works, we can actually make people smarter. We can make an algorithm that encourages people to be more skeptical instead of to be more credulous. We can make an algorithm that makes people respond better to things that seem to have more social value.
But what we haven't done is to step back and say, well, could we do the opposite? Could we figure out how to damp down or compensate for cognitive bias? And there are really some simple things. A good example of, you know, of something that you do in your own psychology if you're a skeptical person, is you hear some new information and it seems unlikely, and you stop and you pause to verify it. Well, we could train our machines, our algorithms, to do the same thing. Oh, here's a new piece of information. It seems unlikely. It's contrary to the general consensus.
So pause before you pass it on. It's not that the algorithm is deciding what's true. It's that humans are identifying a bunch of consistent features in the data that make it likely that something is true or false or ought to be looked at. So think back to the early airplanes, back around the turn of the 20th century, and they weren't able to fly very far. They crashed a lot. And over the period of decades, they became much more reliable. We understood the aerodynamics better. We were able to build much better designs.
I think the same thing happens here. We can get better at this. And it's really, I think, fundamentally a matter of choosing what we want to get better at. Do we want to get better at extracting more money from people, or do we want to get better at building a balanced media ecosystem? That was Tim O'Reilly. So, Sinan, he brought up a lot of interesting points. According to top intelligence officials, Russia is meddling once again in our elections. Where do we go from here? Do you see any policy interventions, anything the media can do themselves, to improve this?
Well, you know, let me begin by saying the potential catastrophic consequences of the spread of falsity are dramatic. For instance, they can lead to misallocation of resources during a terrorist attack. They can lead first responders to the wrong building where there are no people in a natural disaster. There's a story about a false tweet during President Obama's presidency where a false tweet claimed that he was injured in an explosion and it wiped out $130 billion of equity value in a single day. So the consequences of this stuff are real and they are dramatic.
So we have to think about what are the platform design, policy, regulatory, and other types of interventions that can help us deal with the kind of catastrophic consequences that false news could potentially engender. So the first thing I would say about how to move forward is that we do not know enough about what's going on. We have to do more research about how false news spreads. Why do people spread false news? How are bots involved or not involved? As I said earlier, bots cannot explain the differential spread of truth and falsity. But that doesn't mean that malicious human actors, troll farms, and so on are not spreading false news. We have to understand the origins of these types of things.
So much more research is needed, both on the phenomenon and what's happening, how it's spreading, but also on possible interventions to curtail the spread of false news. And there has been some work that's ongoing. Most of it is in working paper form. So very little of it has been peer reviewed or published. So we're sort of at the ground zero of our understanding of the spread of false news. There are certain categories of intervention that I think might be helpful. So given that human beings are so central to the process of this type of information spreading online, thinking about interventions that are directed at human judgment and decision making seems like a fruitful avenue.
So I'll give you a couple of examples. One example would be to think about labeling news much the same way we label food in terms of its calorie content, its origins, how it was produced. We get none of that when we read a news article. But when I go buy something at the grocery store, I can read exactly how it was produced and how many calories it has and how much saturated fat and so on. I'm already picturing one news source being All-Bran and others being Cheetos. Exactly. One is better for you than the other, right?
And that kind of labeling, given that human beings are making decisions about how to think about what they're reading, whether to share it and so on, could be very valuable in aiding the decisions of human beings and trying to allow them to make smarter choices about what information they consume and what information they share. Much the same way as we try to let people make smarter decisions about what food they consume and share. So it's a signal of quality and veracity. It's a signal of quality and veracity.
It's by no means a silver bullet, because a natural question is, who determines what's true and false? Who determines the quality of this information? How can we come to a societal consensus? None of those questions have been answered. So I'm not claiming in any way that labeling is somehow an easy silver bullet solution, but it's one potential avenue that we need more research on. Another important lever is incentives. So we know now that a lot of the fake news that spread was spread for monetary, financial gain, not for any political purpose.
There were people in Macedonia and in other places that would create farms of fake news, fake news factories, in order to get advertising revenue. Because fake news spreads farther, faster, deeper, more broadly, reaches more eyeballs online. People get paid more for advertising next to that content. So it's a very profitable enterprise. Well, if it's currently incentivized, then can it be disincentivized? Can we create disincentives for creating and spreading false news? For instance, can we demote accounts or stories that are determined to be false or less than credible? Can we reduce the flow of advertising dollars in that direction?
For instance, Procter and Gamble last summer, you know, Mark Pritchard, who's the CMO of Procter and Gamble, had said, look, there's not enough transparency in online advertising. We don't know who's watching the ads. We don't know where they're appearing. Are they appearing next to false news? Are they appearing next to questionable content? Well, platforms could crack down on how that advertising works to create a disincentive for spreading falsity online. I also feel that algorithms could be an important part of it, similar to what Tim said.
I think Tim is moving in the right direction with his thoughts there. Certainly we have an incredible amount of data about how people consume and share news online. In some sense, I feel like we're at the brink of a revolution in our understanding of human behavior because of the dramatic amount of data that we have at population scale, about hundreds of millions of people, even billions of people, interacting second by second, timestamped to the minute.
And we can use that data to better understand how the behaviors operate and then how we can tweak and adjust our algorithms in order to dampen the spread of false news, to promote it less often, to encourage people to share true news more often, and so on. So I think algorithms are another important piece of it. In all of these, we need more research. So for instance, in terms of the labeling, I know of at least two studies that provide conflicting evidence about whether labeling works. One study that I know of says that labeling something as being fact checked to be false dampens its spread online.
I know of another study that Facebook did, which they wrote about on their data science blog, claiming that when they labeled false news at Facebook, it actually increased the spread of that information. So it's sort of the wild, wild west in terms of policing this stuff. And I believe we need to know a lot more in order to develop a firm understanding of how we can solve the problem. I'll also add that it's a moving target. As soon as we develop some sort of defense, there will be countermeasures and so on and so forth. This is not going to be won or lost in a certain timeframe. It's going to be a continuous struggle to combat the spread of falsity.
And I do think that if we allow our society to be consumed by falsity, it's going to paralyze us in a number of ways that we really cannot comprehend currently. Well, Sinan, I wish we were ending on a more positive note, but thank you so much for joining us. It's been a really fascinating conversation. Thanks for having me. And thank you for joining us on this edition of the MIT Sloan Expert Series. We hope you join the conversation on Twitter using the hashtag #MITSloanExperts. I'm Rebecca Knight. We'll see you next time.
Technology, Education, Innovation, Fake News, Social Media, Research, MIT Sloan School of Management