Are CNN and Fox News Really Biased? A Machine Learning Study.

Dhruv Mangtani
Published in The Startup · 5 min read · Jul 22, 2020


According to statistics, the answer is no, and here’s why.

About a year ago, I was scrolling through Twitter and reading some replies to a CNN tweet. As you would expect, they were full of people calling the media conglomerate “fake news” and picking apart the way CNN chose to frame that specific story. And to some extent, you really can’t dismiss their claims. Perhaps Fox News, CNN, and the “mainstream media” don’t purposefully publish outright fake information, but that isn’t a great standard to hold them to. When millions of Americans vote based on what they read online, the media is partially responsible for popular representation in our democracy. When exaggerated clickbait drives the headlines of even the most prestigious news organizations (e.g. the New York Times), we are right to tweet back at them.

So this raises the question: are CNN and Fox News lying to me when they say an article is not an editorial/op-ed? Who can I actually trust for reliable information, and through what medium?

A Look at the BLUFFNet Algorithm

Before I go into my results, I want to give you a sense of how they were produced. At the core of any machine learning algorithm or statistical analysis is data, with the caveat that the data usually must be annotated. In Natural Language Processing terms, the machine learning task at hand is known as Subjectivity Analysis (similar to Sentiment Analysis): identifying sentences that represent the author’s opinion. Unfortunately, labelled data specific to news articles is quite scarce for this task; the best dataset I could find is the MPQA corpus, containing a total of ~11,000 sentences from 692 articles.

These sentences are converted into vectors of numbers, which are then fed into a machine learning model: a neural network. Neural networks have the benefit of being able to learn complex patterns from complex data types such as natural language. The numbers in these vectors represent the grammatical structure of a sentence and flag relevant keywords such as “amazing”, “evil”, or “factual”. By learning patterns from the MPQA sentences through these vectors, the network is well equipped to classify unseen sentences as subjective or objective.
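To make the idea of turning sentences into vectors and learning from them concrete, here is a minimal sketch. It is not BLUFFNet: it swaps the neural network for a plain logistic regression over word counts, and the six training sentences are hypothetical stand-ins for the MPQA corpus. All function names are made up for illustration.

```python
import math
import re

# Hypothetical toy data standing in for MPQA; 1 = subjective, 0 = objective.
TRAIN = [
    ("The senator gave an amazing speech", 1),
    ("This evil policy is a disaster", 1),
    ("The plan is a terrible idea", 1),
    ("The senator voted on the bill on Tuesday", 0),
    ("The policy takes effect in March", 0),
    ("The committee published the report", 0),
]

def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def build_vocab(sentences):
    """Assign each distinct word an index in the vector."""
    vocab = {}
    for sentence in sentences:
        for token in tokenize(sentence):
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(sentence, vocab):
    """Turn a sentence into a vector of word counts (unknown words are ignored)."""
    vec = [0.0] * len(vocab)
    for token in tokenize(sentence):
        if token in vocab:
            vec[vocab[token]] += 1.0
    return vec

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, vocab, epochs=300, lr=0.1):
    """Fit logistic regression with stochastic gradient descent."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for sentence, label in data:
            x = vectorize(sentence, vocab)
            pred = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = pred - label
            bias -= lr * err
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias

def subjectivity(sentence, vocab, weights, bias):
    """Return the estimated probability that a sentence is subjective."""
    x = vectorize(sentence, vocab)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

VOCAB = build_vocab(s for s, _ in TRAIN)
WEIGHTS, BIAS = train(TRAIN, VOCAB)
```

Calling `subjectivity("The plan is a terrible idea", VOCAB, WEIGHTS, BIAS)` gives a value above 0.5 for this toy model. A real system would need far more data and richer features than word counts, which is where the neural network comes in.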

Now we can determine whether a sentence in a news article is biased, but we still need to find out how biased the entire article is. One way to do this is by taking the average of all the sentences’ biases, but even biased news articles tend to be mostly filled with facts and quotes, so the raw average understates how opinionated an article is. A better way is to scale the averages according to how similar they are to objective vs opinionated articles. Since we are asking whether non-editorial articles are biased, it wouldn’t make sense to use news articles for the objective sources. Instead, we take a corpus of Wikipedia articles and compute their average biases. We also do the same for a corpus of editorials. This creates some baselines: a random article will likely fall somewhere in between these two extremes. The average bias is scaled between the two using a Logistic Regression model, and the result is the final bias score. If you want to learn more about the algorithm, BLUFFNet, feel free to read the paper preprint here.
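One way to picture the scaling step is the sketch below. It is my reading of the description above, not the exact BLUFFNet code: a one-feature logistic regression is fit on average sentence biases, with Wikipedia articles labelled 0 and editorials labelled 1, and its output is stretched to a 0 to 100 score. The baseline numbers, the 0 to 100 range, and every name here are assumptions for illustration.

```python
import math
from statistics import mean

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def article_average(sentence_scores):
    """Average the per-sentence subjectivity scores of one article."""
    return mean(sentence_scores)

def fit_scaler(objective_avgs, opinion_avgs, epochs=5000, lr=1.0):
    """Fit 1-D logistic regression by batch gradient descent.

    Objective baselines get label 0 and opinionated baselines get label 1.
    """
    data = [(x, 0.0) for x in objective_avgs] + [(x, 1.0) for x in opinion_avgs]
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in data) / len(data)
        grad_b = sum(sigmoid(w * x + b) - y for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def bias_score(sentence_scores, w, b):
    """Scale an article's average bias to 0-100 between the two baselines."""
    return 100.0 * sigmoid(w * article_average(sentence_scores) + b)

# Hypothetical average sentence biases for the two baseline corpora.
WIKIPEDIA_AVGS = [0.10, 0.12, 0.15, 0.18, 0.20]
EDITORIAL_AVGS = [0.35, 0.40, 0.42, 0.45, 0.50]
W, B = fit_scaler(WIKIPEDIA_AVGS, EDITORIAL_AVGS)
```

With these made-up baselines, an article whose sentences average 0.12 lands below 50 and one averaging 0.45 lands above it, even though both raw averages are well under one half.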

Now, in order to judge a news source as a whole, we simply take the average of its articles’ bias scores. I recently developed a Chrome Extension which labels subjective articles on the Google Search page (you can find it here). In order to pre-evaluate articles, I created a web crawler that reads RSS feeds and scrapes the Google News website. I was able to use my database of 16,000 classified articles to gather samples for this study.
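The per-source average is simple enough to sketch in a few lines. The source names and scores below are hypothetical placeholders, not rows from the actual database.

```python
from collections import defaultdict
from statistics import mean

def source_scores(articles):
    """Map each source to the mean bias score of its articles.

    `articles` is an iterable of (source, bias_score) pairs.
    """
    by_source = defaultdict(list)
    for source, score in articles:
        by_source[source].append(score)
    return {source: mean(scores) for source, scores in by_source.items()}

# Hypothetical classified articles.
ARTICLES = [
    ("source_a", 20.0), ("source_a", 30.0), ("source_a", 40.0),
    ("source_b", 60.0), ("source_b", 80.0),
]

# Sources ordered from least to most biased.
RANKING = sorted(source_scores(ARTICLES).items(), key=lambda kv: kv[1])
```

Here `RANKING` puts `source_a` (mean 30.0) ahead of `source_b` (mean 70.0).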

What do Americans think about media bias?

According to a Gallup/Knight Foundation poll of Republicans and Democrats, Fox News and Breitbart tie as the most biased, with MSNBC, HuffPost, and CNN trailing behind. On the other hand, PBS, APNews, NPR, and WSJ are considered the least biased.

So, for the moment of truth, where does everyone really rank?

[Figure: results of the news bias poll compared with BLUFFNet’s judgement of news bias, split into quadrants]

There’s a lot to unpack here, so I’ll summarize some of my key findings. According to BLUFFNet, the two least biased sources are CBS News and NPR, and the most biased sources are Vox and HuffPost. When we discount op-eds and editorials, CNN and Fox News are the third and fourth least biased news sources, respectively. Interestingly, Americans think of the two organizations as being among the most biased. The truth is that when they say they are reporting facts objectively, they usually are.

While many claim that the New York Times has become increasingly biased over the past few years (just take a look at Bari Weiss’ resignation letter), it still remains in the safe zone of a bias score under 50, falling into the fourth quadrant. On the other hand, the Washington Post tends to write more biased articles than Breitbart, which is very surprising. One (possibly post-hoc) explanation for this phenomenon is that Breitbart’s far-right articles are more extreme than the Washington Post’s left-leaning articles but represent a smaller proportion of its website.

Most of the rating pairs (machine and human) agree with each other, except for a few anomalies. This trend indicates that Americans are very aware of the current state of news and have an accurate understanding of which sources to trust. The ones that don’t agree fall into the fourth quadrant: The New York Times, NBC News, CNN, Fox News, and to some extent, Breitbart. These sites are considered highly biased by the American public, but really aren’t (OK, I’m also a little confused about Breitbart, but it isn’t totally in the quadrant). Conversely, no website that the public considered objective was found to be subjective. Interesting. In machine learning terms, with “biased” as the positive class and BLUFFNet’s judgement as the reference, human classification of news sources has high recall, but low precision.
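To spell out the precision and recall remark, here is a small sketch that treats the machine’s labels as the reference and the public’s labels as predictions. The source names are hypothetical placeholders rather than the outlets in the chart.

```python
def precision_recall(human_biased, machine_biased):
    """Compare two sets of sources flagged as biased.

    The machine's set is treated as ground truth; the human set is the
    prediction being evaluated.
    """
    true_pos = len(human_biased & machine_biased)
    false_pos = len(human_biased - machine_biased)  # humans say biased, machine disagrees
    false_neg = len(machine_biased - human_biased)  # machine says biased, humans missed it
    precision = true_pos / (true_pos + false_pos) if human_biased else 0.0
    recall = true_pos / (true_pos + false_neg) if machine_biased else 0.0
    return precision, recall

# Hypothetical labels: the public flags four sources, the machine only two.
HUMAN = {"source_a", "source_b", "source_c", "source_d"}
MACHINE = {"source_a", "source_b"}
PRECISION, RECALL = precision_recall(HUMAN, MACHINE)
```

In this toy case recall is 1.0 because every machine-flagged source was also flagged by the public, while precision is 0.5 because half of the public’s flags were not confirmed. That is the same pattern as the fourth quadrant being populated while its opposite is empty.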

Last Thoughts

Personally, I will rely more on sites like CBS, NPR, CNN, and Fox News for keeping up to date with current events. However, I still don’t recommend watching CNN and Fox News on TV, since this study was restricted to what organizations post on their websites.

People simply don’t trust the information they receive from media, whether the medium be news websites, social media, or television. Media bias can be harmful if we are unaware of its effect on our votes as it ultimately skews election results. We have to spread awareness about which organizations are manipulating us simply to improve their bottom lines by creating drama and begging for attention. Americans need to be independent in their votes, yet receptive to new information.


Dhruv Mangtani, 17y/o @ BLUFFNet, SnapGrub, Virtual Rewards, Nudge Debate, Ezi