We are in the midst of a tech revolution. Machines seem to be rapidly replacing and outsmarting humans in every arena: algorithms can now translate foreign languages in real time and help safeguard nuclear reactors; there’s even an API that translates your words into pirate-speak (finally!). Yet this year, more than ever, one surprising limitation of technology will be laid bare.
The biggest game show on earth, the US Election, is nearly upon us. It’s a unique arena, in which grand statements are dispatched thick and fast, but where dishonesty is perceived as so pervasive that a whole industry has formed to separate truth from fiction. For such a simple task, there surely must be an algorithm, right?
Wrong. In recent years, political scientists and tech experts have collaborated in pursuit of what Duke Professor Bill Adair deems the “holy grail”: a fully automated, real-time fact-checking engine that alerts audiences when a political statement is untrue. The Washington Post, for example, launched its Truth Teller service in 2013 with the dream of making “real-time” fact-checking a reality. Yet three years on, the service has all but disappeared, and despite a degree of success in automating the verification of news stories, we still rely on painstaking human labour to ensure the veracity of our political dialogue. But why is this proving so difficult?
As we’ve seen before, technology and APIs have been at the forefront of significant industry disruption. Recent advances in Natural Language Processing, artificial intelligence, and the automation of search could have the power to automate this laborious fact-checking process, bringing us closer and closer to the proposed “holy grail.”
The Process of Fact-Checking
Although the concept of separating “true” from “false” seems simple, in reality it is anything but. In fact, treating the process as a simple binary is a big part of the problem. Political fact-checking entails three distinct processes — extraction, ranking, and verification — each of which poses distinct challenges for those attempting to automate the task.
The first step, extraction, involves identifying what material to check — a task that is notoriously arduous. Dialogue can be difficult to isolate within wider texts, and researchers may need to trawl through a myriad of transcripts and news sources in order to identify checkable statements.
“This is the area where fact checkers lose the most time; many still manually look for news to fact check,” observes Alexios Mantzarlis, director of Poynter’s International Fact-Checking Network.
As such, the first thing an automated system should be capable of is extracting dialogue from larger bodies of text.
To this end, a number of applications may prove fruitful for automated fact-checking. Trooclick, for example, has developed an engine available as an API that extracts reported speech (both direct and indirect) from bodies of text, attributing it to the correct speaker and presenting the results as structured data. The engine can also be used to crawl news and social media platforms across the web, finding all instances of dialogue and delivering it to the user. It thus sources potentially “checkable” material for fact checkers without the laborious search process.
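To make the extraction step concrete, here is a minimal Python sketch of direct-speech extraction. This is not Trooclick’s actual API; it is a toy regex that pulls quoted spans out of news text and attributes each one to the adjacent named speaker, returning structured records.

```python
import re

# Match: "<quote>" said/says <Capitalized Name>
# A real extraction engine also handles indirect speech and coreference;
# this toy pattern covers only the simplest direct-quote construction.
QUOTE_PATTERN = re.compile(
    r'"(?P<quote>[^"]+)"\s*,?\s*(?:said|says)\s+'
    r'(?P<speaker>[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)'
)

def extract_quotes(text):
    """Return structured {'speaker', 'quote'} records for each direct quote."""
    return [
        {"speaker": m.group("speaker"), "quote": m.group("quote").rstrip(" ,")}
        for m in QUOTE_PATTERN.finditer(text)
    ]

article = ('"I never voted against gun control," said Senator Smith. '
           'Critics disagreed. "The record shows otherwise," said Jane Doe.')

for record in extract_quotes(article):
    print(record["speaker"], "->", record["quote"])
```

Fed a crawl of news pages, even this crude pattern yields a stream of attributed, potentially checkable statements, which is the raw input the next stage needs.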
Once material is identified, the next step is to assess whether a chosen claim is objectively verifiable. A straightforward statement such as “the Senator has never voted to alter the second amendment” is most likely polar enough to be checked. On the other hand, a more nuanced comment, such as “the President’s foreign policy decisions have often had a negative economic impact” might pose a problem, as it is unquantifiable or even subjective. Thus, statements need to be sorted according to whether or not they can be empirically proven.
In this sense, perhaps the biggest development to date is ClaimBuster, a system that uses a mixture of human coding and NLP algorithms to rank statements according to two criteria: whether the claim is objectively verifiable, and whether the public has an interest in its truthfulness. The system derives its ranking formula from 20,000 statements drawn from prior presidential debates, each coded by human analysts for public relevance and objective verifiability. It thus isolates factual statements worth validating from non-factual, irrelevant ones.
The system has enjoyed success, with its rankings found to correctly isolate 74% of statements. Used in conjunction with a database of extracted statements, the program could certainly help fact checkers prioritize what is worth checking and save time. However, any expansion of the existing database, and hence of the statements ClaimBuster is capable of ranking, relies on human coding. Thus the system, although innovative, falls short of fully automating the ranking of quotes for fact-checking.
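The ranking idea itself can be sketched in a few lines. To be clear, ClaimBuster learns its scoring function from those thousands of human-coded statements; the hand-picked features and weights below are invented purely to illustrate what a check-worthiness score looks like.

```python
import re

# Toy check-worthiness scorer. The features and weights are invented for
# illustration; a trained system would learn them from coded examples.
def checkworthiness(statement):
    score = 0.0
    if re.search(r"\d", statement):
        score += 0.4  # concrete numbers suggest a verifiable claim
    if re.search(r"\b(voted|signed|fell|increased|decreased|created)\b",
                 statement, re.I):
        score += 0.4  # verifiable past actions
    if re.search(r"\b(I think|I believe|best|worst|great)\b", statement, re.I):
        score -= 0.5  # opinion markers: probably not checkable
    return max(0.0, min(1.0, score + 0.2))  # small baseline, clamped to [0, 1]

claims = [
    "I never voted against gun control.",
    "Unemployment fell by 4 percent under my administration.",
    "I think this is the best country in the world.",
]
ranked = sorted(claims, key=checkworthiness, reverse=True)
```

Running this ranks the unemployment statistic first and the pure opinion last, which is the triage a fact checker wants before spending research time.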
After the lengthy process of identifying and sorting potential statements, the “fact check” itself finally takes place. This requires researchers to gather information from various sources about the issue in question, evaluate the claim against these findings, and put forward a clear argument regarding the veracity of the original statement. According to Bill Adair, this can take anywhere between 15 minutes and two days, depending on the complexity of the claim. End results may look something like this: FactChecking the Seventh GOP Debate.
In order for this to become automated, the content of checked statements would need to be broken down into some sort of industry-standard digital formula, and cross checked against a repository of structured data with a strict query model. While this approach offers a high degree of accuracy, it is extremely limited in scope. Such a database of “checked” claims would take years to expand enough to be used as a reliable source of answers, and thus fails to usurp human research as the best source of answers.
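A toy version of that strict query model might look like the following, with an invented repository of already-checked claims. The narrow coverage of the lookup, returning nothing for any claim it has not seen, is exactly the limitation described above.

```python
# Sketch of a strict query model: claims are normalized into a canonical
# word tuple and looked up in a repository of previously checked claims.
# The repository contents and stop-word list are invented for illustration.
STOP_WORDS = {"the", "a", "an", "has", "have"}

def normalize(claim):
    words = claim.lower().replace(".", "").split()
    return tuple(w for w in words if w not in STOP_WORDS)

CHECKED_CLAIMS = {
    ("senator", "never", "voted", "to", "alter", "second", "amendment"): "true",
}

def lookup(claim):
    """Return the stored verdict, or None if the claim has never been checked."""
    return CHECKED_CLAIMS.get(normalize(claim))
```

`lookup("The Senator has never voted to alter the second amendment.")` finds its verdict, while any paraphrase or unseen claim falls straight through to `None`, and therefore back to a human researcher.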
Could the Mythical Holy Grail be Built Using Various APIs?
Artificial intelligence has come a long way: voice-to-text automation has become normal in human-device interaction, development teams are designing bots to automate work processes in Slack, and marketers are even using tone recognition to better respond to user inputs.
If we aimed to design a fully automated fact-checking process, what sort of technology would that involve? Many existing APIs could contribute to a fact-checking pipeline that is purely machine-driven from start to finish.
Let’s examine tools that could contribute to the backend of such a system. For a checkable statement like “I never voted against gun control”, an automated lookup would be relatively easy, as it is a simple claim with verifiable sources. There are certainly gaps in the process, but a fully automated fact-checking system would likely involve these types of technologies.
- Recognize who is talking: If we wanted a real-time system working in rapid fire debate, automated speaker differentiation would be necessary. With a little setup, something like the Speaker Recognition API by Microsoft’s Project Oxford could be utilized.
- Capture text from recording: Many APIs, such as IBM Watson Speech to Text, can discern speech from audio and convert it into plain text.
Now we have all the statements spoken, attributed to each speaker. But, as mentioned above, not everything spoken is “check-worthy.” Our system would need to discern checkability automatically, improving on something like ClaimBuster, ideally without the vast human-coded database.
Creating bots with the API.ai or Wit.ai APIs is rather simple. These APIs analyze spoken commands to infer actions, types, locations, and objects. They convert spoken requests, such as scheduling a meeting or setting a thermostat to a certain temperature, into code that a machine can then digest and act on. The goal of this step would be to use NLP and semantic relationships to convert a statement into a machine-readable form that could then be queried as either fact or fiction.
- Named Entity Extraction algorithms are able to determine keywords based on already curated databases.
- Relational data along with sentiment analysis: isolating the semantic cues of “Subject-Action-Object” would determine a statement’s meaning and label its objects in a searchable way.
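Those semantic cues can be illustrated with a deliberately simple Subject-Action-Object extractor. Real systems would use full dependency parsing; the pattern below is an assumption-laden toy that handles only short declarative claims of the shape “subject (never) verb object”.

```python
import re

# Toy SAO extractor for simple declarative claims. A production system
# would use dependency parsing; this regex covers one sentence shape.
def extract_sao(statement):
    m = re.match(
        r"(?P<subject>I|He|She|They|The \w+)\s+(?P<neg>never\s+)?"
        r"(?P<action>\w+)\s+(?P<object>.+?)\.?$",
        statement,
    )
    if not m:
        return None
    return {
        "subject": m.group("subject"),
        "action": m.group("action"),
        "object": m.group("object"),
        "negated": bool(m.group("neg")),
    }

print(extract_sao("I never voted against gun control."))
```

The output, a labeled subject, action, object, and negation flag, is precisely the machine-readable form that could be queried against a voting-record database.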
By this point, we assume the machine understands the proposition behind each individual statement and has defined actors that can be queried. This final step involves the actual fact-checking process, which pulls from as many credible sources as possible. But where does “truth” lie? There are many knowledge databases now accessible via API:
- News databases: IBM News, Overview News, and others.
- General knowledge: Google’s custom search API or mediaWiki API
- Government: Govtrack.us API displays votes in the US Senate and House of Representatives.
Searching at the “speed of thought” using something like the Algolia API would be helpful. The system would then compare results across multiple sources to determine a percentage estimate of accuracy per claim.
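That final aggregation step might be sketched as follows, with the source lookups mocked out. In practice each mock would be a call to a real news, knowledge, or voting-record API; the verdict labels and sources here are invented.

```python
# Sketch of cross-source aggregation: query several sources (mocked here),
# collect a verdict from each, and report the share of supporting sources
# as a percentage estimate of the claim's accuracy.
def mock_news_source(claim):
    return "supports"   # stand-in for a news-database API call

def mock_wiki_source(claim):
    return "supports"   # stand-in for a general-knowledge API call

def mock_votes_source(claim):
    return "refutes"    # stand-in for a voting-record API call

SOURCES = [mock_news_source, mock_wiki_source, mock_votes_source]

def estimated_accuracy(claim):
    verdicts = [source(claim) for source in SOURCES]
    return 100.0 * verdicts.count("supports") / len(verdicts)

print(estimated_accuracy("I never voted against gun control."))
```

With two of three mocked sources agreeing, the claim scores roughly 67%, a first machine-only estimate that later human lookups could refine.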
Complex claims would likely blend machine automation and human work. We would need a failsafe against anomalies, and that’s where humans come in: they are best at responding to subjective statements and can naturally detect when something isn’t quite right. As Bill Adair notes, a fact check can take 15 minutes to two days, so with human lookups we are nowhere near real time. Normal fact-checking is an art, but could the process be accelerated by crowdsourced lookups?
Applications can call the FancyHands API to automate human tasks. At a larger scale, Amazon Mechanical Turk could potentially harvest thousands of human lookups in seconds. An integrated crowdsourced fact-checking system would propose a single statement to a pool of verified users, who would then perform the manual legwork, using internal or external resources, to determine validity.
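The crowd-aggregation step could be sketched like this; the quorum and majority thresholds are invented for illustration, not taken from any existing service.

```python
from collections import Counter

# Aggregate crowd verdicts on a single claim: wait for a minimum number of
# lookups, then accept the majority label only if agreement is strong.
def crowd_verdict(votes, quorum=5, majority=0.8):
    if len(votes) < quorum:
        return "pending"            # not enough human lookups yet
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= majority:
        return label                # strong consensus
    return "disputed"               # split crowd: escalate to a professional

print(crowd_verdict(["false"] * 9 + ["true"]))
print(crowd_verdict(["true", "false", "true"]))
```

Claims the crowd cannot settle come back as “disputed”, which is where the professional fact checkers, the failsafe described above, would take over.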
Imagine watching the next party debate. Underneath each contestant is a rating that automatically displays the percentage accuracy of their statements. As data first comes in from machine automation, an estimated accuracy percentage is displayed immediately. As more data arrives from human crowdsourced lookups, the rating would shift from a rough estimate toward an established truth or lie.
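Blending the two signals might look like the following sketch, where the machine’s instant estimate is gradually overridden by crowd verdicts as they arrive. The linear weighting scheme is an assumption made for illustration, not an established method.

```python
# On-screen accuracy rating: start from the machine estimate, then shift
# toward the crowd's verdict as human lookups accumulate (weighting invented).
def blended_accuracy(machine_estimate, human_votes):
    """machine_estimate in [0, 100]; human_votes is a list of 1/0 verdicts."""
    if not human_votes:
        return machine_estimate
    human_estimate = 100.0 * sum(human_votes) / len(human_votes)
    weight = min(1.0, len(human_votes) / 20)  # trust humans more as votes arrive
    return (1 - weight) * machine_estimate + weight * human_estimate

print(blended_accuracy(66.7, []))        # machine-only: the instant estimate
print(blended_accuracy(66.7, [1] * 20))  # fully crowd-backed verdict
```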
A clear rebuttal to the crowdsourced approach is that fact-checking is an art. Even assuming enough people would be interested in participating, you can’t always break the work down into tiny, repeatable tasks that any citizen could complete in 30 seconds. A real-time, purely automated fact-checking system like this is still hypothetical, but perhaps attainable with future algorithmic research, linguistic advances, and a smart mashup of this tech.
Back to Reality
So, on the cusp of the most scrutinized (and perhaps most polemical) election in American history, are we any further forward in our quest for automatic “truth”? In short: not as far as we’d hoped. Current technology might well enable us to identify political dialogue and rank “checkable” statements more quickly than ever. But, for now at least, the inherent nuance of language, as well as the complexity of matching questions and answers digitally, continues to render automated fact-checking an elusive goal.
Mantzarlis conceded recently that fact checkers “may have to put the ‘holy grail’ to one side for now,” as we continue with this more modular approach.
“It’s great to think about it as a whole process, but what fact-checkers need now is small steps to take them a long way.”
What fact-checkers need now are tools that help automate specific steps within the process, and the API space may be primed with the tech and knowledge for such development to occur. With the US election campaign set to dominate newsrooms this year, an international fact-checking conference will be held March 31st at Duke University, North Carolina.