Learning From The Cambridge Analytica Incident

Unless you’ve been hiding under a rock, you have probably read a lot about data privacy in the headlines recently, specifically regarding the Facebook and Cambridge Analytica debacle. As major news outlets have reported, Cambridge Analytica harvested Facebook data through a quiz app that collected personal information from the friends of its users without their knowledge or consent, affecting an estimated 50 million Facebook accounts.

Mark Zuckerberg and Facebook have received “intense scrutiny over privacy and security” concerns, resulting in appearances before the U.S. Congress. While it isn’t a typical “breach” from a technical standpoint, the Cambridge Analytica incident is a legal breach of sorts.

We’re not going to dig into the details of who is responsible, or whether Cambridge Analytica was effective in using Facebook user information to influence the 2016 U.S. election. Instead, we want to look at some larger implications that software providers may want to pay attention to.

“I suggest we have much bigger issues to confront, like revisiting current business models and rethinking data privacy rules.”
Jonathan Albright, Columbia University

This recent public outcry surrounding data privacy presents a unique opportunity to revisit what we can do to keep API access in the hands of those who deserve to have it. What do the recent user data privacy concerns regarding Facebook and Cambridge Analytica mean for the API economy? In this article, we identify 5 potential ripple effects, along with what API providers can do to respond.

5 API Economy Ripple Effects

Web APIs, especially public-facing social media APIs, connect much of the modern web. They also sit at a unique intersection of privacy and data, as they are the mode through which user data is often shared between online services. So where do APIs play a role in the Cambridge Analytica debate? Well, the app used the Facebook Graph API to collate user information.

Large breaches of sensitive user data are nothing new; the Equifax breach of 2017 is one of many in recent years. However, unwarranted use of data by means of the Facebook Graph API could influence technology providers to revisit how they engage with their API ecosystem. People are asking for more data privacy, and EU regulatory initiatives like GDPR are seeking to address this by regulating user consent. With this recent incident now in the foreground, there are some curious ripple effects that API owners should consider.

1) Screening: Greater emphasis will be put on screening new third party apps

Learning from the Cambridge Analytica incident, API providers may want to screen new app submissions with more vigilance. This means more in-depth app developer background checks on those who request API access.

“The story emphasizes the importance of having a real time awareness and response to API consumers at the API management level.”
Kin Lane, API Evangelist

When API providers grant access to an app developer, what is the process by which that developer is screened and approved? Is the agenda behind the app known to the API provider? Before granting API access to third party developer applications, providers should ensure the app meets their platform policy, without relying on the good faith of third party developers.

The third party app approval process varies drastically depending on whether the API is public, free, partner, monetized, or private. In many scenarios, an API key is supplied to the consumer quite readily after requesting access. Speed is often a competitive advantage, since it ensures a quick onboarding process and a sleek developer experience.

Though we’ve championed the quick onboarding processes of SaaS providers like Twilio in the past, the fact is they work in a different vertical: their API is a monetized SaaS product. Automatically approving apps in a free public scenario where user data is involved, on the other hand, can be dangerous.
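One way to keep onboarding fast while screening more carefully is to issue keys automatically only when an app requests no user data, and to route everything else to a human reviewer. The following is a minimal sketch of that idea, not a description of how any real platform works; the scope names, the `AppRequest` fields, and the `review_decision` function are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical scope names; a real platform would define its own.
SENSITIVE_SCOPES = {"user_profile", "user_friends", "user_email"}


@dataclass
class AppRequest:
    """A third party developer's request for API access (illustrative fields)."""
    app_name: str
    developer_verified: bool
    requested_scopes: set = field(default_factory=set)


def review_decision(request: AppRequest) -> str:
    """Return 'auto_approve', 'manual_review', or 'reject'."""
    sensitive = request.requested_scopes & SENSITIVE_SCOPES
    if not sensitive:
        # No user data involved: quick onboarding carries little risk.
        return "auto_approve"
    if not request.developer_verified:
        # User data requested by a developer who has not been vetted.
        return "reject"
    # Vetted developer asking for user data: a human checks the stated purpose.
    return "manual_review"
```

Under these assumptions, a quiz app asking for `user_friends` would never receive a key without a person looking at why it needs that data.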

2) Monitoring: Pay closer attention to how your API is used and who uses it

In the spirit of screening new app developers, API providers may also want to audit their existing user base. Internal teams should try to weed out API consumers that use data inappropriately, but once data has been sent, it’s hard to track where it ends up.

What providers can do is set up monitoring in the API management layer to improve the security of the platform as a whole. Typical API testing is concerned with uptime, speed, and validation, among other things. Monitoring can also encompass tracking specific endpoints to watch for suspiciously crafted API calls and anomalies, such as unexpectedly high traffic.
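As a rough illustration of watching for unexpected traffic, the sketch below compares each consumer’s most recent hourly call count with that consumer’s own history and flags large deviations. The function name, the data shapes, and the threshold of three standard deviations are assumptions made for the example; a production system would more likely rely on the analytics built into its API gateway.

```python
from statistics import mean, stdev


def find_traffic_anomalies(hourly_calls, history, threshold=3.0):
    """Flag consumers whose latest hourly call count far exceeds their baseline.

    hourly_calls: {consumer_id: calls in the most recent hour}
    history: {consumer_id: list of past hourly call counts}
    """
    flagged = []
    for consumer_id, current in hourly_calls.items():
        past = history.get(consumer_id, [])
        if len(past) < 2:
            continue  # not enough data to form a baseline yet
        baseline = mean(past)
        spread = stdev(past)
        # Use a minimum spread of 1.0 so perfectly flat history
        # does not make every small increase look anomalous.
        limit = baseline + threshold * max(spread, 1.0)
        if current > limit:
            flagged.append(consumer_id)
    return flagged
```

Consumers without enough history are skipped here rather than flagged, which is a design choice; a stricter platform might apply a fixed ceiling to new consumers instead.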

Getting some sort of real-time monitoring system in place will help establish a status quo for analytics, which can be used as a baseline to gauge the ongoing health of your platform. To prevent consumers from harvesting large amounts of data, stem the flood with rate limiting and other data handling techniques like pagination.
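To make rate limiting and pagination concrete, here is a small sketch of both. The rate limiter uses the token bucket technique, one common approach among several; the class name, the `MAX_PAGE_SIZE` cap, and the response shape are assumptions made for the example rather than part of any particular API.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


MAX_PAGE_SIZE = 100  # hypothetical server-side cap


def paginate(records, offset=0, limit=25):
    """Return one page of records plus the offset of the next page (or None)."""
    limit = max(1, min(limit, MAX_PAGE_SIZE))  # clamp oversized client requests
    page = records[offset:offset + limit]
    has_more = offset + limit < len(records)
    return {"data": page, "next_offset": offset + limit if has_more else None}
```

In practice, one bucket would be kept per API key, and a request that fails `allow()` would typically be answered with HTTP status 429 (Too Many Requests). Capping the page size on the server means a consumer cannot pull an entire dataset in one call, no matter what `limit` it sends.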

3) Platform Policy: Make it human-readable and obvious

Having a clear, human-readable platform policy is important so that stakeholders actually understand the limits placed on them and the repercussions for breaking them. There are often many people involved in the app development process, and hardly any of them will read dense platform legalese.

In the case of Facebook, the developer who worked with Cambridge Analytica, a professor at Cambridge, didn’t seem to grasp the sweeping implications of breaking protocol by harvesting user data for hidden purposes. As the NY Times reported:

“Back then, we thought it was fine. Right now my opinion has really been changed.”
Aleksandr Kogan, the professor who was hired by Cambridge Analytica

While the public wakes up to the implications of their online actions, having better, more widely understood stipulations on what can and cannot be done will help ensure consumers understand the contract they are agreeing to. The same disregard for legalese holds true in the relationship between tech firms and end users. To solve this, we may see more micro-permission granting along the user journey, where users confirm the sharing of isolated data points.
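A micro-permission model could be sketched as a ledger that records consent per user, per app, and per individual data field, and that filters every response accordingly. The `ConsentLedger` class and the field names below are hypothetical and exist only to illustrate the idea.

```python
from collections import defaultdict


class ConsentLedger:
    """Track which individual data fields a user has agreed to share with each app."""

    def __init__(self):
        # (user_id, app_id) -> set of field names the user approved
        self._grants = defaultdict(set)

    def grant(self, user_id, app_id, field_name):
        self._grants[(user_id, app_id)].add(field_name)

    def revoke(self, user_id, app_id, field_name):
        self._grants[(user_id, app_id)].discard(field_name)

    def filter_profile(self, user_id, app_id, profile):
        """Return only the fields this user approved for this app."""
        allowed = self._grants.get((user_id, app_id), set())
        return {key: value for key, value in profile.items() if key in allowed}
```

Because consent is keyed to the user who owns the data, an app in this sketch has no way to obtain a friend’s information through someone else’s approval.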

4) Consolidation: Public APIs wane and consolidate

Shortly after the Graph API incident, Instagram shuttered API endpoints related to user data. This is certainly not the first major API retirement we have seen, but it is a likely ripple effect of the recent incident. Varying motivations like profit, brand ownership, or data protection have likely influenced other deprecations in the social API space as well.

Another way to look at it is this: the more services you have, the more attack vectors are present. Along those lines, keeping low traffic endpoints alive could be an unnecessary vulnerability. API providers may want to internally audit their stack to make sure their platform is as lean and manageable as possible.

If faith in third party consumers continues to wane, we will likely see more deprecations from large public APIs as they continue to consolidate their services into more partner-oriented, closed businesses. What will be interesting is to see how they can, at the same time, meet regulatory requirements on user data accessibility.

5) Regulation: Read up on what is legal

GDPR updates EU legislation to make sure data brokers explicitly state what user data they store and for what purpose. As many companies broker data across international boundaries, there is mounting pressure on non-EU countries to adopt similar rules as well. If they do not, they risk some serious fragmentation and standardization issues when it comes to storing and acting upon user data.

While regulatory compliance may not be everyone’s dream day job, there is a silver lining. GDPR’s emphasis on user consent and data responsibility could result in a smaller threat of public data breaches. By backing the contracts between companies and users with the threat of punitive action, the regulation creates far more incentive to keep user data safe and in the hands of only those with sanctioned access.

We’ve also discussed the benefits of open standards quite heavily on the blog. Organizations outside the EU that store data internationally should evaluate whether or not to jump on board.

It should be mentioned that GDPR is not the only regulation protecting user privacy. Federal laws exist in the U.S. and in most other countries, and so do more regional ones, like Illinois’ Biometric Information Privacy Act.

Final Thoughts

Regardless of the political implications of the Facebook and Cambridge Analytica incident, a breach this large involving an API is likely to create a stir and get people to rethink how their platform is designed and partitioned for public use. It can be tough to police what people do with your data. As Kin Lane says:

“This situation I think highlights another problem of doing APIs, and ensuring API consumers are behaving appropriately with the data, content, and algorithms they are accessing.”

In short, if you do handle user data in your platform, consider:

  • Improving the measures you use to screen the apps you allow to use your API
  • Taking an active approach to monitoring the API in the wild
  • Ensuring third parties respect domestic and foreign data privacy laws

While it’s true that much of the big data web economy relies on selling data, business models that disregard user data privacy could face some serious negative consequences.