How APIs Should Respond to Data Sovereignty

Countless country-specific data privacy and handling laws have emerged in recent years. GDPR, CCPA, and others demand strict adherence, complicating operations for software-as-a-service offerings, APIs included, that hold and process data across international borders. In response, data sovereignty has become more relevant than ever.

To ensure API compliance with data sovereignty requirements, organizations should implement robust security measures, ensure clear data storage and processing locations, and comply with local regulations while also maintaining clear internal policies and procedures.

Below, we’ll provide a more detailed breakdown of data sovereignty and how it specifically affects APIs. We’ll cover some strategies organizations can take to keep up with compliance and competition in this increasingly complex regulatory landscape.

Understand Data Sovereignty Regulations

Before any provider can understand how they should respond to data sovereignty regulations, they should first understand what these regulations mean and how they affect their APIs.

Regulations vary widely depending on where the API is located and which laws cover its users, so this requires due diligence and careful review. First, API providers should look at which laws cover their services where they operate. While data sovereignty is not limited to the laws the provider itself is subject to, this gives a solid base of understanding. California, for example, has very different privacy and data collection rules than Japan. In many cases, a business operating in one location is covered by that location’s laws, even if its principal servers are elsewhere.

Next, providers should conduct an audit to understand where their users are principally located. The concept of data residency — that data is affected by laws depending on where the data was generated and where the data is stored — will play a huge role in how this data is handled and what protections it has under the law.

Finally, providers should audit any transit paths. Data generated in one country but transferred to another may be subject to both sets of regulations. When data passes through an intermediary country, it may even be subject to that third country’s rules during transit.

This process should help API providers understand data jurisdiction, protections, and potential regulations. Once you have a complete picture of where the data originates, flows through, and is ultimately stored, you should have a more solid idea of what regulations you might be beholden to. With this visibility, you will then need to review the major relevant regulations and standards, such as GDPR, CCPA, and PCI-DSS (an industry standard rather than a law), along with other regional and industry-specific requirements, and determine whether your data falls under their scope.
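
As a rough illustration of the audit described above, the sketch below records where a data flow originates, transits, and is stored, then lists the regulations that might apply. The `DataFlow` type, the region codes, and the `REGULATIONS_BY_REGION` table are hypothetical and deliberately incomplete; a real mapping would need legal review.

```python
from dataclasses import dataclass

# Hypothetical, deliberately incomplete lookup table for illustration only.
REGULATIONS_BY_REGION = {
    "EU": {"GDPR"},
    "US-CA": {"CCPA"},
}


@dataclass(frozen=True)
class DataFlow:
    """One path that data takes: where it starts, passes through, and rests."""
    name: str
    origin: str
    storage: str
    transit: tuple = ()

    def regions(self) -> list:
        # Every region the data touches, in order, without duplicates.
        ordered = [self.origin, *self.transit, self.storage]
        return list(dict.fromkeys(ordered))


def candidate_regulations(flow: DataFlow) -> set:
    """Regulations that *may* apply, based only on the regions touched."""
    found = set()
    for region in flow.regions():
        found |= REGULATIONS_BY_REGION.get(region, set())
    return found


def unmapped_regions(flow: DataFlow) -> list:
    """Regions with no entry in the table; these need manual review."""
    return [r for r in flow.regions() if r not in REGULATIONS_BY_REGION]


signup = DataFlow(name="signup", origin="EU", storage="US-CA", transit=("SG",))
print(sorted(candidate_regulations(signup)), unmapped_regions(signup))
```

Surfacing unmapped regions separately keeps the gaps visible, so a transit country is flagged for review instead of being silently treated as unregulated.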

The takeaway: Understand what laws may potentially cover your data and users. Regulations remain in place whether you choose to understand them or not.

Consider Necessary Data Collection

One major remedial step for this entire process is to consider what data even needs to be collected. In many cases, API providers collect a good deal of data that may not have immediate utility. In the Wild West of the early internet, this was fine, as collecting a bit more data than necessary only incurred some small additional storage costs. In the modern era, however, this opens up data points to potential exposure, litigation, regulation, and financial penalties.

Accordingly, providers should review their data collection and see if there’s anything that can be cut — and if this action has any impact on their data handling and regulations.

For instance, API providers might freely collect a range of data that, when combined, qualifies as personally identifiable information or PII. Such data is subject to controls and regulations under GDPR. In a remedial step, API providers might be able to limit their potential exposures (as well as their potential headaches) by simply choosing not to collect this data, since it provides no utility and only incurs additional regulatory burden.
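
One way to enforce this at the API boundary is an allowlist applied before anything is stored. The following is a minimal sketch; the field names in `REQUIRED_FIELDS` are hypothetical and would depend on what the service actually needs.

```python
# Hypothetical allowlist: the only fields this imaginary service needs.
REQUIRED_FIELDS = frozenset({"email", "display_name"})


def minimize(payload: dict) -> tuple:
    """Return (fields to keep, names of fields that were dropped)."""
    kept = {k: v for k, v in payload.items() if k in REQUIRED_FIELDS}
    dropped = sorted(k for k in payload if k not in REQUIRED_FIELDS)
    return kept, dropped


incoming = {
    "email": "ada@example.com",
    "display_name": "Ada",
    "birth_date": "1990-01-01",
    "device_id": "abc-123",
}
kept, dropped = minimize(incoming)
print(kept)     # only the allowlisted fields remain
print(dropped)  # names of the fields that were never stored
```

An allowlist fails safe: a new field added by a client is discarded until someone decides it is needed, which is the opposite of collecting by default.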

The takeaway: Minimize your potential issues by collecting only that which is necessary to provide the service.

Deploy Adequate Security and Access Controls

With data value and purpose established, the next step is for API providers to deploy and audit adequate security and access controls. Most regulatory frameworks are quite opinionated on the topic of security and access control. The best approach, then, is for API providers to implement a system that is as secure as possible while ensuring they meet the expectations of the regulatory framework.

For instance, under GDPR, the following is a stated requirement for data processing:

Taking into account the state of the art, the costs of implementation and the nature, scope, context and purposes of processing as well as the risk of varying likelihood and severity for the rights and freedoms of natural persons, the controller and the processor shall implement appropriate technical and organisational measures to ensure a level of security appropriate to the risk, including inter alia as appropriate:

(a) the pseudonymisation and encryption of personal data;
(b) the ability to ensure the ongoing confidentiality, integrity, availability and resilience of processing systems and services;
(c) the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident;
(d) a process for regularly testing, assessing and evaluating the effectiveness of technical and organisational measures for ensuring the security of the processing.

This means that data must be adequately protected through technical measures, including pseudonymization, encryption at rest and in transit, and access controls strong enough to keep the data confidential and intact. While GDPR doesn’t provide an exact specification for what ‘good’ looks like, many provisions demand good-faith efforts to establish these systems, proportionate to the risk. In other words, you should use as secure a system as is practical instead of just doing the bare minimum.
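
To make item (a) more concrete, here is a minimal sketch of pseudonymisation using a keyed hash (HMAC-SHA256) from the Python standard library. The helper names and the sample key are assumptions for illustration; in practice the key would live in a secrets manager, and pseudonymised data is generally still treated as personal data, so this is one measure among several and not a way out of the regulation.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a value with a keyed hash; the same input and key give the same token."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymize_record(record: dict, key: bytes, fields: set) -> dict:
    """Pseudonymise only the named fields and leave the rest untouched."""
    return {
        name: pseudonymize(str(value), key) if name in fields else value
        for name, value in record.items()
    }


# Hypothetical key for the example only; never hard-code a real key.
demo_key = b"example-only-key"
row = {"email": "ada@example.com", "plan": "pro"}
print(pseudonymize_record(row, demo_key, {"email"}))
```

Because the hash is keyed, records can still be joined on the token inside the system, while anyone without the key cannot recompute it from a guessed email address.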

This might seem confusing for US providers: why should GDPR influence how they operate? US-based APIs are subject to GDPR in many cases, such as when they offer goods or services to people located in the EU and collect and process their data. When someone in France purchases an item from a US shop that sells to EU customers, that data is generally covered under the GDPR, regardless of it being stored in the United States.

Accordingly, many US companies have scrambled to ensure that they are compliant with GDPR to maintain access to the global market. For APIs, this requires implementing solutions that are appropriate to the local jurisdiction and potential international jurisdictions, even if the finer details of data protection enforcement or jurisdiction are constantly being evaluated and tested in the courts.

As part of many regulations, organizations are also required to regularly audit their systems and ensure that their security is implemented and maintained. For this reason, API providers will benefit significantly from integrating tooling for logging, monitoring, and observability to ensure that security access controls are established and maintained.
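
As a small example of what such tooling can look like in application code, the sketch below wraps a data-access function so that every call leaves an audit record, whether it succeeds or is denied. The `audited` decorator, the `read_profile` function, and the `support-agent` role are hypothetical; only the standard `logging` and `functools` modules are used.

```python
import functools
import logging

audit_log = logging.getLogger("audit")
audit_log.setLevel(logging.INFO)


def audited(action: str):
    """Decorator that records who attempted which action and how it ended."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(actor, *args, **kwargs):
            try:
                result = func(actor, *args, **kwargs)
            except PermissionError:
                audit_log.warning("action=%s actor=%s outcome=denied", action, actor)
                raise
            audit_log.info("action=%s actor=%s outcome=ok", action, actor)
            return result
        return wrapper
    return decorator


@audited("read_profile")
def read_profile(actor: str, user_id: str) -> dict:
    # Hypothetical access rule: only one role may read profiles.
    if actor != "support-agent":
        raise PermissionError(f"{actor} may not read profiles")
    return {"user_id": user_id}
```

Routing the `audit` logger to its own handler (a file, a log shipper, or a monitoring service) keeps access records separate from ordinary application logs, which makes periodic reviews easier.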

The takeaway: Implement advanced security, access controls, and encryption at rest and in transit.

Implement Strict Data Residency

Another approach to this problem is to establish and maintain strict data residency. Data residency simply means that data is subject to the controls of where the data is generated and stored. This can take a few forms in the cases we’re discussing here.

The easiest one is to colocate the data with its source. If you generate data on a person in the EU making a purchase via a US storefront, it will usually be far easier to store that data on a GDPR-compliant server located and controlled within the EU. By retaining the data where it is generated, on a compliant system, you mitigate potential jurisdiction issues and the complexities of data transfer.
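
A minimal sketch of this colocation rule is shown below: each store is pinned to one region and refuses records that originated elsewhere. The `RegionalStore` and `ResidencyRouter` classes are hypothetical in-memory stand-ins for real regional databases.

```python
class ResidencyError(Exception):
    """Raised when a write would place data outside its region of origin."""


class RegionalStore:
    """In-memory stand-in for a database hosted in a single region."""

    def __init__(self, region: str):
        self.region = region
        self._rows = {}

    def put(self, key: str, record: dict, origin_region: str) -> None:
        if origin_region != self.region:
            raise ResidencyError(
                f"{origin_region} data cannot be stored in {self.region}"
            )
        self._rows[key] = record

    def get(self, key: str) -> dict:
        return self._rows[key]

    def has(self, key: str) -> bool:
        return key in self._rows


class ResidencyRouter:
    """Picks the store that matches where the data was generated."""

    def __init__(self, stores):
        self._stores = {store.region: store for store in stores}

    def store_for(self, origin_region: str) -> RegionalStore:
        try:
            return self._stores[origin_region]
        except KeyError:
            # Fail closed: no fallback to a store in another region.
            raise ResidencyError(f"no store configured for {origin_region}") from None

    def save(self, key: str, record: dict, origin_region: str) -> None:
        self.store_for(origin_region).put(key, record, origin_region)
```

The router fails closed on purpose. If no store exists for a region, the write is rejected instead of quietly landing in a default location.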

Speaking of data transfer, you can establish a system by which these colocated data centers still render usable and transferable data. By applying anonymization techniques such as k-anonymity, you can render a data set anonymous enough to be transferred to and used in other locales that do not have stringent requirements or compliance frameworks. This allows you to comply with something like GDPR or CCPA while retaining the portability of the data in question.
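
To show what k-anonymity measures, the sketch below computes k for a data set (the size of the smallest group of rows sharing the same quasi-identifiers) before and after generalizing those columns. The column names and the generalization rules are assumptions for the example. Whether a given k is enough for data to count as anonymous under a particular regulation is a legal question this code cannot answer.

```python
from collections import Counter


def k_anonymity(rows: list, quasi_identifiers: list) -> int:
    """Size of the smallest group of rows that share the same quasi-identifiers."""
    if not rows:
        return 0
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


def generalize(row: dict) -> dict:
    """Hypothetical generalization: age to a ten-year band, postcode to a prefix."""
    decade = (row["age"] // 10) * 10
    return {
        **row,
        "age": f"{decade}-{decade + 9}",
        "postcode": row["postcode"][:2] + "***",
    }


rows = [
    {"age": 31, "postcode": "75001", "plan": "pro"},
    {"age": 34, "postcode": "75002", "plan": "free"},
    {"age": 38, "postcode": "75011", "plan": "pro"},
]
quasi = ["age", "postcode"]
print(k_anonymity(rows, quasi))                           # every row is unique
print(k_anonymity([generalize(r) for r in rows], quasi))  # rows now share one group
```

A transfer pipeline could refuse to export any data set whose k falls below an agreed threshold.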

In all cases, providers should perform due diligence in their documentation and compliance efforts. Document what data is being collected, why, where it’s being stored, and why and how it is processed. Then, review this documentation to ensure it aligns with the data sovereignty requirements of the jurisdiction the data is covered under. Taking it a step further, providers should only work with vendors who are themselves compliant, which can be helped significantly by defaulting to local providers where possible in regions with strict jurisdictions.

The takeaway: Respect data residency and location based on best practices.

Establish an Internal Culture of Data Sovereignty

Providers should strive to establish an internal culture of data sovereignty. In much the same way that an internal security culture can do wonders to establish a secure offering, establishing a culture that understands and respects data sovereignty will go a long way towards ensuring that data is appropriately handled.

First and foremost, develop and implement a comprehensive data protection policy that outlines how the organization handles data. This should include all data sovereignty-related matters and related concepts and models such as consent management (mechanisms for obtaining consent and providing individuals with clear information about how their data will be used) and processes for data erasure.
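
The consent and erasure mechanisms mentioned above can be sketched as a small ledger that records consent per user and purpose, and that can remove everything held for a user on request. The `ConsentLedger` class and the purpose names are hypothetical, and a real system would persist this data and propagate erasure to every store that holds the user’s records.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ConsentLedger:
    """Tracks which purposes each user has consented to, and when."""
    _grants: dict = field(default_factory=dict)  # (user_id, purpose) -> timestamp

    def grant(self, user_id: str, purpose: str) -> None:
        self._grants[(user_id, purpose)] = datetime.now(timezone.utc)

    def withdraw(self, user_id: str, purpose: str) -> None:
        self._grants.pop((user_id, purpose), None)

    def allows(self, user_id: str, purpose: str) -> bool:
        return (user_id, purpose) in self._grants

    def erase_user(self, user_id: str) -> int:
        """Remove every consent record for a user; returns how many were removed."""
        keys = [key for key in self._grants if key[0] == user_id]
        for key in keys:
            del self._grants[key]
        return len(keys)


ledger = ConsentLedger()
ledger.grant("u1", "marketing_email")
ledger.grant("u1", "analytics")
ledger.withdraw("u1", "marketing_email")
print(ledger.allows("u1", "analytics"), ledger.allows("u1", "marketing_email"))
```

Checking `allows()` at the point of use, instead of at signup only, means a withdrawal takes effect on the next request.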

At this stage, it might be helpful for providers to engage in full-scale data mapping. By mapping the data flow from ingest to storage, you can ensure compliance with sovereignty regulations while also identifying potential pain points or attack vectors, thereby increasing your overall security quite handily.

The takeaway: Adopting an internal culture of data sovereignty is as important as an internal culture of security.

Rework Your Technical Solutions

Finally, providers should review their technical solutions and ensure they are compliant. If they are not, small changes can often fix the problem at scale.

For example, let’s imagine a provider has a service that collects a large amount of user data during signup. By the time the account exists, it’s already too late to apply GDPR and other data regulations, as the data has been ingested and stored in a US-located resource.

The simplest solution here is to implement a simple API gateway (let’s call it regionGate) that routes traffic depending on IP. If the user microservice (let’s call it appUsers) simply collects all data, we can change how the service functions by separating the data storage from the user account record. When a user enters the system, they pass through the gateway, which checks their IP, prompts the user to state their region, and then routes EU requests to a separate EU-located microservice that handles the data collection. For non-EU users, this data might be collected automatically without the additional microservice.
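
The routing decision inside such a gateway might look like the sketch below. The `region_gate` function, the target name `euAPP` for the EU-located service, and the IP table are assumptions for illustration. The networks are reserved documentation ranges, and a real gateway would consult a GeoIP database instead. Treating an unknown location as EU is a conservative design choice, not a requirement.

```python
import ipaddress
from typing import Optional

# Hypothetical IP-to-region table built from reserved documentation ranges.
REGION_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/24"): "EU",
    ipaddress.ip_network("198.51.100.0/24"): "US",
}


def region_from_ip(ip: str) -> Optional[str]:
    """Best-effort region lookup; returns None when the address is unknown."""
    address = ipaddress.ip_address(ip)
    for network, region in REGION_BY_NETWORK.items():
        if address in network:
            return region
    return None


def region_gate(ip: str, declared_region: Optional[str] = None) -> str:
    """Return the name of the service that should handle this signup."""
    detected = region_from_ip(ip)
    # If either signal says EU, use the EU path.
    if "EU" in (detected, declared_region):
        return "euAPP"
    # With no signal at all, fall back to the stricter path.
    if detected is None and declared_region is None:
        return "euAPP"
    return "appUsers"


print(region_gate("203.0.113.7"))                          # routed to euAPP
print(region_gate("198.51.100.9"))                         # routed to appUsers
print(region_gate("198.51.100.9", declared_region="EU"))   # routed to euAPP
```

Combining the detected and declared regions matters because IP geolocation is unreliable on its own, for example when a user connects through a VPN.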

In some cases, you can even implement such a gate as a consent gate rather than a geolocated restriction. Instead of using regionGate as a gateway, you can create a microservice (regionCheck, in this example) that uses a flag to reject data requests from other internal services that might store this data in violation of GDPR or other regulations.

In this case, your data storage would be GDPR-compliant because the EU-specific service — euAPP, in this case — would use the regionCheck data to validate that no other data would be collected and put into (or retrieved from) data storage. In essence, you would have the regionCheck service acting as a middleman to oversee the internal functions of the API, ensuring that compliance is maintained.
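
A minimal sketch of regionCheck acting as this middleman follows. Every storage read or write carries a description of the request, and the check rejects anything involving EU user data that does not come from euAPP, target EU storage, and carry consent. The `StorageRequest` fields and the exact rule are assumptions chosen for the example; the real policy would come from your compliance review.

```python
from dataclasses import dataclass


class ComplianceError(Exception):
    """Raised when regionCheck rejects a storage request."""


@dataclass(frozen=True)
class StorageRequest:
    service: str         # internal caller, for example "euAPP" or "appUsers"
    user_region: str     # region of the user the data belongs to
    storage_region: str  # region of the storage being accessed
    has_consent: bool    # consent flag captured at collection time


def region_check(request: StorageRequest) -> bool:
    """Hypothetical rule: EU user data is only handled by euAPP, in the EU, with consent."""
    if request.user_region != "EU":
        return True
    return (
        request.service == "euAPP"
        and request.storage_region == "EU"
        and request.has_consent
    )


class GuardedStorage:
    """Storage wrapper that consults region_check before every read or write."""

    def __init__(self):
        self._data = {}

    def write(self, request: StorageRequest, key: str, value: dict) -> None:
        if not region_check(request):
            raise ComplianceError(f"write from {request.service} rejected")
        self._data[key] = value

    def read(self, request: StorageRequest, key: str) -> dict:
        if not region_check(request):
            raise ComplianceError(f"read from {request.service} rejected")
        return self._data[key]
```

Putting the check in front of both reads and writes mirrors the point above: the middleman governs what is put into storage as well as what is retrieved from it.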

The takeaway: Consider technical solutions before resorting to complicated policy or process workarounds.

Conclusion

There are many, many ways to build this out, but one reality remains true: data sovereignty is not going away, and ignorance of regulations will not stop those regulations from being applied. The internet is not the Wild West that it once was. As regions become more nationalistic and their networks more protective, this reality will only become more stark and more important.

Getting ahead of these issues is an important first step towards ensuring that you and your APIs can comply with current — and future — regulations.