Case Study: Lessons Learned Making the Tinder API Gateway

Posted in Security by Kristopher Sandoval, March 14, 2023

Tinder is one of the world's largest and most well-known dating applications. As of 2020, Tinder reported 6.2 million subscribers and 75 million monthly active users, and this mass of users generates an incredible amount of data and interactions daily. These interactions touch a large array of underlying systems and data structures, with Tinder self-reporting over 500 microservices.

These microservices carry an incredible amount of data, but they also underpin a system that is naturally very privacy-centric. Dating can be difficult in any situation, but especially so when you must send your data to a third party to set up your first date. Accordingly, Tinder faces an interesting problem: how do you surface data and enable public APIs while ensuring security and authorization across the board?

Tinder's solution was to create the Tinder API Gateway (TAG). The team behind TAG put together an amazing writeup of its internal technical functions, and their example workflow bears repeating. Below, we'll dive into this system and look at the lessons Tinder learned during its construction.

The Underlying Problem

Tinder hosts an incredible amount of data across a vast system. With over 500 microservices working in tandem to deliver different functionality, Tinder identified a major need for some sort of service mesh that would centralize, standardize, and unify the provisioning, control, deployment, and maintenance of the overall system. To solve this problem, Tinder created TAG, the Tinder API Gateway. For Tinder, a strong backend service mesh wasn't enough; it would also have to be highly controllable in terms of versioning and security.
Tinder is used in 190 countries and handles both legitimate traffic from its users and traffic from bad actors. Accordingly, the API needed to be extremely strong and resilient while also being scalable and efficient. These needs can be challenging to balance and require careful forethought and planning.

The Existing Systems

Before the development of TAG, Tinder utilized several different API gateways in an attempt to solve the underlying problem, with each team adopting a third-party gateway to support its part of the puzzle. Although these solutions were adequate in isolation, cracks began to form when they were viewed as parts of the entire system. Adopting multiple different gateways raised issues of unification and standardization. Different teams and aspects of the application required different development timelines, resource demands, and processes for deployment and maintenance. Different gateway solutions produced components that were neither reusable nor portable, resulting in internal incompatibility. These underlying concerns led to delays in shipping production code and lags in resolving issues. Ultimately, the system was too complicated: while it solved individual team concerns, it did not serve the overall needs of the system.

The Desired System

Tinder finally reached the point where it was apparent that its current system would no longer work. Accordingly, Tinder began to look for alternatives by first stating what its desired system would look like. The new system would have to unify multiple complex systems into a streamlined design with greater controls. It needed to be portable and extensible so that teams could iterate and scale upon it, a need which also meant it had to be able to run its instances on Kubernetes.
They ultimately needed something that would add new functionality while maintaining development velocity through configuration-driven systems, and that would at least match, if not improve on, their current security.

As Tinder looked for new solutions, it briefly considered existing third-party implementations. Unfortunately, none of them met the requirements. Third-party integrations were typically quite configuration-heavy and relied on extensive plugins to deliver functionality, resulting in a high learning curve. These solutions were also typically language-specific and rarely matched Tinder's development stack. Tinder also found that third-party solutions often integrated poorly with its existing Envoy mesh, representing an additional hurdle to standardization, scalability, and extensibility. Ultimately, Tinder decided that the only thing it could do was create its own solution, and it set out to do so with a rough idea of what it wanted.

Building the Tinder API Gateway (TAG)

Tinder settled on building atop Spring Cloud Gateway, a highly flexible solution built on Spring WebFlux. Spring Cloud Gateway offers flexible routing based on request attributes such as paths, headers, and methods. Equally important, it offers a comprehensive security system, including token relay and downstream authentication, so that requests are routed in a fundamentally secure environment. This additional security was a major selling point for Spring Cloud Gateway and was leveraged by the Tinder development team.

TAG aims to unify and centralize external APIs and implement a robust security pathway. As such, the team leaned heavily on configuration-driven instance creation as a methodology to provide a strong gateway without extreme overhead. It does this through a set of core functions:

- Routes: Endpoints exposed through Route As a Config, defining the traversal points in an API interaction.
- Service Discovery: Backend services are surfaced using a service mesh to connect Tinder's large microservice collection.
- Predicates: Built into Spring Cloud Gateway, a predicate is essentially a test that determines whether a request fits a specific form before filtering.
- Filtering: The ability to filter traffic and handle it using a pre-defined set of expected functions and output pathways.
- Pre-Built Filters: Pre-configured filters that allow for rapid iteration and development.
- Custom Filters: Filters created for specific logic required by a team or service.
- Global Filters: Custom filters expanded to the global domain, allowing new functionality and filtering to be created and deployed rapidly.

TAG was built to function like a logic circuit. A gateway client feeds a request into the API gateway, where the route config works with the handler and the loaded predicate to either accept the request or terminate it. Once accepted, the request is sent through the Global and Custom Filters, then through the Post Filters, and finally out to the internal service.

One way to think of TAG is to consider an old-fashioned coin sorter. Coin sorters have one channel that accepts a variety of coins, and, depending on the size of the coin, additional channels route each coin to a specific slot at the bottom of the sorter. When a new coin comes in, it must match the expected condition to get sorted; if it fails this test, it is rejected. In the same way, TAG filters and handles requests and slots them into the correct internal service point. This filtering helps eliminate poorly formed requests and creates a secure layer of abstraction that allows for data transformation and manipulation without surfacing the data or the internal endpoints themselves. As such, TAG can essentially function as a framework for the larger Tinder team due to its processing and filtering abilities.
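To make Route As a Config concrete: Spring Cloud Gateway lets you declare routes, predicates, and filters entirely in configuration. Tinder's write-up does not publish its actual route definitions, so the service names, paths, and filters below are purely hypothetical, but the YAML shape is standard Spring Cloud Gateway configuration:

```yaml
spring:
  cloud:
    gateway:
      routes:
        - id: recommendations-route     # hypothetical route name
          uri: lb://recs-service        # backend resolved via service discovery
          predicates:
            - Path=/v1/recs/**          # request must match this path to be accepted
            - Method=GET
          filters:
            - TokenRelay                # relay the caller's auth token downstream
            - StripPrefix=1             # hide the external prefix from the internal service
```

A team bringing a new service online adds a block like this rather than writing gateway code, which is what keeps instance creation configuration-driven.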
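The logic-circuit flow described above (predicate check, then global and custom filters, then dispatch to an internal service) can be sketched as a small model. This is not Tinder's code and not the Spring Cloud Gateway API; it is an illustrative Python sketch with hypothetical route and filter names:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Request:
    path: str
    headers: Dict[str, str]

@dataclass
class Route:
    service: str                                  # internal service this route fronts
    predicate: Callable[[Request], bool]          # test the request must pass
    filters: List[Callable] = field(default_factory=list)  # route-specific (custom) filters

def handle(request: Request, routes: List[Route], global_filters: List[Callable]) -> str:
    """Model of TAG's flow: find a route whose predicate accepts the request,
    run global then custom filters, then forward to the internal service."""
    for route in routes:
        if route.predicate(request):
            for f in global_filters + route.filters:
                request = f(request)              # each filter may transform the request
            return f"forwarded to {route.service}"
    return "rejected"                             # failed every predicate, like a bad coin

# Hypothetical routes: one for recommendations, one for profiles.
routes = [
    Route("recs-service", lambda r: r.path.startswith("/v1/recs")),
    Route("profile-service", lambda r: r.path.startswith("/v1/profile"),
          filters=[lambda r: Request(r.path, {**r.headers, "X-Team": "profile"})]),
]
# A global filter applied to every accepted request, e.g. tracing.
global_filters = [lambda r: Request(r.path, {**r.headers, "X-Trace": "abc123"})]

print(handle(Request("/v1/recs/feed", {}), routes, global_filters))   # forwarded to recs-service
print(handle(Request("/v2/unknown", {}), routes, global_filters))     # rejected
```

The sketch captures why the coin-sorter analogy works: a request that matches no predicate never reaches any internal service, and every accepted request passes through the same global filters before any route-specific handling.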
Because TAG can spin up individual instances through configuration-centric development, new teams can iterate and integrate, bringing new services online in a way that is interoperable at an incredibly high level.

The Future of API Development

Tinder has shared a great example of what an in-house solution built on an open platform might look like. For many API developers, the answer is often "integrate a third party," leveraging existing solutions to simply deploy and move on. Often, however, this approach falls flat, and the compromises developers make can lead to other issues down the road. Tinder's TAG is an excellent example of when an in-house solution paired with an external open platform makes sense. While this approach has potential pitfalls, the benefit is clear: greater control over the system. A custom-built solution will almost always fit more closely than a third-party turnkey product.

The modern microservices ecosystem requires more complex handling than ever before, and TAG is a strong example of secure routing and filtering that results in a system that is more than the sum of its parts. Organizations looking to deliver complex microservice functionality while reaping the benefits of distributed teams and resources should consider TAG's approach, as allowing rapid iteration while still securing the underlying stack is the microservices dream.

What do you think about TAG and the concept of the API gateway at this scale? Are there any organizations you think would benefit from this sort of approach? Let us know in the comments below!