Exploring the Cloud Laboratory: Advances in Biotech & Science-as-a-Service

Vassili van der Mersch
May 17, 2016

It isn't much of a stretch to imagine a world where biology is accessed and modified through API calls. Not only is this not science fiction, but the first laboratory science APIs are already available. Science-as-a-Service (SciAAS) companies have started popping up with the intent to accelerate scientific research and improvements in biotechnology.

Just as Amazon Web Services exposes Amazon's excess infrastructure through APIs and SDKs to host the code of hundreds of thousands of developers around the world, the same is now being done with scientific equipment; computing power and even entire lab experiment protocols can be exposed and consumed by remote teams. By letting researchers outsource the expensive drudge work of conducting experiments, teams save time and money without compromising the quality of scientific research, lowering the barrier to entry for biotech startups. Making scientific testing efficient and programmable will enable anyone with the needs and ideas, but not the resources, to turn their experiments into reality.

A Marketplace for Experiments

Scientific experiments are time- and labor-intensive affairs, and require equipment and expertise that are hard to come by. Researchers often outsource the nuts and bolts of these jobs to Contract Research Organizations (CROs) such as universities or commercial laboratories, in order to free up time and energy for experiment design and analysis of results. Outsourcing also lets owners of underused resources leverage their facilities for other scientists' experiments.

Working with a CRO involves writing an experiment protocol and sending it to the facility where the experiments will be run. The protocol details the materials, processes, and measurements underpinning the experiment. The CRO business is a huge market: as large as $25 billion in 2015.
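To make the idea concrete, such a protocol can be sketched as a plain data structure. This is a hypothetical illustration of the materials/processes/measurements shape described above, not any CRO's actual schema:

```python
import json

# A hypothetical experiment protocol, sketched as plain data.
# All field names and values here are illustrative, not a real CRO format.
protocol = {
    "title": "Bacterial growth assay",
    "materials": [
        {"name": "E. coli DH5-alpha", "quantity": "1 vial"},
        {"name": "LB broth", "quantity": "500 mL"},
    ],
    "processes": [
        {"step": 1, "action": "inoculate", "target": "LB broth"},
        {"step": 2, "action": "incubate", "temperature_c": 37, "hours": 16},
    ],
    "measurements": [
        {"name": "optical density", "wavelength_nm": 600, "every_minutes": 30},
    ],
}

# Serialized, this is the kind of document a researcher would hand off
# to the facility running the experiment.
print(json.dumps(protocol, indent=2))
```

Once a protocol is machine-readable like this, it can be versioned, diffed, and resubmitted, which is exactly what makes the marketplace and cloud-lab models below possible.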
There are specialized CROs for different types of experiments, like bioassays and clinical research. Finding and contracting the right CRO can be challenging, however, which is where Science Exchange comes in.

Science Exchange is a marketplace for scientific experiments, in other words an 'Uber for science'. Universities (including Harvard Medical School and Johns Hopkins University) and private laboratories offer their services via the website, and scientists can select one of them and order experiments to be run in their facilities. Biopharma companies like Genzyme use Science Exchange to outsource part of their research efforts.

Much like you can find a web developer or a graphic designer on sites like Freelancer, service providers on Science Exchange list the services they offer on their profiles, including immunohistochemistry and RNA sequencing. When a client posts a new experiment request on Science Exchange, relevant CROs can bid for it. Science Exchange takes care of contractual details and facilitates payments through their website. Each service provider is pre-approved by Science Exchange staff, and has a reputation score.

Computing in the Cloud

When dealing with molecular data, the amount of number crunching that takes place breathes new life into the term Big Data. To spare each lab from managing its own infrastructure, platforms have been built that let scientists outsource the computing part of their research.

Agave is a platform that focuses on experiments where data is abundant and computing power matters. It promotes the Open Science movement and facilitates sharing between bioinformaticians. It was originally conceived while building infrastructure for an ambitious project to compare gene frequency data across animal species. Agave is publicly funded and managed by the University of Texas.
Agave has a catalog of science APIs that users can draw upon, and offers scientists free cloud infrastructure based on Docker containers. In a world where scientific code is usually not portable, Docker allows images to be published to registries and reused as templates for someone else's experiments. This approach has the potential to make sharing study results and processes much easier: a scientist can spin up a new simulation based on a Docker image provided by a colleague.

The Cloud Lab

The companies that most embody the 'AWS for Science' designation must surely be Transcriptic and Emerald Cloud Lab. Both offer futuristic on-demand, API-driven, robot-operated molecular and cell biology research facilities. Where Science Exchange matches scientists with CROs and Agave lets researchers discharge their computing needs to the cloud, Transcriptic and Emerald let you manage a fully fledged biotech facility automatically and remotely.

"Access a fully automated cell and molecular biology laboratory, all from the comfort of your web browser." -Transcriptic.com

Instead of hiring CROs, the drudge work of pipetting and transferring liquids between machines is automated, with robotic arms doing the honors with little human intervention. Clients submit work orders via a user interface or an API call, and receive data feedback at each step of the experiment.

The underlying idea is that many experiments in biology use the same basic operations on different material and in a different sequence. A typical experiment involves centrifugation, plate reading, and a small number of other operations that can be handled in sequence and fully automated.

The most impressive aspect of these companies is that both are thriving in an ambitious quest to usher in a new era of life sciences entrepreneurship, in which a small cash-strapped team can create and manage a profitable pharmaceutical company from a laptop, much like what can be achieved today in web startups.
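Submitting such a work order might look roughly like the following. The endpoint, auth header, and payload shape here are invented for illustration; they are not Transcriptic's or Emerald's actual API, for which you would consult the provider's documentation:

```python
import json
from urllib.request import Request, urlopen

# Hypothetical sketch of submitting a work order to a cloud lab over HTTP.
# Endpoint URL, token, and field names are all placeholders.
work_order = {
    "project": "strain-screening",
    "protocol": "pcr_genotyping",
    "parameters": {"sample_count": 24, "cycles": 30},
}

request = Request(
    "https://cloudlab.example.com/api/runs",
    data=json.dumps(work_order).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_TOKEN",
    },
    method="POST",
)

# A real script would send the order and poll the returned run for
# step-by-step data feedback, e.g.:
# with urlopen(request) as response:
#     run = json.load(response)
print(request.full_url, request.get_method())
```

The point is less the specific endpoint than the workflow: the experiment becomes a resource you create, monitor, and query like any other API object.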
Stanford, UC Berkeley, UC Davis, and DeskGen (a gene editing startup) are already using Transcriptic. Transcriptic's aim is to change the way research is done: dramatically offsetting the ever-increasing costs of clinical trials, automating tedious lab work, and accelerating research by running experiments in parallel. They currently offer common protocols like PCR for genotyping animal samples, DNA/RNA synthesis, and protein extraction. More complex or custom experiments are still better delegated to a CRO, but in the future all experiments may be conducted this way.

It will also likely change the way scientists plan experiments. With human-operated science, every additional step in a lab process compounds cost and increases the likelihood of human error and deviation from the protocol. APIs like Transcriptic's will liberate researchers from these compromises, and will lead to many more experiments. In particular, this novel approach opens up the possibility of cheaply and efficiently reproducing past scientific experiments, which is a perplexing problem in the field. It also promises to drastically reduce the time and cost of getting new pharmaceutical drugs to market.

Since each part of the experiment is monitored, Transcriptic and Emerald let scientists rerun past experiments with tweaked parameters, easily exploring what-if scenarios that would previously have been left as question marks. While not yet a good fit for generalist biologists, cloud biology solutions like Transcriptic and Emerald will appeal to biotech startups and biology professionals who double up as programmers.

A Window into the Future

These are exciting times to be in biotech.
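A what-if sweep of that kind is trivial once a run is just data. This sketch generates one work order per candidate value of a single parameter; the protocol name and parameters are hypothetical, not a real cloud-lab schema:

```python
import json

# Hypothetical base run: a past experiment captured as data.
base_run = {
    "protocol": "pcr_genotyping",
    "parameters": {"annealing_temp_c": 58, "cycles": 30},
}

def tweaked_runs(base, name, values):
    """Yield one work order per candidate value of a single parameter,
    leaving the original run untouched."""
    for value in values:
        run = json.loads(json.dumps(base))  # deep copy via JSON round-trip
        run["parameters"][name] = value
        yield run

# Explore three annealing temperatures around the original 58 C.
orders = list(tweaked_runs(base_run, "annealing_temp_c", [55, 58, 61]))

# Each order would then be submitted as its own run, in parallel.
print([o["parameters"]["annealing_temp_c"] for o in orders])
```

Because the lab executes runs in parallel, the marginal cost of asking "what if?" drops from a week of bench time to one more API call.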
Companies like 23andMe have shown that you can build a new kind of company in this field: their DNA analysis service uses a mix of biology and information technology to create a product that would have been deemed science fiction only a few years ago. Gene editing is also starting to emerge, and promises to change the way we think about our bodies. Other companies like Zymergen are figuring out how to scale scientific research itself. Zymergen builds robots that perform DNA manipulations on microbes, which are in turn used to manufacture chemical compounds at scale.

There are hints that cultural change is also imminent in scientific research. The reproducibility crisis can be addressed by making experiments easier to replay using Science Exchange or Transcriptic, and new financing methods are becoming available to counter the toxic "publish or perish" tradition in academia. Currently, scientists need to publish study results often to secure public funding, which has detrimental effects on the kind of science that gets worked on, and conflicts with long-term thinking and other priorities such as teaching. Crowdfunding sites like Experiment offer an alternative means of funding.

Plenty of technological advances are sprouting up to help scientists set up and manage experiments. Small labs can get some of the benefits of a solution like Transcriptic in house by using robots like the OpenTrons OT-One. Sophisticated facilities will have more options than ever: for each experiment they'll be able to choose between on-premise automation, outsourcing to a CRO, or a cloud lab like Transcriptic or Emerald. Now that experiments can be initiated and monitored with API calls, they can be integrated into workflows and even products.

Riffyn is another noteworthy company in this space.
It specializes in process design and analytics, helping scientists design experiments and collect data at each step, pinpointing weak points in a process so they can make improvements and avoid experimental noise.

There will be plenty of work for bioinformaticians in the coming years. Data formats will be important moving forward, and a new form of literacy will likely evolve from the cloud laboratory. An open standard protocol for describing experiments called Autoprotocol, initially developed at Transcriptic, aims to introduce a data representation layer for bioinformaticians. Just as naming things is important in software development, so will named abstractions take root in biotech.
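To give a flavor of that representation layer, here is a minimal Autoprotocol-style document: named containers under "refs", and an ordered list of "instructions", each a basic lab operation. The structure follows the open Autoprotocol specification, but the container, wells, and values are placeholders chosen for illustration:

```python
import json

# A minimal Autoprotocol-style protocol (illustrative values).
protocol = {
    "refs": {
        "sample_plate": {
            "new": "96-flat",              # provision a fresh 96-well plate
            "store": {"where": "cold_4"},  # refrigerate when the run ends
        }
    },
    "instructions": [
        {
            "op": "incubate",
            "object": "sample_plate",
            "where": "warm_37",
            "duration": "1:hour",
            "shaking": False,
        },
        {
            "op": "absorbance",            # plate-reader measurement
            "object": "sample_plate",
            "wells": ["A1", "A2"],
            "wavelength": "600:nanometer",
            "dataref": "od600_reading",
        },
    ],
}

print(json.dumps(protocol, indent=2))
```

Named operations like "incubate" and "absorbance" are exactly the kind of shared abstractions the paragraph above anticipates: once every lab means the same thing by them, protocols become portable between facilities.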