Chapter 3. Leukippos: A Synthetic Biology Lab in the Cloud

Pablo Cárdenas, Maaruthy Yelleswarapu, Sayane Shome, Jitendra Kumar Gupta, Eugenio Maria Battaglia, Pedro Fernandes, Alioune Ngom, and Gerd Moe-Behrens

Abstract

As we move deeper into the digital age, the social praxis of science undergoes fundamental changes, driven by new tools provided by information and communication technologies. Specifically, social networks and computing resources such as online cloud-based infrastructures and applications provide the necessary context for unprecedented innovations in modern science. These tools are leading to a planetary-scale connectivity among researchers and enable the organization of in silico research activities entirely through the cloud.

Research collaboration and management via the cloud will result in a drastic expansion of our problem-solving capacity, since groups of people with different backgrounds and expertise that openly gather around common interests are more likely to succeed at solving complex problems. Another advantage is that collaboration between individuals becomes possible regardless of their geographic location and background.

Here we present a novel, open-web application called Leukippos, which aims to apply these information and communication technologies to in silico synthetic biology projects. We describe both the underlying technology and organizational structure necessary for the platform’s operation. The synthetic biology software search engine, SynBioAppSelector, and the game, SynBrick, are examples of projects being developed on this platform.

Cloud-Based Collaboration Can Potentially Accelerate and Transform the Scientific Discovery Process

Social networks and the ability to organize collaborative work via the cloud will become important factors in driving innovations in modern science and technology.

This kind of social networking provides a potential framework for global connectivity among researchers. It expands our combined brainpower, because large and diverse groups of people are more likely to find solutions to complex problems. Moreover, collaboration becomes independent of the collaborators’ physical location and of the development level of their countries. This independence from physical location can reduce transaction costs to nearly zero.

Furthermore, such combined brainpower can help deal with another major challenge for contemporary science, the so-called Big Data problem. In recent years, scientific research has increasingly produced vast amounts of data from high-throughput or large-scale experiments. We can also observe an exponential increase in the number and/or size of data sets, particularly in biology and bioinformatics research. For example, the 1000 Genomes Project has produced 200 TB of publicly available data sets since its inception. The output in scientific literature has become so vast and complex that it has become difficult for a single person to read, assimilate, and process it. Social networks can help. For example, the use of Twitter analytics can point someone toward relevant recent publications.

This new organizational model of collaborative research is still in its infancy, but several approaches to distributed problem solving have recently been gathering attention, including crowdsourcing, crowdfunding, and topic-specific science forums. Successful examples of such efforts include games like FoldIt and EteRNA, which gamify computational predictions of protein and RNA structures, respectively.

There are, however, a number of barriers that prevent online communities of scientists from adopting cloud-based collaboration. An inherent conservatism in established scientific practice discourages many scientists from sharing data publicly. A large number of publications in high-impact journals is still essential for scientists to build a career, and going public with data and knowledge would mean giving their competitors an undue advantage. Thus, the ideas of open source are convincing in theory but are often not put into practice, due to the traditional business model of academic science (see http://bit.ly/1iqYzl8).

Another major limitation in adopting distributed problem-solving approaches is the lack of a fully functional, complete, and self-contained infrastructure for collaboration. Such an infrastructure would allow any collaborator to access data of any type from anywhere, and would allow anyone to process and analyze those data and results from anywhere.

In Silico Synthetic Biology as an Example Area for Cloud Collaboration

Synthetic biology (SynBio) is a recently established discipline that aims to design and engineer biological elements, circuits, devices, and systems not found in nature, and to redesign existing natural biological systems for useful purposes.

SynBio is especially well suited to cloud-based collaboration because of both its social culture and its scientific needs. The field is dominated by young, innovative people. Every year the International Genetically Engineered Machine (iGEM) competition engages students from around the world in synthetic biology projects, and there is a strong DIY/biohacker community interested in SynBio. These people are early adopters: they are open to open source ideas and are used to social networking via the cloud in their scientific work.

Additionally, SynBio can be viewed as a practical application of systems biology because it deals with complex systems and large amounts of data. As previously discussed, these kinds of problems are especially suited to cloud collaboration. SynBio also relies heavily on bioinformatics. Its designs make extensive use of standardized biological parts and of hierarchical abstraction to assemble complex systems, which makes the field especially suitable for digitization and computer-aided design software. Hence, in silico work is a crucial part of the SynBiologist’s daily work, and essential work in the field can be done with just a laptop or a mobile device. This makes it possible to organize an in silico SynBio lab in the cloud.

Components of the Leukippos Platform

The Leukippos Institute was founded to harness the power of the crowd and social media in order to collaboratively carry out bioinformatics work for solving non-trivial problems in synthetic biology. Thus, the aim of the Leukippos Institute is to build an in silico synthetic biology lab in the cloud. The output of our work will be different synthetic biology–related web applications.

This proposal embraces a natively digital environment: the Internet. Since all the activity occurs on the Web, it is a natural venue for in silico work and for professionals from different fields, sections of academia, and parts of the world to connect and collaborate. By adopting an open science model and using the networking capabilities of the Web, Leukippos can draw on the contributions, labor, and computing power of online volunteers. In this manner, the Leukippos Institute takes advantage of the computerized character of synthetic biology to produce an open network for collaborative work in the field.

The Leukippos Institute is based on the crowdsourcing concept, in which collaborators contribute expertise from their own areas of interest or research and receive attribution for their contributions. Originally, some in silico SynBio projects were initiated by a small group of participants on Facebook. The Facebook group has since grown to 472 members. Facebook is an integral part of this project: it is where we discuss ideas, methods, and solutions to problems.

In order to go beyond Facebook, we are now in the process of developing a platform that will serve as a synthetic biology lab in the cloud (see Figure 3-1).

Figure 3-1. The workflow of the coding platform of the Leukippos Institute. The components of our platform are the following: (1) We use an SSH (Secure Shell) terminal on any computing device. (2) The terminal is used to manage a t1.micro instance running Ubuntu 12.04.2 LTS (Linux) on Amazon Web Services. (In addition, we use a server from the University of Windsor as an alternative to Amazon Web Services.) (3) GitHub serves as the repository where we store the versions of our project and get easy access to the code under development. (4) We will use Heroku to host our web app. (5) Thus anybody participating in a specific project can work on his or her own version or part of the web app under development. (6) Facebook will be used to discuss the different versions of the web app and to agree on an official merged version.

Idea Testing: Projects and Testbeds

The Leukippos Institute has two ongoing projects that serve as testbeds for the crowdsourcing, collaborative methodology. The first is SynBio App Selector, an interactive repository of synthetic biology–related software (see Figure 3-2).

Figure 3-2. The SynBio App Selector (from http://www.iwbdaconf.org/2013/proceedings/) is a hierarchically structured web application that provides the user with an easy and intuitive way to find synthetic biology–related software. The user interface of the app consists of three different menus. These menus are rendered as icons on a 3D sphere, and the user navigates them by dragging and zooming the sphere. The first of these menus displays a schematic representation of the Central Dogma of molecular biology and leads to software that works with the different molecules and processes involved. This ranges from DNA plasmid and RNA primer design to protein analysis. The other menus represent higher-order biological systems and other useful tools. A prototype of this app can be found at http://bit.ly/1qh0iQt.

Synthetic biology is deeply embedded in modern Big Data science, and computational tools play a vital role. However, given the abundance and diversity of software available, it is often hard to find the right tool for the job. SynBio App Selector aims to solve this problem with an online quick-reference guide app built using HTML5, JavaScript, and other web technologies. The app categorizes synthetic biology software into different classes and subclasses, such as "lab tools," "simulations," or "primer design." Users can navigate these hierarchies by means of a 3D interactive display. For each software tool, the app stores a description, the development status, and the licensing, as well as other pertinent information. In all, the app indexes over 180 different software tools. Figure 3-3 shows an early version of the app, which is still under development.

Figure 3-3. The SynBio App Selector is an intuitive-to-use, all-in-one collection of software applications, tutorials, and resources related to synthetic biology. Navigate the menus by dragging and scrolling up and down, and click on the icons to view a list of software belonging to that category.
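
The class/subclass organization described above can be pictured as a small tree of categories. The following Python sketch is illustrative only and is not the app's actual implementation (the app itself is built with HTML5 and JavaScript). The type names `Tool` and `Category`, the two example tools, and the choice to nest "primer design" under "lab tools" are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tool:
    """One catalog entry; fields mirror the metadata listed in the text."""
    name: str
    description: str
    status: str   # development status, e.g. "prototype" or "active"
    license: str


@dataclass
class Category:
    """A class or subclass of tools; categories nest to form the hierarchy."""
    name: str
    tools: list[Tool] = field(default_factory=list)
    children: dict[str, Category] = field(default_factory=dict)

    def child(self, name: str) -> Category:
        """Return the named subcategory, creating it on first use."""
        return self.children.setdefault(name, Category(name))

    def find(self, path: list[str]) -> Category:
        """Walk a path such as ["lab tools", "primer design"] from this node."""
        node = self
        for part in path:
            node = node.children[part]
        return node

    def all_tools(self) -> list[Tool]:
        """Collect tools from this category and every subcategory below it."""
        found = list(self.tools)
        for sub in self.children.values():
            found.extend(sub.all_tools())
        return found


# Hypothetical entries; the real catalog indexes over 180 tools.
root = Category("root")
root.child("lab tools").child("primer design").tools.append(
    Tool("ExamplePrimerTool", "Hypothetical primer designer", "prototype", "MIT")
)
root.child("simulations").tools.append(
    Tool("ExampleSimTool", "Hypothetical circuit simulator", "active", "GPL-3.0")
)

# Clicking a category icon corresponds to listing everything beneath it.
print([tool.name for tool in root.find(["lab tools"]).all_tools()])
```

Storing tools at any level of the tree, rather than only at the leaves, lets a category list its own tools and those of its subclasses with one recursive call.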

The second project under development at Leukippos is SynBrick (Figure 3-4), a crowdsourcing game in which players work together to solve engineering challenges using synthetic biology: designing biological systems to produce biofuels or medicines, diagnose diseases, or clean up hazardous waste, to mention a few possibilities. SynBrick takes advantage of the concept of BioBricks, standardized genetic components that can be mixed and matched to build different biological systems, and is built on a similar modular scheme.

Figure 3-4. SynBrick (from http://www.iwbdaconf.org/2013/proceedings/) is a game played in teams where the aim is to solve complex synthetic biology problems. This figure shows SynBrick’s structure and problem-solving design strategy. Arrows denote flow of information. Standardized biological parts are the building blocks of the game.
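
To make the "mix and match" idea of standardized parts concrete, here is a minimal Python sketch. It is not SynBrick code: the part names are invented, and the fixed promoter, RBS, coding sequence, terminator order is a common textbook layout for a simple expression device that is assumed here for illustration.

```python
from dataclasses import dataclass

# Assumed canonical order for a simple expression device.
DEVICE_ORDER = ("promoter", "rbs", "cds", "terminator")


@dataclass(frozen=True)
class Part:
    """A standardized part; `kind` is what makes parts interchangeable."""
    name: str
    kind: str


def assemble(parts):
    """Return the device as a tuple of parts, or raise if the order is invalid."""
    kinds = tuple(part.kind for part in parts)
    if kinds != DEVICE_ORDER:
        raise ValueError(f"expected {DEVICE_ORDER}, got {kinds}")
    return tuple(parts)


def swap(device, new_part):
    """Replace the part of the same kind, leaving the rest of the device as is."""
    return tuple(new_part if part.kind == new_part.kind else part for part in device)


# Hypothetical part names, used only for illustration.
device = assemble([
    Part("pDetect", "promoter"),
    Part("rbsA", "rbs"),
    Part("redPigment", "cds"),
    Part("termA", "terminator"),
])

# Because parts are standardized, one RBS can be exchanged for another
# without touching the rest of the design.
stronger = swap(device, Part("rbsB", "rbs"))
print([part.name for part in stronger])
```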

Complex problems like the ones outlined are decomposed into simpler tasks. For example, if we are building a system that turns water red when a pollutant is detected, we can break it down into two separate devices: one that detects the pollutant and another that produces red pigment. These simpler tasks are then solved by players (i.e., collaborators or any willing participants), who devise in silico biological parts (biogates, biocircuits, biosystems) called BioBricks and combine them so that each simple task is solved. In SynBrick, players will be challenged to solve specific, simple tasks using a virtual BioBrick toolbox. The game evaluates the best solutions by simulating the genetic circuits built by players, based on the characterization information available for each BioBrick in the Registry of Standard Biological Parts. The first version of SynBrick is still under development. However, you can read more on this project or SynBio App Selector in the Proceedings of the International Workshop on Bio-Design Automation 2013 (see page 64).
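
The pollutant example can be sketched as two small functions connected in series, followed by a toy evaluation step. This is only an illustration of decomposition and simulation-based scoring. The text does not say which simulation model SynBrick uses, so the Hill function below is a stand-in, and every parameter value, the `score` rule, and the function names are assumptions. A real evaluation would draw its parameters from the characterization data of each part.

```python
def hill(x, k, n):
    """Fraction of maximal activation at input level x (Hill function)."""
    return x**n / (k**n + x**n)


def detector(pollutant, k=1.0, n=2.0):
    """Device 1: maps pollutant level to an intermediate signal in [0, 1)."""
    return hill(pollutant, k, n)


def reporter(signal, k=0.3, n=2.0, max_pigment=100.0):
    """Device 2: maps the intermediate signal to red pigment output."""
    return max_pigment * hill(signal, k, n)


def system(pollutant):
    """The full system is the two devices connected in series."""
    return reporter(detector(pollutant))


def score(design, threshold=50.0):
    """Toy rule: pigment must stay low without pollutant and be high with it."""
    return design(0.0) < threshold < design(10.0)


for level in (0.0, 0.5, 10.0):
    print(level, round(system(level), 1))
print("passes toy check:", score(system))
```

Because each device is a separate function, a player's alternative detector or reporter could be substituted and scored without changing the other half of the system.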

The authors thank Kevin Chen (McGill University) for critical reading of the paper and his valuable comments.

Correspondence to Dr. Gerd Moe-Behrens: .