Team:Tsinghua/Description
Brief Introduction
As iGEM Tsinghua 2015, we established a biological information storage platform that takes visible light as input and stores information as DNA sequences edited by a modified recombinase. We developed a hardware device with supporting software to carry out this work with genetically modified bacteria. Stored information is read out by DNA sequencing and then decoded by our software. In the future, this system could let users store information from any computer file, or from elsewhere, in bacteria using light, and read it back by sequencing with a single click.
Light-switchable two-component system
Synthetic photobiology is now a relatively mature field in which scientists take light-sensing systems from all sorts of organisms and integrate them into bacteria. In addition, components and modules of light-responsive proteins from different species have been engineered together to improve efficiency. Therefore, if one form of signal must be chosen as the input, light is the favored choice.
Advantages
Using light as an input signal has clear advantages. First, it offers extremely high spatial and temporal precision, unlike small chemical molecules, which diffuse and are diluted when the bacteria proliferate or the culture medium is changed. Second, light is easy to access and cheap, which is why light systems are so widely used. For example, a light-emitting diode (LED) usually costs less than 10 cents. Third, optical stimulation is noninvasive and mild, unlike thermal, mechanical and chemical stimulation, which might put the bacteria in jeopardy. Its minimal off-pathway effect is also essential when adding arbitrary light-responsive elements to the bacteria. Fourth, it is potentially orthogonal and programmable. Different light systems generally do not interfere with each other, and can therefore be stimulated and silenced in parallel.
Principle
The light-switchable two-component system (TCS) is one example of how a light signal can be wired into a metabolic pathway within bacteria. As its name indicates, the system is switchable: it has two interchangeable states that depend on the light conditions. It also has two components: a light sensor and a response regulator. The former senses incoming light and responds by changing its conformation; the latter reacts to the sensor and, being a transcription factor, turns gene expression on or off. More specifically, the light sensor component is made up of two modules: the light-sensing module itself and an effector that possesses both kinase and phosphatase activity.
The sensor and the effector interact closely to give a precise light-induced response. The principle is as follows: when light hits the bacteria, the effector module of the sensor, i.e., the histidine kinase (HK) domain, changes its conformation, and its catalytic activity switches from kinase to phosphatase. Consequently, its target response regulator (RR) is dephosphorylated and thereby inactivated. The RR can then no longer recognize its downstream target sequence or activate expression of the reporter gene. All three light systems we used follow this basic scheme.
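The scheme described above can be sketched as a small boolean model. This is only an illustration of the logic as stated in the text, not a kinetic model; the function names are hypothetical, and real systems respond gradually rather than as an on/off switch.

```python
def hk_activity(light_on):
    """Return the dominant activity of the HK effector module.

    Assumption taken from the text: light switches the HK domain
    from kinase to phosphatase.
    """
    return "phosphatase" if light_on else "kinase"


def rr_is_phosphorylated(activity):
    """The response regulator stays phosphorylated only under kinase activity."""
    return activity == "kinase"


def reporter_expressed(light_on):
    """A phosphorylated RR binds its target promoter and drives the reporter."""
    return rr_is_phosphorylated(hk_activity(light_on))


if __name__ == "__main__":
    for light_on in (False, True):
        state = "ON" if reporter_expressed(light_on) else "OFF"
        print(f"light_on={light_on}: reporter {state}")
```

In this idealised picture the reporter is expressed in the dark and silenced under light of the matching wavelength.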
Classification
Three types of TCS are currently the most commonly investigated: the red, blue and green light systems, each named after the wavelength to which it responds.
The red-light system used in our project consists of two components: a membrane-bound light sensor, Cph8, and a response regulator, OmpR. The light sensor is made up of a red-light-sensitive cyanobacterial phytochrome sensor module, Phy, derived from the protein Cph1 of Synechocystis sp. PCC 6803, and a histidine kinase domain from the E. coli protein EnvZ. The response regulator is OmpR, whose recognition site is the OmpC promoter. Red light induces a reversible conformational switch in Cph8, leading to loss of kinase activity. OmpR, a substrate of the Cph8 kinase, is then dephosphorylated, which prevents it from binding the OmpC promoter and driving expression of the downstream genes. Because E. coli expresses components of this system endogenously, a knock-out strain is used to avoid potentially confounding results.
The blue-light system follows similar principles and also contains two components. Its light sensor is likewise a hybrid protein made up of modules from different species. The blue-light-sensitive LOV domain of the soluble light sensor YF1 is derived from the protein YtvA of B. subtilis, whereas the histidine kinase domain, derived from the protein FixL, and the response regulator FixJ come from B. japonicum. In this system, a Jα chain links the light sensor module to the effector. Light induces a conformational change in this linker, switching YF1 (the fusion protein) from a kinase to a phosphatase. The response regulator FixJ is thus dephosphorylated and loses the ability to drive gene expression from the FixK2 promoter.
The green-light system works in a very similar way to the blue-light system: it comprises two essential components, a light sensor and a response regulator. Here, however, the light sensor module is designated Cyb and, together with its histidine kinase, constitutes the light sensor component CcaS. Its response regulator, CcaR, recognizes the PcpcG2 promoter and in turn regulates the downstream genes. Both are components of a cyanobacteriochrome system.
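For quick reference, the three systems described above can be collected in a small lookup table. The dictionary layout and the helper name `describe` are assumptions made for illustration; the component names are those given in the text.

```python
# Components of the three light-switchable TCSs named in the text.
LIGHT_SYSTEMS = {
    "red": {"sensor": "Cph8", "regulator": "OmpR", "promoter": "OmpC"},
    "blue": {"sensor": "YF1", "regulator": "FixJ", "promoter": "FixK2"},
    "green": {"sensor": "CcaS", "regulator": "CcaR", "promoter": "PcpcG2"},
}


def describe(colour):
    """Return a one-line summary of the system for a given light colour."""
    parts = LIGHT_SYSTEMS[colour]
    return (
        f"{colour}: sensor {parts['sensor']} acts on regulator "
        f"{parts['regulator']}, which targets the {parts['promoter']} promoter"
    )


if __name__ == "__main__":
    for colour in LIGHT_SYSTEMS:
        print(describe(colour))
```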
dCas9-recombinase system
There are two commonly used gene-editing tools: site-specific recombinases and the CRISPR-Cas9 system.
A site-specific recombinase is an endonuclease capable of inserting, deleting and inverting a DNA fragment at its recognition sites. Two families of recombinases have generally been identified: the tyrosine recombinases and the serine recombinases. Although a given organism may favor one particular outcome of recombination, be it insertion, deletion or inversion, the other editing modes can also be selected through deliberate engineering. As a result, a recombinase system is an ideal candidate for executing information storage.
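The three editing outcomes can be illustrated on a plain DNA string. This is a minimal sketch that treats the fragment boundaries as given indices; it does not model recognition sites or any particular recombinase, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def invert(seq, start, end):
    """Inversion: the fragment seq[start:end] is flipped in place."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


def delete(seq, start, end):
    """Deletion: the fragment seq[start:end] is excised."""
    return seq[:start] + seq[end:]


def insert(seq, pos, fragment):
    """Insertion: a fragment is integrated at position pos."""
    return seq[:pos] + fragment + seq[pos:]


if __name__ == "__main__":
    dna = "AAGGCTT"
    print(invert(dna, 2, 5))         # AAGCCTT
    print(delete(dna, 2, 5))         # AATT
    print(insert("AATT", 2, "GGC"))  # AAGGCTT
```

Inverting the same fragment twice restores the original sequence, which is what makes an invertible segment usable as a rewritable bit.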
Recombinases have previously been used for information storage in biological systems because of their specificity. However, each recombinase binds its own unique recognition sites, so this same specificity is also the limitation of the approach: every increase in storage capacity requires a new recombinase, and finding a new recombinase that suits the need is computationally expensive. We therefore sought help from another gene-editing tool.
Cas9, an endonuclease from Streptococcus pyogenes, can target and cleave specific DNA sequences located next to a protospacer adjacent motif (PAM) when provided with a guide RNA. With the advancement of gene-editing technology, the CRISPR/Cas9 system has been exploited to carry out a myriad of functions, such as gene knock-out and knock-down, single-molecule imaging, and more. We therefore turned to Cas9 to see whether it could overcome the specificity limitation of recombinases.
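To make the PAM requirement concrete, the sketch below scans a sequence for candidate target sites. It assumes the commonly cited S. pyogenes Cas9 parameters, a 20-nt protospacer followed directly by an NGG PAM, and searches the given strand only; the function name `find_targets` is hypothetical.

```python
def find_targets(seq, protospacer_len=20):
    """List candidate Cas9 target sites on the given strand.

    Returns (position, protospacer, pam) tuples for every window of
    protospacer_len bases that is immediately followed by an NGG PAM.
    """
    seq = seq.upper()
    hits = []
    # Stop early enough that the 3-base PAM always fits after the protospacer.
    for i in range(len(seq) - protospacer_len - 2):
        pam = seq[i + protospacer_len : i + protospacer_len + 3]
        if pam[1:] == "GG":
            hits.append((i, seq[i : i + protospacer_len], pam))
    return hits


if __name__ == "__main__":
    example = "A" * 20 + "TGG" + "CCC"
    for position, protospacer, pam in find_targets(example):
        print(position, protospacer, pam)
```

A practical design tool would also scan the reverse strand and screen for off-target matches; both are omitted here for brevity.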
The CRISPR-Cas9 system is a recently developed gene-editing tool that removes the restriction to specific recognition sites. Guided by sgRNAs, the Cas9 endonuclease can conveniently be directed to modify any site in the genome. It can therefore be regarded as a complementary DNA cutter: rather than being restricted to unique sequences, it is versatile enough to recognize any sequence in the genome specified by its sgRNA. This means that if the current information storage capacity is insufficient, we do not need to search for a new recombinase; changing the sgRNA pairs solves the problem. However, accurate deletion or inversion, a vital requirement for an information storage platform, is hard to accomplish because of the double-strand breaks introduced by the Cas9 endonuclease. In other words, we still rely on the specificity and accuracy of the recombinase, but we also need the assistance of Cas9.
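The capacity argument can be illustrated with a toy encoding in which each addressable site holds one bit, for example the orientation of an invertible segment. This one-bit-per-site scheme is an assumption made for illustration and is not necessarily the encoding used by our software; all function names are hypothetical.

```python
def bytes_to_bits(data):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits):
    """Pack a list of bits (length a multiple of 8) back into bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("number of bits must be a multiple of 8")
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i : i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def sites_needed(data):
    """Number of independently addressable sites needed at one bit per site."""
    return 8 * len(data)


if __name__ == "__main__":
    message = b"A"
    bits = bytes_to_bits(message)
    print(bits)                   # [0, 1, 0, 0, 0, 0, 0, 1]
    print(sites_needed(message))  # 8
    print(bits_to_bytes(bits))    # b'A'
```

Under this assumption, a purely recombinase-based design would need one orthogonal recombinase per site, whereas a guided design would need only a new sgRNA pair per site.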