Difference between revisions of "Team:Tsinghua/Design"

Line 1: Line 1:
 
{{Tsinghua}}
<p align="center"><strong>Light-switchable two-component system</strong><br>
+
<p><strong>Brief Introduction</strong><br>
 +
  As iGEM team Tsinghua 2015, we established a biological information storage platform that takes visible light as input and stores information as DNA sequences edited by a modified recombinase. We developed hardware with supporting software to carry out this work with genetically modified bacteria. Stored information is read out by DNA sequencing and then decoded by our software. With this system, one could in the future store information from any computer file or other source in bacteria using light, and read it back by sequencing.<br>
 +
  <strong>Light-switchable two-component system</strong><br>
 
   Synthetic photobiology has become a relatively mature field in which scientists take light-sensing systems from many kinds of organisms and integrate them into bacteria. In addition, components and modules of light-responsive proteins from different species have been engineered together to improve efficiency. Therefore, if we have to choose one form of signal as the input, light is the favored choice. <br>
 
   <strong>Advantages</strong><br>
Line 6: Line 8:
 
   <strong>Principle</strong><br>
 
   The light-switchable two-component system (TCS) is one example of how a light signal can be wired into a metabolic pathway within bacteria. As its name indicates, the system is switchable: it has two interchangeable states under different light conditions. It also has two components: a light sensor and a response regulator (RR). The sensor detects incoming light and responds by changing its conformation; the response regulator reacts to the sensor and, being a transcription factor, turns gene expression on or off. More specifically, the light sensor is made up of two modules: the light-sensing module itself and an effector, a histidine kinase (HK) domain that possesses both kinase and phosphatase activity. <br>
   The sensor and the effector interact closely to give a precise light-induced response. The principle is as follows: when light hits the bacteria, the effector module in the sensor, i.e., the HK domain, changes its conformation, and its catalytic activity switches from that of a kinase to that of a phosphatase. Consequently, its target response regulator (RR) is dephosphorylated and thereby inactivated. As a result, the RR cannot recognize its downstream target sequence and cannot activate expression of the reporter gene. The three mainstream light systems we used all follow this basic scheme.<br>
   <strong>Design</strong><br>
 
   Three types of TCS are currently the most commonly investigated: the red-, blue- and green-light systems, each named after the wavelength to which it responds.<br>
 
   The red-light system used in our project consists of two components: a membrane-bound light sensor, Cph8, and a response regulator, OmpR. The light sensor is made up of a red-light-sensitive cyanobacterial phytochrome sensor module, Phy, derived from the protein Cph1 of <em>Synechocystis</em> sp. PCC 6803, and a histidine kinase domain from the <em>E. coli</em> protein EnvZ. The response regulator is OmpR, whose recognition site is the OmpC promoter. Red light induces a reversible conformational switch in Cph8, leading to loss of kinase activity. OmpR, a substrate of the Cph8 kinase, is then dephosphorylated, which prevents it from binding to the OmpC promoter and driving expression of downstream genes. Because components of the red-light system are expressed endogenously in <em>E. coli</em>, a knock-out strain is used to avoid potentially confusing results.<br>
 
   The blue-light system follows similar principles and also contains two components. Its sensor is likewise a hybrid protein made up of modules from different species. The blue-light-sensitive LOV domain of the soluble light sensor YF1 is derived from the protein YtvA of <em>B. subtilis</em>, whereas the histidine kinase domain, derived from the protein FixL, and the response regulator FixJ come from <em>B. japonicum</em>. In this system, a Jα helix links the light sensor and the effector; light induces a conformational change in this linker, switching YF1 (the fusion protein) from a kinase to a phosphatase. The response regulator FixJ is thus dephosphorylated and loses the ability to drive gene expression from the FixK2 promoter.<br>
 
   The green-light system works in a way very similar to the blue-light system: it comprises two essential components, a light sensor and a response regulator. Here, however, the light-sensing module is designated Cyb; together with its histidine kinase it constitutes the light sensor component CcaS. Its response regulator, CcaR, recognizes the PcpcG2 promoter and in turn regulates the downstream genes. These components come from a cyanobacteriochrome system.<br>
   <strong>dCas9-recombinase system</strong><br>
 
   There are two commonly used gene-editing tools: site-specific recombinases and the CRISPR-Cas9 system. <br>
 
   A site-specific recombinase is an endonuclease that can insert, delete or invert a DNA fragment at its recognition sites. Two families of recombinases have been identified: the tyrosine recombinases and the serine recombinases. Although one particular outcome of recombination, be it insertion, deletion or inversion, is preferred in a given organism, the other editing modes can also be selected by deliberate engineering. As a result, a recombinase system is an ideal candidate for executing information storage. <br>
   Cas9, an endonuclease from <em>Streptococcus pyogenes</em>, can target and cleave specific DNA sequences next to a protospacer adjacent motif (PAM) when provided with a guide RNA. With the advancement of gene-editing technology, the CRISPR/Cas9 system has been exploited for a myriad of functions, such as gene knock-out and knock-down and single-molecule imaging. It is therefore easy to see why we turned to Cas9 to see whether it could overcome the specificity limitation of recombinases.<br>
   Recombinases have previously been used for information storage in biological systems because of their specificity. However, each recombinase binds its own unique recognition sites, and it is exactly this specificity that limits the approach. In other words, a major drawback of such an information-storing platform is that a new recombinase has to be introduced every time the storage capacity is increased, and finding a new recombinase that suits the need is computationally heavy. We therefore decided to seek help from another gene-editing tool.<br>
   The CRISPR-Cas9 system is a recently developed gene-editing tool that removes the restriction to specific recognition sites. Guided by sgRNAs, the Cas9 endonuclease can conveniently be used to modify essentially any site in the genome. It can thus be regarded as a complementary DNA cutter that is not restricted to unique sequences but can recognize whatever sequence its sgRNA specifies. This means that if the current information storage capacity is not enough, we do not need to search for a new recombinase; changing the sgRNA pairs solves the problem. However, accurate deletion or inversion, a vital requirement for an information-storing platform, is hard to accomplish because of the double-strand breaks introduced by the Cas9 endonuclease. An outstanding feature of tyrosine recombinases is that they do not introduce double-strand breaks, which might cause unwanted consequences; instead, recombination proceeds through a cross-strand intermediate called a Holliday junction. That is to say, we still count on the specificity and accuracy of the recombinase, but we also need assistance from Cas9.<br>
 
   Therefore, we decided to combine the two. We deleted the DNA-binding domains of Bxb1 and Flp and fused their catalytic and dimerization domains to dCas9, which has lost its endonuclease activity. A pair of sgRNAs targeting the sense and antisense strands of the DNA sequence guides the fusion protein to the target location, and the recombinase part of the fusion protein carries out deletion or inversion of the DNA segment. With this new tool we no longer need to design and introduce recombinase recognition sites. Also, because no double-strand breaks are made, gene editing can be safer and more accurate.<br>
   To achieve the highest efficiency, we need to optimize two aspects: linker design and the distance between the sgRNAs of a pair. We built an inducible ccdB screening system for this purpose. ccdB encodes a lethal protein: adding IPTG induces expression of ccdB and kills the bacteria, but if the ccdB gene is successfully disrupted by the dCas9-recombinase, the bacteria survive. Along the ccdB sequence we designed 25 sgRNAs targeting one strand and 24 targeting the other, giving 25 × 24 = 600 sgRNA pairs in total, with distances ranging from 0 bp to 700 bp. The effects of different linker designs are also tested using this inducible ccdB system.<br>
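The pair count above can be checked with a short sketch. The target positions below are hypothetical, evenly spaced placeholders (the real sgRNA coordinates are not given here), and the helper name is an assumption made for illustration.

```python
from itertools import product

def enumerate_pairs(sense_positions, antisense_positions):
    """Return (sense_index, antisense_index, distance_bp) for every cross-strand pair."""
    return [
        (i, j, abs(s - a))
        for (i, s), (j, a) in product(
            enumerate(sense_positions), enumerate(antisense_positions)
        )
    ]

# Hypothetical target positions (bp) along the ccdB region, NOT the team's real guides.
sense = [i * 28 for i in range(25)]      # 25 sgRNAs on one strand
antisense = [i * 29 for i in range(24)]  # 24 sgRNAs on the other strand

pairs = enumerate_pairs(sense, antisense)
print(len(pairs))  # one entry per sense/antisense combination

# Group the pairs by distance so that each spacing can be screened separately.
by_distance = {}
for i, j, d in pairs:
    by_distance.setdefault(d, []).append((i, j))
```

Grouping by distance is one possible way to organise the screen; any binning of the spacings would serve the same purpose.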
 
   We established a model of the inducible ccdB system that relates IPTG addition to the measured OD value, taking into account parameters for IPTG, ccdB, the relative concentration of bacteria and the measured OD. The inducible ccdB system turned out to work well, so it is suitable for screening the optimal distance between sgRNA pairs and an appropriate linker. The inducible ccdB system also has other potential future applications.<br>
   Although the screening work was not finished, one linker choice and one distance between the two sgRNAs worked well. These were tested with a separate system based on BFP expression.</p>
+
   Although the screening work was not finished, one linker choice and one distance between the two sgRNAs worked well. These were tested with a separate system based on BFP expression.<br>
 +
  After constructing all the required systems and confirming their efficacy, we can bridge the light-switchable TCS and the dCas9-recombinase system. In this way, precise gene editing and information storage can be achieved by using the light system to regulate the dCas9-recombinase hybrid. </p>
 +
<p align="left">n into E. coli by combining the light system with the gene-editing tools. We took advantage of the high precision and programmability of the light system and the specificity and convenience of a Cas9-recombinase hybrid. To build the information storage platform described above, we devised hardware, assisted by software, that can eventually convert any kind of file into biologically meaningful information. <br>
 +
  For the light system, we selected light-switchable two-component systems as the signal input and intended to rely on three commonly used ones: red, blue and green. We applied an engineering strategy to these two-component systems, combining modules and components from different species to achieve the highest efficiency.<br>
 +
  For the Cas9-recombinase system, we selected a recombinase as the gene-editing tool because of its specificity for consensus sequences. Yet it is this very advantage that limits its application, because it makes upgrading the storage capacity inconvenient. We therefore complemented it with the CRISPR/Cas9 system, which is guided by an sgRNA pair and is not limited to specific sequences. Minor changes, however, were made to render it more applicable.<br>
 +
  Given the ideas above, how can we put all the parts together to store information in E. coli? A straightforward strategy is to use the light-switchable two-component systems to directly control the gene-editing hybrid. This is the basic philosophy behind our information storage platform. For example, we can assign the blue-light system to control writing of &ldquo;0&rdquo; and the red-light system to control writing of &ldquo;1&rdquo;. The green-light system represents neither binary value; instead it acts as a license that allows the recombinase to work.</p>
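The colour assignment described above (blue for "0", red for "1", green as a license) can be sketched as a small mapping. This is a minimal sketch under one assumption that is ours, not the team's: that the green license light is switched on together with the data colour at every writing step. The function name is hypothetical.

```python
def bits_to_light_steps(bits):
    """Map a bit string to a list of (license_colour, data_colour) exposure steps."""
    steps = []
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary digit: {bit!r}")
        data_colour = "blue" if bit == "0" else "red"
        # Assumption: green is on at every step to license the recombinase.
        steps.append(("green", data_colour))
    return steps

print(bits_to_light_steps("0110"))
```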
 +
<p><strong>What have we done?</strong><br>
 +
  To use light as an input signal, we first have to measure its basic parameters for later reference. That is why we first constructed several plasmids for measurement. Two types of experiments were done to meet this need: a qualitative one and a quantitative one. The results convincingly support that the light-switchable two-component system can work in E. coli. We additionally built a model of the relationship between light input and protein expression output based on previous results.<br>
 +
  For the Cas9-recombinase system, we designed an IPTG-inducible ccdB screening system to test how effective its gene editing is. Ultimately, 600 possible sgRNA combinations can be tested using this screening strategy. With it we can also determine the optimal distance between sgRNA pairs and the optimal linker length. Before that, however, we first needed to determine the basic parameters of the inducible system itself. Again, qualitative and quantitative experiments were carried out, and the promising results indicate that the inducible system can work in E. coli. We also built models of the relationship between the concentration of added IPTG and the optical density of the bacterial culture.<br>
 +
  With the two systems measured, it is time to combine them. To this end, we devised hardware that can emit light signals onto the bacteria instantaneously and in massive parallel. With the assistance of the software, we can either convert a file into a binary data string, which is transformed into a light-emitting pattern by a coding protocol (a pre-programmed grammar) and in turn encoded into the bacteria by a modified recombinase, or enter light parameters directly and encode the information into the bacteria. &nbsp;<br>
 +
  The E-light 1.0 hardware system has three major components: the light-exposure &amp; bacterial culture system, the control circuit and the computer interface port. The light-exposure &amp; bacterial culture system is based on a 24-well plate coupled with tri-color LEDs. The control circuit uses three AT89S52-24PU DIP-40 single-chip microcomputers (SCMs) to execute programmed control of the 24 tri-color LEDs, while the computer interface port monitors the whole system through given protocol sequences. The result is programmable operation and real-time monitoring of light exposure (both timing and wavelength) in every single well.<br>
 +
The E-code 1.0 software system aims to provide convenient control for users of the E-light hardware system. The software provides two operating modes. The E.coli-code mode converts any given information into light-coded files and then turns these files into actual light-exposure commands for the E-light hardware system. With the help of the coding plasmids from our CRISPR-Recombinase system, we can eventually store any information in E. coli DNA and later extract it through sequencing. The self-code mode provides more flexible input options, enabling users to program the light-exposure commands manually for every single bacterial culture unit. Thus, combined with our light switch, the user can gain better control over the bacteria&rsquo;s metabolic pathways.</p>
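The file-to-bits step of the E.coli-code mode described above might look roughly like the sketch below. It is not the E-code 1.0 implementation: the function names, the 8-bits-per-byte encoding and the contiguous split across the 24 wells are all assumptions made for illustration.

```python
def bytes_to_bits(data: bytes) -> str:
    """Encode raw bytes as a string of '0'/'1' characters, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in data)

def bits_to_bytes(bits: str) -> bytes:
    """Decode a bit string (length a multiple of 8) back into bytes."""
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

def assign_to_wells(bits: str, n_wells: int = 24) -> list:
    """Split the bit string into at most n_wells contiguous chunks (assumed layout)."""
    if not bits:
        return []
    size = -(-len(bits) // n_wells)  # ceiling division
    return [bits[i:i + size] for i in range(0, len(bits), size)]

message = b"iGEM"
bits = bytes_to_bits(message)
chunks = assign_to_wells(bits)
# Reading back corresponds to sequencing and decoding: join the chunks, then decode.
recovered = bits_to_bytes("".join(chunks))
print(recovered == message)
```

A real coding protocol would also need to record chunk order and handle errors; both are left out of this sketch.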
 +
<p><strong>What can we do in the future?</strong></p>
 +
 
  
 
<html>
 
<html>

Revision as of 22:44, 18 September 2015

Brief Introduction
As iGEM team Tsinghua 2015, we established a biological information storage platform that takes visible light as input and stores information as DNA sequences edited by a modified recombinase. We developed hardware with supporting software to carry out this work with genetically modified bacteria. Stored information is read out by DNA sequencing and then decoded by our software. With this system, one could in the future store information from any computer file or other source in bacteria using light, and read it back by sequencing.
Light-switchable two-component system
Synthetic photobiology has become a relatively mature field in which scientists take light-sensing systems from many kinds of organisms and integrate them into bacteria. In addition, components and modules of light-responsive proteins from different species have been engineered together to improve efficiency. Therefore, if we have to choose one form of signal as the input, light is the favored choice.
Advantages
Using light as an input signal has clear advantages. First, it offers extremely high spatial and temporal precision, unlike small molecules, which diffuse and are diluted when bacteria proliferate or the culture medium is changed. Second, light sources are easy to obtain and cheap, which is why light systems are widely used; for example, a light-emitting diode (LED) usually costs less than 10 cents. Third, optical stimulation is noninvasive and mild, unlike thermal, mechanical and chemical stimulation, which can harm the bacteria. Its minimal off-pathway effect is also essential when adding light-responsive elements to the bacteria. Fourth, it is potentially orthogonal and programmable: different light systems generally do not interfere with each other and can therefore be stimulated and silenced in parallel.
Principle
The light-switchable two-component system (TCS) is one example of how a light signal can be wired into a metabolic pathway within bacteria. As its name indicates, the system is switchable: it has two interchangeable states under different light conditions. It also has two components: a light sensor and a response regulator (RR). The sensor detects incoming light and responds by changing its conformation; the response regulator reacts to the sensor and, being a transcription factor, turns gene expression on or off. More specifically, the light sensor is made up of two modules: the light-sensing module itself and an effector, a histidine kinase (HK) domain that possesses both kinase and phosphatase activity.
The sensor and the effector interact closely to give a precise light-induced response. The principle is as follows: when light hits the bacteria, the effector module in the sensor, i.e., the HK domain, changes its conformation, and its catalytic activity switches from that of a kinase to that of a phosphatase. Consequently, its target response regulator (RR) is dephosphorylated and thereby inactivated. As a result, the RR cannot recognize its downstream target sequence and cannot activate expression of the reporter gene. The three mainstream light systems we used all follow this basic scheme.
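The switching logic described above can be written as a small truth function. This is only an illustrative sketch: the function name and the all-or-nothing boolean simplification are assumptions, not part of the team's model.

```python
def reporter_expressed(light_on: bool) -> bool:
    """Boolean sketch of the scheme in the text (hypothetical helper)."""
    # Light switches the HK domain from kinase to phosphatase activity.
    hk_activity = "phosphatase" if light_on else "kinase"
    # The response regulator (RR) stays phosphorylated only under kinase activity.
    rr_phosphorylated = hk_activity == "kinase"
    # Only the phosphorylated RR activates the reporter gene.
    return rr_phosphorylated

for light_on in (False, True):
    print(f"light={light_on} -> reporter expressed={reporter_expressed(light_on)}")
```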
Design
Three types of TCS are currently the most commonly investigated: the red-, blue- and green-light systems, each named after the wavelength to which it responds.
The red-light system used in our project consists of two components: a membrane-bound light sensor, Cph8, and a response regulator, OmpR. The light sensor is made up of a red-light-sensitive cyanobacterial phytochrome sensor module, Phy, derived from the protein Cph1 of Synechocystis sp. PCC 6803, and a histidine kinase domain from the E. coli protein EnvZ. The response regulator is OmpR, whose recognition site is the OmpC promoter. Red light induces a reversible conformational switch in Cph8, leading to loss of kinase activity. OmpR, a substrate of the Cph8 kinase, is then dephosphorylated, which prevents it from binding to the OmpC promoter and driving expression of downstream genes. Because components of the red-light system are expressed endogenously in E. coli, a knock-out strain is used to avoid potentially confusing results.
The blue-light system follows similar principles and also contains two components. Its sensor is likewise a hybrid protein made up of modules from different species. The blue-light-sensitive LOV domain of the soluble light sensor YF1 is derived from the protein YtvA of B. subtilis, whereas the histidine kinase domain, derived from the protein FixL, and the response regulator FixJ come from B. japonicum. In this system, a Jα helix links the light sensor and the effector; light induces a conformational change in this linker, switching YF1 (the fusion protein) from a kinase to a phosphatase. The response regulator FixJ is thus dephosphorylated and loses the ability to drive gene expression from the FixK2 promoter.
The green-light system works in a way very similar to the blue-light system: it comprises two essential components, a light sensor and a response regulator. Here, however, the light-sensing module is designated Cyb; together with its histidine kinase it constitutes the light sensor component CcaS. Its response regulator, CcaR, recognizes the PcpcG2 promoter and in turn regulates the downstream genes. These components come from a cyanobacteriochrome system.
dCas9-recombinase system
There are two commonly used gene-editing tools: site-specific recombinases and the CRISPR-Cas9 system.
A site-specific recombinase is an endonuclease that can insert, delete or invert a DNA fragment at its recognition sites. Two families of recombinases have been identified: the tyrosine recombinases and the serine recombinases. Although one particular outcome of recombination, be it insertion, deletion or inversion, is preferred in a given organism, the other editing modes can also be selected by deliberate engineering. As a result, a recombinase system is an ideal candidate for executing information storage.
Cas9, an endonuclease from Streptococcus pyogenes, can target and cleave specific DNA sequences next to a protospacer adjacent motif (PAM) when provided with a guide RNA. As gene-editing technology has advanced, the CRISPR/Cas9 system has been exploited to carry out a myriad of functions, such as gene knock-out, gene knock-down and single-molecule imaging. It was therefore natural for us to turn to Cas9 to see whether it could overcome the specificity limitation of recombinases.
Recombinases have previously been used to store information in biological systems because of their specificity. However, each recombinase binds its own unique recognition sites, and it is exactly this specificity that limits the approach: every increase in storage capacity requires a new recombinase. Finding a new recombinase that suits the need is computationally heavy. We therefore decided to seek help from another gene-editing tool.
The CRISPR-Cas9 system is a newly developed gene-editing tool that breaks the limit of specific recognition sites. Guided by sgRNAs, the Cas9 endonuclease can be used to modify any site of the genome conveniently. It can therefore be regarded as a complementary DNA cutter that is not restricted to unique sequences but can recognize any sequence within the genome specified by its sgRNA. This means that if the current information storage capacity is not enough, we do not need to search for a new recombinase; changing the sgRNA pairs solves the problem. However, accurate deletion or inversion, a vital requirement for an information storage platform, is hard to accomplish because of the double-strand breaks introduced by the Cas9 endonuclease. An outstanding feature of tyrosine recombinases is that they do not introduce double-strand breaks that might cause unwanted consequences. Instead, a cross-strand intermediate called a Holliday junction is formed during the process. That is to say, we still count on the specificity and accuracy of the recombinase, but we also need the assistance of Cas9.
Therefore, we decided to combine the two. We deleted the DNA-binding domains of Bxb1 and Flp and fused their catalytic and dimerization domains to dCas9, a Cas9 variant that has lost its endonuclease activity. A pair of sgRNAs targeting the sense and antisense strands of the DNA sequence guides the fusion protein to the target location, where the recombinase part of the fusion protein carries out deletion or inversion of the DNA segment. With this new tool, we no longer need to design and introduce recombinase recognition sites. In addition, because no double-strand breaks are introduced, gene editing can be safer and more accurate.
To achieve the highest efficiency, we need to optimize two aspects: linker design and the distance between the paired sgRNAs. We built an inducible ccdB screening system for this purpose. ccdB is a lethal protein: addition of IPTG induces ccdB expression and kills the bacteria, but if the ccdB gene is successfully disrupted by the dCas9-recombinase, the bacteria survive. Along the ccdB sequence we designed 25 sgRNAs targeting one strand and 24 targeting the other, giving 25 × 24 = 600 sgRNA pairs in total, with spacings ranging from 0 bp to 700 bp. The effects of different linker designs are also tested with this inducible ccdB system.
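The pair count above follows from combining every sgRNA on one strand with every sgRNA on the other. The short Python sketch below enumerates such pairs and their spacing. The target coordinates are evenly spaced placeholders, since the real sgRNA positions are not listed here, and all function names are hypothetical.

```python
from itertools import product

# Hypothetical target positions in bp; the real sgRNA coordinates are not
# given in the text, so evenly spaced placeholders are used instead.
SENSE_SITES = [i * 12 for i in range(25)]          # 25 sgRNAs on one strand
ANTISENSE_SITES = [i * 12 + 6 for i in range(24)]  # 24 sgRNAs on the other


def enumerate_pairs(sense, antisense):
    """Return every (sense_index, antisense_index, spacing_bp) combination."""
    return [
        (i, j, abs(a - s))
        for (i, s), (j, a) in product(enumerate(sense), enumerate(antisense))
    ]


def pairs_within(pairs, min_bp, max_bp):
    """Keep only pairs whose spacing lies in the closed range [min_bp, max_bp]."""
    return [p for p in pairs if min_bp <= p[2] <= max_bp]


pairs = enumerate_pairs(SENSE_SITES, ANTISENSE_SITES)
print(len(pairs))  # 25 * 24 = 600 candidate pairs
print(len(pairs_within(pairs, 0, 100)))  # pairs spaced at most 100 bp apart
```

Filtering by spacing in this way is one possible route to grouping the screening results by sgRNA distance once survival data are available.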
We established a model of the inducible ccdB system that relates IPTG addition to the measured OD value, taking into account IPTG, ccdB, the relative concentration of bacteria and the measured OD. The inducible ccdB system worked well, so it is suitable for screening the optimal distance between sgRNA pairs and an appropriate linker. In addition, the inducible ccdB system has other potential applications.
Although the screening work was not finished, one linker choice and one sgRNA spacing worked well. These were tested with a separate system based on BFP expression.
After constructing all the required systems and confirming their efficacy, we can bridge the light-switchable TCS and the dCas9-recombinase system. In this way, precise gene editing and information storage can be achieved by using the light system to regulate the dCas9-recombinase hybrid.

We aimed to store information in E. coli by combining the light system with the gene-editing tools. We took advantage of the high precision and programmability of the light system and of the specificity and convenience of a Cas9-recombinase hybrid. To build the information storage platform described above, we devised hardware, assisted by software, that can eventually convert any kind of file into biologically meaningful information.
For the light system, we selected light-switchable two-component systems as the signal input and intended to rely on three commonly used ones: red, blue and green. We applied an engineering strategy to these two-component systems, combining modules and components from different species to achieve the highest efficiency.
For the Cas9-recombinase system, we selected a recombinase as the gene-editing tool because of its specificity for consensus sequences. Yet it is this very advantage that limits its application, because it makes upgrading the storage capacity inconvenient. We therefore complemented it with the CRISPR/Cas9 system, which is guided by an sgRNA pair and is not limited to specific sequences. Minor changes were made to render it more applicable.
Given the ideas above, how can we put all the parts together to store information within E. coli? A straightforward strategy is to use the light-switchable two-component systems to control the gene-editing hybrid directly. This is the basic philosophy behind our information storage platform. For example, we can assign the blue-light system to control the writing of "0" and the red-light system to control the writing of "1". The green-light system represents neither of the two binary values; instead, it acts as a license that allows the recombinase to work.
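The colour assignment described above can be sketched as a small Python mapping between binary digits and LED states. This is an illustration under one loudly stated assumption: the green license light is switched on together with the data colour for every written bit, which the text does not specify. All names are hypothetical.

```python
# Sketch of the colour code: blue writes "0", red writes "1".
# Assumption: the green "license" light is on together with the data
# colour during every write step; the actual timing is not specified.

BIT_TO_COLOUR = {"0": "blue", "1": "red"}
LICENSE_COLOUR = "green"


def bits_to_light_steps(bits: str) -> list:
    """Translate a binary string into one set of active LED colours per bit."""
    steps = []
    for bit in bits:
        if bit not in BIT_TO_COLOUR:
            raise ValueError(f"not a binary digit: {bit!r}")
        steps.append(frozenset({BIT_TO_COLOUR[bit], LICENSE_COLOUR}))
    return steps


def light_steps_to_bits(steps) -> str:
    """Inverse mapping; steps without the license light write nothing."""
    colour_to_bit = {colour: bit for bit, colour in BIT_TO_COLOUR.items()}
    bits = []
    for step in steps:
        if LICENSE_COLOUR not in step:
            continue  # recombinase not licensed, so no bit is written
        data = set(step) - {LICENSE_COLOUR}
        if len(data) != 1:
            raise ValueError("exactly one data colour expected per step")
        bits.append(colour_to_bit[data.pop()])
    return "".join(bits)


steps = bits_to_light_steps("0110")
print(light_steps_to_bits(steps))  # round trip gives back "0110"
```

Keeping the license colour separate from the data colours means a stray red or blue exposure without green is ignored in this model.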

What have we done?
To use light as an input signal, we first have to measure its basic parameters so that they can be referred to later. That is why we first constructed several plasmids for measurement. Two types of experiments were done to meet this need: a qualitative one and a quantitative one. The results convincingly support that the light-switchable two-component system can work in E. coli. We additionally built a model of the relationship between the light input and the protein expression output based on previous results.
For the Cas9-recombinase system, we designed an IPTG-inducible ccdB screening system to test how effective its gene editing is. Eventually, 600 possible sgRNA combinations can be tested with this screening strategy. The same screening system can also be used to determine the optimal distance between sgRNA pairs and the length of the linker. That said, we first needed to determine the basic parameters of this inducible system. Again, qualitative and quantitative experiments were carried out, and the promising results indicate that the inducible system can work in E. coli. Models describing the relationship between the concentration of added IPTG and the optical density of the bacterial culture were built.
With the two systems measured, it is time to combine them. To meet this need, we devised hardware that can emit light signals onto the bacteria instantaneously and in a massively parallel fashion. With the assistance of the software, we can either convert a file into a binary data string, which is transformed into a light-emitting pattern by a coding protocol (a pre-programmed grammar) and in turn written into the bacteria by a modified recombinase, or enter light parameters directly and write the information into the bacteria.
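The first step of this pipeline, turning a file into a binary data string and back, can be sketched in a few lines of Python. The actual coding protocol is not described here, so the sketch assumes the simplest possible one: eight bits per byte, most significant bit first. The function names are hypothetical.

```python
def bytes_to_bits(data: bytes) -> str:
    """Encode raw bytes as a binary string, 8 bits per byte, MSB first."""
    return "".join(f"{byte:08b}" for byte in data)


def bits_to_bytes(bits: str) -> bytes:
    """Decode a binary string (e.g. recovered by sequencing) back into bytes."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    if set(bits) - {"0", "1"}:
        raise ValueError("bit string may contain only '0' and '1'")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


message = b"Hi"
encoded = bytes_to_bits(message)
print(encoded)                 # 0100100001101001
print(bits_to_bytes(encoded))  # b'Hi'
```

Any file read in binary mode can be passed through `bytes_to_bits`; the resulting string is what a colour-coding step would then consume one digit at a time.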
The E-light 1.0 hardware system has three major components: the light-exposure and bacterial culture system, the control circuit and the computer interface port. The light-exposure and bacterial culture system is based on a 24-well plate coupled with tri-color LEDs. The control circuit uses three AT89S52-24PU DIP-40 single-chip microcomputers (SCMs) to execute programmed control of the 24 tri-color LEDs, while the computer interface port monitors the whole system through given protocol sequences. The result is programmable operation and real-time monitoring of light exposure, in both timing and wavelength, for every single well.
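To make the idea of per-well control of timing and wavelength concrete, the Python sketch below builds an exposure schedule for a 24-well plate. It is not the E-light firmware or its protocol, neither of which is described here. The 4 × 6 well naming (A1 to D6), the command tuple layout and all function names are assumptions for illustration.

```python
from string import ascii_uppercase

ROWS, COLS = 4, 6  # assumed 4 x 6 layout of a 24-well plate
COLOURS = ("red", "green", "blue")  # one tri-color LED per well


def well_ids() -> list:
    """Return well names A1..D6 in row-major order."""
    return [f"{ascii_uppercase[r]}{c + 1}" for r in range(ROWS) for c in range(COLS)]


def make_schedule(commands) -> dict:
    """Group (well, colour, start_s, duration_s) commands by well.

    Each well maps to a list of (start_s, duration_s, colour) events
    sorted by start time.
    """
    schedule = {well: [] for well in well_ids()}
    for well, colour, start_s, duration_s in commands:
        if well not in schedule:
            raise ValueError(f"unknown well: {well}")
        if colour not in COLOURS:
            raise ValueError(f"unknown colour: {colour}")
        if start_s < 0 or duration_s <= 0:
            raise ValueError("start must be >= 0 and duration > 0")
        schedule[well].append((start_s, duration_s, colour))
    for events in schedule.values():
        events.sort()
    return schedule


plan = make_schedule([
    ("A1", "red", 600, 300),
    ("A1", "green", 0, 900),
    ("D6", "blue", 0, 300),
])
print(plan["A1"])
```

A structure of this kind keeps every well independent, which matches the per-well programmability described above.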
The E-code 1.0 software system aims to provide a convenient command interface for users of the E-light hardware system. The software provides two operating modes. The E.coli-code mode converts any given information into light-coded files and then turns these files into actual light-exposure commands for the E-light hardware system. With the help of the coding plasmids from our CRISPR-Recombinase system, we can eventually store any information in the E. coli DNA and extract it later through sequencing. The self-code mode provides more flexible input options, enabling users to program the light-exposure commands manually for every single bacterial culture unit. Thus, combined with our light switch, the user gains better control over the bacteria's metabolic pathways.

What can we do in the future?


________________________________________________________________________________________________________________________