Revision as of 13:49, 4 September 2015

EPFL 2015 iGEM bioLogic Logic Orthogonal gRNA Implemented Circuits

Modeling

Kinetic model

In our project, multiple transistor elements are assembled to create logic gates. We envision chaining such gates to create complex logic circuits within cells. Predicting the behaviour of these complex cascades of reactions and the way they reach a stationary state can be challenging. Modeling is the best way to understand our system and quantify the influence of the various biochemical parameters. In the long run, modeling may also help fine-tune the design of biological systems and wet lab experiments.

We attempted to model the dynamics and interactions of our system's components to predict the temporal response of our biological circuits. In this kinetic model, the time dependence of the concentration of each species is taken into account explicitly.

Assumptions

Modelers of biophysical systems often have a difficult mission. They must strike a compromise between developing a very detailed and accurate model and producing a simplistic one that may be easily analysed and compared to experimental data.

Developing a model implies making a certain number of simplifications and assumptions while keeping it close enough to reality. In this section we keep track of the assumptions we made in order to clarify our model. Since the same system may be modeled using different approaches, we will justify our choices and set of assumptions.

The most important assumption underlying our kinetic model is that the concentration of a given species does not depend on spatial coordinates, i.e. it is the same in every region of the cell. This assumption is quite bold but is a common approach in the literature and usually produces accurate results. Its validity is limited by the small number of molecules present in the cell and by localization in membranes or particular organelles. In some cases, it may be necessary to consider the stochastic behavior of individual trajectories rather than global averages [1].

One of the challenges we faced while building the model was the lack of readily available kinetic constants. Since dCas9 is a newly discovered gene regulation technology [2], gRNA/dCas9 and gRNA+dCas9/DNA binding/unbinding kinetics are still under investigation. One may imagine that the binding/unbinding constants are somewhat related to the gRNA sequence, given the different chemical properties of nucleotide sequences. However, we consider binding/unbinding constants to be gRNA-independent. For the gRNA/dCas9 interaction, this assumption is justified by the fact that the gRNA scaffold is always the same. For the gRNA+dCas9/DNA interaction, we can justify this hypothesis by thinking of an average nucleotide composition, as our synthetic sequences are randomly generated.

dCas9 degradation is a fundamental process in our system. Since high levels of dCas9 are toxic for the cell [3], a constantly varying dCas9 population is required for a signal to propagate from one gate to another (remember, the output of a gate is a gRNA which will bind to a free dCas9 in order to propagate the signal to the next gate, cf Project Description).
We can imagine three different types of degradation for the gRNA/dCas9 complex: degradation of the whole complex, dCas9 detaching from its gRNA, and degradation of the gRNA's targeting sequence (resulting in a non-functional but occupied dCas9). Since the probability of a gRNA unbinding from a dCas9 protein is extremely low [4], we only considered the degradation of the whole complex, thus neglecting the unbinding reaction. In addition, we assumed that this degradation rate is the same as that of gRNA-free dCas9.

In our project, the dCas9/gRNA complex is used as a transcription factor. It may alternatively activate or inhibit a targeted promoter. An important assumption of our model is that the number of transistors (i.e. the number of promoters) does not change. This implies that the total number of transistors is constant. We defined six different states for a transistor: activated (\(a\)), inhibited (\(i\)), basal (\(b\)), activated/inhibited (\(ai\)), double inhibited (\(ii\)) and activated/double inhibited (\(aii\)). Mathematically, this translates into: \[ [Ta] + [Tb] + [Ti] + [Tai] + [Tii] + [Taii] = \text{cst.} \] where the simultaneous existence of all states was not tested experimentally. This reduces the degrees of freedom of our system of ODEs: \[ \dfrac{d[Ta]}{dt} + \dfrac{d[Tb]}{dt} + \dfrac{d[Ti]}{dt} + \dfrac{d[Tai]}{dt} + \dfrac{d[Tii]}{dt} + \dfrac{d[Taii]}{dt} = 0. \]

To have only first-order reactions, we assume that transistor states with two dCas9/gRNA complexes bound can be reached only through simpler states. This implies that it is impossible to go from \(Tai\) to \(Tb\) in a single time step: first one dCas9/gRNA complex detaches (leaving the transistor in an activated or inhibited state, depending on which complex detached), then the remaining complex can leave as well. This assumption relies on neglecting any interaction between two dCas9/gRNA complexes bound at different positions on the same promoter; in that case, the simultaneous detachment of both complexes has a low probability. The same applies to the states \(Tii\) and \(Taii\).
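The single-step rule can be illustrated with a small sketch. It assumes, as a reading of the state names rather than something stated explicitly, that a transistor has one activating site and up to two inhibiting sites; the names `STATES` and `is_allowed` are purely illustrative.

```python
# Each transistor state is encoded as (bound activators, bound inhibitors).
# Assumption: one activating site and up to two inhibiting sites per promoter.
STATES = {
    "b": (0, 0),
    "a": (1, 0),
    "i": (0, 1),
    "ai": (1, 1),
    "ii": (0, 2),
    "aii": (1, 2),
}


def is_allowed(src, dst):
    """True if dst is reached from src by binding or unbinding one complex."""
    a0, i0 = STATES[src]
    a1, i1 = STATES[dst]
    return abs(a0 - a1) + abs(i0 - i1) == 1


# Enumerate every transition permitted in a single time step.
allowed = sorted((s, d) for s in STATES for d in STATES if is_allowed(s, d))
for src, dst in allowed:
    print(f"T{src} -> T{dst}")
```

Under this encoding, `is_allowed("ai", "b")` is false while `is_allowed("ai", "a")` and `is_allowed("ai", "i")` are true, which matches the two-step path described above.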

Summary

Concentrations depend only on time.

dCas9 binding/unbinding is gRNA-independent.

gRNAs do not unbind from dCas9.

gRNA/dCas9 is degraded as a complex, with the same rate as dCas9 alone.

The total number of transistors is constant.

Only one dCas9/gRNA complex can bind/unbind the transistor in a time step.

Equations

Writing out the large set of ordinary differential equations (ODEs) governing our system is a nontrivial and error-prone process. When few species are present, doing it manually and double-checking the equations can be sufficient. However, when the number of transistors composing our system increases (due to the chaining of gates), keeping track of all gRNAs and their activating/inhibiting effect on specific promoters (when bound to dCas9) becomes almost impossible. For this reason we created a Python program which does the tedious tasks for us (see the Software section). Given the circuit structure (i.e. the gates with their inputs/outputs), this script generates the entire kinetic model, writes down the equations in LaTeX format and creates a Python function representing the corresponding system of ODEs. This system may then be solved numerically.

Activation

Our activation model consists of a simple transistor that takes a gRNA/dCas9 complex as an input (A) and produces a gRNA (C). The gRNA/dCas9 complex enhances the production of the output C, which is otherwise produced basally. In our experimental setup, the output C is used to enhance the production of GFP to enable a quantitative measurement of the activation (with respect to the basal expression).

dCas9 is constitutively produced from a low-copy plasmid and its degradation is proportional to its concentration. The population of unbound dCas9 is also affected by the binding between dCas9 and a gRNA. The unbinding of these two species is neglected (see justification above). \[ \frac{d}{dt}[\text{mRNA}_\text{dCas9}] = \alpha_\text{dCas9} - (\gamma_\text{mRNA}+\gamma) [\text{mRNA}_\text{dCas9}] \] \[ \frac{d}{dt}[\text{dCas9}] = \beta_\text{dCas9}[\text{mRNA}_\text{dCas9}] - \gamma [\text{dCas9}] + D_\text{dCas9}[\text{dCas9-gRNA}] - R_\text{dCas9}[\text{dCas9}][\text{gRNA}] \]

gRNA is produced from an inducible promoter. gRNAs are degraded proportionally to their concentration. Unbound gRNAs may also bind to a free dCas9 protein. As stated previously, the dCas9/gRNA complex is degraded as a whole (at the same rate as unbound dCas9 degradation). Therefore, there is no gRNA production that results from the complex's dissociation. \[ \frac{d}{dt}[\text{gRNA}] = \alpha_\text{gRNA} - (\gamma_\text{mRNA}+\gamma)[\text{gRNA}] + D_\text{dCas9}[\text{dCas9-gRNA}] - R_\text{dCas9}[\text{dCas9}][\text{gRNA}] \]

dCas9/gRNA complexes are our gene regulatory units: depending on the targeted site of a promoter, these complexes can enhance or inhibit gene transcription. To test activation, gRNA sequences were generated to exclusively target activating sites. In this case, the binding of a dCas9/gRNA complex to the promoter enhances transcription and creates an activated transistor, which is otherwise in a basal state. \[ \frac{d}{dt}[\text{dCas9-gRNA}] = R_\text{dCas9} [\text{dCas9}][\text{gRNA}] - D_\text{dCas9}[\text{dCas9-gRNA}] - \gamma[\text{dCas9-gRNA}] \]

Our simple transistor (used to simulate activation) can be found in two states: activated (\(Ta\)) or basal (\(Tb\)). The switch between basal and activated states is obtained by the binding/unbinding of the dCas9/gRNA complex and is represented by a Hill function, which gives the probability of a dCas9/gRNA complex being attached to the promoter [1]. In order to assess the functionality of our transistor and the targeting of the dCas9/gRNA complex, we used a measurable output, green fluorescent protein (GFP): \[ \frac{d}{dt}[\text{mRNA}_\text{GFP}] = \alpha_\text{GFP}\frac{[\text{dCas9-gRNA}]}{K_\text{DNA} + [\text{dCas9-gRNA}]} - (\gamma_\text{mRNA}+\gamma) [\text{mRNA}_\text{GFP}], \] \[ \frac{d}{dt}[\text{GFP}] = \beta_\text{GFP} [\text{mRNA}_\text{GFP}] - (\gamma_\text{GFP}+\gamma)[\text{GFP}]. \]
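The six activation equations can be integrated numerically. The following is only a minimal sketch, not the script from our Software section: it uses a hand-written fourth-order Runge-Kutta step, takes the E. coli values from the Constants table (lower bounds where a range is given, rates in 1/h converted to 1/min), and takes dCas9 degradation proportional to the dCas9 concentration as stated in the text. The binding rate `R_DCAS9` is a hypothetical placeholder, since no value is listed, and `D_DCAS9 = 0` applies the no-unbinding assumption.

```python
import math

# E. coli parameters from the Constants table, in nM and minutes.
ALPHA_DCAS9 = 83.0        # nM/min
ALPHA_GRNA = 498.0        # nM/min
ALPHA_GFP = 83.0          # nM/min
BETA_DCAS9 = 0.49         # 1/min (lower bound of the tabulated range)
BETA_GFP = 3.02           # 1/min (lower bound of the tabulated range)
GAMMA_MRNA = 0.2          # 1/min
GAMMA_GFP = 0.03 / 60.0   # table gives 0.03 1/h
GAMMA = 0.03 / 60.0       # dilution, table gives 0.03 1/h
K_DNA = 0.04              # nM
R_DCAS9 = 1e-3            # 1/(nM min), HYPOTHETICAL value for illustration
D_DCAS9 = 0.0             # 1/min, unbinding neglected (stated assumption)


def activation_rhs(y):
    """Right-hand side of the activation ODEs for the state vector y."""
    m_dcas9, dcas9, grna, cplx, m_gfp, gfp = y
    binding = R_DCAS9 * dcas9 * grna
    unbinding = D_DCAS9 * cplx
    hill = cplx / (K_DNA + cplx)  # probability that the promoter is bound
    return [
        ALPHA_DCAS9 - (GAMMA_MRNA + GAMMA) * m_dcas9,
        BETA_DCAS9 * m_dcas9 - GAMMA * dcas9 + unbinding - binding,
        ALPHA_GRNA - (GAMMA_MRNA + GAMMA) * grna + unbinding - binding,
        binding - unbinding - GAMMA * cplx,
        ALPHA_GFP * hill - (GAMMA_MRNA + GAMMA) * m_gfp,
        BETA_GFP * m_gfp - (GAMMA_GFP + GAMMA) * gfp,
    ]


def rk4_step(f, y, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [
        yi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


def simulate(t_end=60.0, dt=0.01):
    """Integrate from all-zero concentrations up to t_end (minutes)."""
    y = [0.0] * 6
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(activation_rhs, y, dt)
    return y


names = ["mRNA_dCas9", "dCas9", "gRNA", "dCas9-gRNA", "mRNA_GFP", "GFP"]
final = simulate()
for name, value in zip(names, final):
    print(f"{name:>11s}: {value:.1f} nM")
```

A fixed step is adequate for this short run, but the system becomes stiff for large binding rates or long simulation times, where an adaptive or implicit solver would be a better choice.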

Inhibition

Inhibition works similarly to activation; the difference lies in the targeting site of the dCas9/gRNA complex. In the case of activation, the dCas9/gRNA complex targets a sequence on DNA such that the RNA polymerase (RNAP) recruiting unit (\(\omega\) unit in E. coli and VP64 in yeast) is close to the RNAP binding site: the recruiting unit enhances RNAP binding and therefore gene transcription. In the case of inhibition, the dCas9/gRNA targeting sequence on DNA is so close to the RNAP binding site that the RNAP is not able to bind anymore; this steric inhibition prevents gene transcription.

If we interpret the Hill function as a binding probability [1], then in the case of inhibition basal expression is possible only when nothing is bound to the promoter. Thus the equation of \(\text{mRNA}_\text{GFP}\) production is now the following: \[ \frac{d}{dt}[\text{mRNA}_\text{GFP}] = \alpha_\text{GFP}\left( 1 - \frac{[\text{dCas9-gRNA}]}{K_\text{DNA} + [\text{dCas9-gRNA}]}\right) - (\gamma_\text{mRNA}+\gamma) [\text{mRNA}_\text{GFP}]. \] This means that at low concentrations of \(\text{dCas9-gRNA}\) we have basal expression, while at high concentrations the production of \(\text{mRNA}_\text{GFP}\) is extremely low.
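To compare the two regimes, one can set the time derivative of \([\text{mRNA}_\text{GFP}]\) to zero and look at the stationary level as a function of the complex concentration. The sketch below does this for both production terms; the function names are illustrative, and the default parameter values are the E. coli entries of the Constants table (with the dilution rate converted to 1/min).

```python
def bound_fraction(cplx, k_dna=0.04):
    """Hill function: probability that a dCas9/gRNA complex is bound (nM)."""
    return cplx / (k_dna + cplx)


def mrna_gfp_steady_state(cplx, mode, alpha=83.0, gamma_mrna=0.2,
                          gamma=0.03 / 60.0, k_dna=0.04):
    """Stationary mRNA_GFP level (nM) for a fixed complex concentration.

    mode is "activation" (production ~ bound fraction) or
    "inhibition" (production ~ unbound fraction).
    """
    p = bound_fraction(cplx, k_dna)
    if mode == "activation":
        production = alpha * p
    elif mode == "inhibition":
        production = alpha * (1.0 - p)
    else:
        raise ValueError("mode must be 'activation' or 'inhibition'")
    return production / (gamma_mrna + gamma)


# Scan complex concentrations around K_DNA = 0.04 nM.
for cplx in (0.0, 0.004, 0.04, 0.4, 4.0):
    act = mrna_gfp_steady_state(cplx, "activation")
    inh = mrna_gfp_steady_state(cplx, "inhibition")
    print(f"[dCas9-gRNA] = {cplx:6.3f} nM  activation: {act:7.2f} nM  "
          f"inhibition: {inh:7.2f} nM")
```

The two curves cross where the complex concentration equals \(K_\text{DNA}\), since the promoter is then bound half of the time.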

Activation and inhibition

Double inhibition

Activation and double inhibition

Constants

Name	Description	Value	Source
\(\alpha_{dCas9}\) (E. coli)	mRNA encoding dCas9 production	83 (nM/min)	Estimated
\(\alpha_{gRNA}\) (E. coli)	gRNA production	498 (nM/min)	Estimated
\(\alpha_{GFP}\) (E. coli)	mRNA encoding GFP production	83 (nM/min)	Estimated
\(\alpha_{dCas9}\) (S. cerevisiae)	mRNA encoding dCas9 production
\(\alpha_{gRNA}\) (S. cerevisiae)	gRNA production
\(\alpha_{GFP}\) (S. cerevisiae)	mRNA encoding GFP production
\(\beta_{dCas9}\) (E. coli)	dCas9 production from mRNA	0.49-0.86 (1/min)	Young and Bremer [8]
\(\beta_{dCas9}\) (S. cerevisiae)	dCas9 production from mRNA	0.23-0.39 (1/min)	Bonven and Gulløv [9]
\(\beta_{GFP}\) (E. coli)	GFP production from mRNA	3.02-5.27 (1/min)	Young and Bremer [8]
\(\beta_{GFP}\) (S. cerevisiae)	GFP production from mRNA	1.39-2.34 (1/min)	Bonven and Gulløv [9]
\(\gamma_{mRNA}\) (E. coli)	mRNA degradation	0.2 (1/min)	Bernstein et al. [5]
\(\gamma_{mRNA}\) (S. cerevisiae)	mRNA degradation	0.03 (1/min)	Wang et al. [6]
\(\gamma_{GFP}\) (E. coli)	GFP degradation rate	0.03 (1/h)	BioNumber 105191
\(\gamma_{GFP}\) (S. cerevisiae)	GFP degradation rate
\(\gamma\) (E. coli)	Dilution rate	0.03 (1/h)	Estimated
\(\gamma\) (S. cerevisiae)	Dilution rate	0.03 (1/h)	Estimated
\(K_{dCas9}\)	dCas9 and gRNA dissociation constant	10 (nM)	Wright et al. [11]
\(K_{DNA}\)	dCas9/gRNA and dsDNA dissociation constant	0.04 (nM)	O'Connell et al. [10]

mRNA and gRNA production/degradation rates

For mRNA degradation we considered the mean half-lives reported in Refs. [5-6] and subsequently computed the degradation rates. In Ref. [6] the mean half-life is explicitly stated, while for Ref. [5] we computed the mean half-life ourselves over the different E. coli strains. Note that for an exponential decay, the half-life \(t_{1/2}\) is linked to the lifetime \(\tau\) by \[ t_{1/2} = \tau \ln(2) \] and the lifetime \(\tau\) is in turn linked to the decay constant \(\Gamma\) by \[ \Gamma = \frac{1}{\tau}. \]

For gRNAs we use the same degradation rate as for mRNAs.
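Combining the two relations gives \(\Gamma = \ln(2)/t_{1/2}\). As a small illustration, the helpers below (hypothetical names) convert in both directions and recover the half-lives implied by the tabulated mRNA degradation rates.

```python
import math


def decay_rate(half_life):
    """Decay constant for an exponential decay with the given half-life."""
    return math.log(2) / half_life


def half_life(rate):
    """Half-life corresponding to a given decay constant."""
    return math.log(2) / rate


# Half-lives implied by the tabulated mRNA degradation rates (1/min).
print(f"E. coli:       {half_life(0.2):.1f} min")    # 3.5 min
print(f"S. cerevisiae: {half_life(0.03):.1f} min")   # 23.1 min
```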

The estimation of mRNA/gRNA production rates is more complicated as it depends on promoter characteristics. Here we assume for simplicity that all promoters are of the same kind, the only difference being the copy number of the plasmid they are on (low-copy, medium-copy or high-copy). We assume that a single cell contains 5, 30 and 300 copies of low-, medium- and high-copy plasmids respectively. The promoter we chose is the Lac promoter, which initiates one transcription every 6 seconds [12]. This gives a production rate of 10 mRNAs per minute per promoter copy. Now, if we suppose that the cell volume is approximately 1µm³ for E. coli and 60µm³ for S. cerevisiae [1], we can convert this into a production rate in concentration units for the whole cell. GFP and dCas9 are encoded on a low-copy plasmid, while gRNAs are encoded on a medium-copy plasmid.
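The conversion from transcripts per minute to nM/min can be written out as a short calculation. This sketch uses the plasmid copy numbers, initiation rate and E. coli cell volume stated above; the function names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_to_nM(n_molecules, volume_um3):
    """Concentration in nM of n_molecules in a volume given in cubic microns."""
    litres = volume_um3 * 1e-15  # 1 um^3 = 1e-15 L
    return n_molecules / (AVOGADRO * litres) * 1e9


def production_rate_nM_per_min(plasmid_copies, volume_um3,
                               transcripts_per_min_per_copy=10.0):
    """Transcript production rate for the whole cell, in nM/min."""
    per_cell = plasmid_copies * transcripts_per_min_per_copy
    return molecules_to_nM(per_cell, volume_um3)


# E. coli (1 um^3): low-copy plasmid (5 copies), medium-copy plasmid (30 copies).
print(f"low-copy:    {production_rate_nM_per_min(5, 1.0):.0f} nM/min")   # 83
print(f"medium-copy: {production_rate_nM_per_min(30, 1.0):.0f} nM/min")  # 498
```

These reproduce the E. coli values of \(\alpha_{dCas9}\), \(\alpha_{GFP}\) (low-copy) and \(\alpha_{gRNA}\) (medium-copy) listed in the Constants table.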

dCas9 and GFP production/degradation rates

In order to estimate the dCas9 production from mRNA, we started from the polypeptide chain elongation rate, from which we estimated the production rate using the protein length.

For E. coli, we found a polypeptide chain elongation rate of 12-21 amino acids per second (BioNumber 100059, [8]). Since the DNA coding for dCas9 fused with the \(\omega\) subunit is 4374bp long (see BioBrick LINK), the protein is composed of 1458 amino acids. This finally gives a production rate of 0.49-0.86 proteins per mRNA per minute.

In S. cerevisiae, the polypeptide chain elongation rate is estimated to be 5.5-9.3 amino acids per second [9]. Since the DNA coding for dCas9 fused with the VP64 subunit is 4323bp long (see BioBrick LINK), i.e. 1441 amino acids, we find a production rate of 0.23-0.39 proteins per mRNA per minute.

In order to find GFP production rates, we follow exactly the same procedure we used for dCas9. The GFP used in E. coli is 717bp, while the GFP used in S. cerevisiae is 714bp.
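The procedure amounts to dividing the elongation rate by the protein length. A minimal sketch, using the elongation rates and coding-sequence lengths quoted above (the function name is illustrative):

```python
def proteins_per_min(elongation_aa_per_s, gene_length_bp):
    """Proteins completed per mRNA per minute for a given elongation rate."""
    n_amino_acids = gene_length_bp / 3.0  # three bases per codon
    return elongation_aa_per_s * 60.0 / n_amino_acids


# (organism, protein, elongation range in aa/s, coding length in bp)
cases = [
    ("E. coli", "dCas9", (12.0, 21.0), 4374),
    ("S. cerevisiae", "dCas9", (5.5, 9.3), 4323),
    ("E. coli", "GFP", (12.0, 21.0), 717),
    ("S. cerevisiae", "GFP", (5.5, 9.3), 714),
]

for organism, protein, (slow, fast), length_bp in cases:
    low = proteins_per_min(slow, length_bp)
    high = proteins_per_min(fast, length_bp)
    print(f"{protein:5s} ({organism}): {low:.2f}-{high:.2f} per min")
```

This simple estimate treats one ribosome per mRNA at a time, so it should be read as an order of magnitude rather than a precise rate.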

For GFP degradation we considered the GFP(mut3) half-life (BioNumber 105191) and computed the degradation rate in the same way as for mRNAs.

For dCas9 we weren't able to find a satisfactory degradation rate. However, it turns out that dCas9 degrades slowly [13] and therefore we decided to neglect this degradation compared to dilution. Dilution is estimated from the division time of our organism. For E. coli we have a division every 40 min [14], while for S. cerevisiae we have a division every 1-2 hours (BioNumber 108255). Note from our equations that the dilution rate is added to all degradation rates.

Simulation

As our parameters were estimated roughly or taken from different papers published over many years, we have to analyze the results of our model carefully. Does a GFP concentration in the nM range make sense? The total number of proteins in an E. coli cell is 2x10⁶, while in a yeast cell it reaches 50x10⁶ [15]. Using the known volumes of E. coli and S. cerevisiae, 1µm³ and 60µm³ respectively [1], we can compute the total protein concentration in µM: for E. coli we have 3x10³ µM, while for S. cerevisiae we have 1.4x10³ µM. These estimates are in good accord with our simulation, where the final GFP concentration in the activation model amounts to 0.5% of the total protein concentration.
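This sanity check can be recomputed directly from the protein counts and cell volumes quoted above. The sketch below is only an illustration; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def concentration_uM(n_molecules, volume_um3):
    """Concentration in micromolar of n_molecules in a volume in cubic microns."""
    litres = volume_um3 * 1e-15  # 1 um^3 = 1e-15 L
    return n_molecules / (AVOGADRO * litres) * 1e6


# Total protein counts [15] and cell volumes [1] as quoted in the text.
cells = {
    "E. coli": (2e6, 1.0),
    "S. cerevisiae": (50e6, 60.0),
}

total_protein_uM = {
    name: concentration_uM(count, volume)
    for name, (count, volume) in cells.items()
}

for name, value in total_protein_uM.items():
    print(f"{name}: {value:.0f} uM total protein")
```

Dividing a simulated GFP concentration by the corresponding entry of `total_protein_uM` gives the fraction of the proteome that the model attributes to GFP.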

References

[1] R. Phillips et al., Physical Biology of the Cell, Second Edition, Garland Science, 2013.

[2] L. S. Qi et al., Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression, Cell 152, 1173–1183, 2013.

[3]

[4] Team Duke iGEM 2014, Wiki.

[5] Bernstein et al., Global analysis of Escherichia coli RNA degradosome function using DNA microarrays, PNAS, vol. 101, 2758 –2763, 2004.

[6] Wang et al., Precision and functional specificity in mRNA decay, PNAS, vol. 99, 5860 –5865, 2002.

[7]

[8] Young and Bremer, Polypeptide-Chain-Elongation Rate in Escherichia coli B/r as a Function of Growth Rate, Biochemical Journal, Vol. 160, 185-194, 1976.

[9] Bonven and Gulløv, Peptide chain elongation rate and ribosomal activity in Saccharomyces cerevisiae as a function of the growth rate, Molecular and General Genetics, vol. 170, 225-230, 1979.

[10] O'Connell et al., Programmable RNA recognition and cleavage by CRISPR/Cas9, Nature, vol. 516, 263-266, 2014.

[11] Wright et al., Rational design of a split-Cas9 enzyme complex, PNAS, vol. 112, 2015.

[12] F. Eckstein and D. Lilley, Nucleic Acids and Molecular Biology, vol. 4, Springer, 1990.

[13] Team Waterloo iGEM 2014, Wiki.

[14] Kumar and Libchaber, Pressure and Temperature Dependence of Growth and Morphology of Escherichia coli: Experiments and Stochastic Model, Biophysical Journal, vol. 105, 783-793, 2013.

[15] R. Milo, What is the total number of protein molecules per cell volume? A call to rethink some published values, Bioessays, vol. 35, 1050–1055, 2013.



Difference between revisions of "Team:EPF Lausanne/Modeling"