Difference between revisions of "Team:Oxford/Modeling"

Revision as of 21:50, 17 September 2015

Modelling

Introduction

Mathematical modelling plays a crucial role in Synthetic Biology by acting as a link between the conception and the physical realisation of a biological circuit. Our modelling team has focused on building a better picture of the project to evaluate the effectiveness of initial designs, as well as to provide insight into how the system can (or must) be improved.

Our team experimentally validated that Escherichia coli can secrete enzymes which break down the biofilms associated with urinary infections. However, it is difficult to directly measure whether our enzymes are produced in a sufficient quantity to be a more effective treatment than antibiotics. We measure gene expression and diffusion of widely-used chemicals, and then employ our modelling to determine the number of E. coli cells that would make our project a more effective treatment than antibiotics. We expect to have to improve our system to make it realistic.

To help readers of all kinds and specialisations understand this page, we have produced guides for all the modelling techniques used in this section. They are available in our Modelling Tutorial page and will be linked to when appropriate.

Gene expression rates

In this section we look at our cells in isolation in order to assess their functionality and answer important questions such as “how long does it take to produce a certain concentration of product?” The end result - the final concentration of useful enzyme that is produced in the cell - is required for our diffusion model.

Arabinose-induced expression

We have decided to use an arabinose-induced promoter for the expression of a number of our proteins. This promoter can be modelled as the following chemical system:

\[[Arab:AraC]\overset{K}{\rightarrow}mRNA\overset{\alpha}{\rightarrow}P\] \[mRNA\overset{\gamma_{1}}{\rightarrow}\phi\quad P\overset{\gamma_{2}}{\rightarrow}\phi\]

Our pBAD promoter is known as a double repressor. AraC binds to pBAD, which represses transcription of mRNA. When arabinose is introduced into the system, it binds to AraC to form the Arab:AraC complex, and transcription can then occur.

For this system we will assume that AraC is always present in large concentration and that its binding to arabinose happens on a faster time scale than transcription. Therefore, we do not need to consider the individual concentrations of arabinose and AraC; we only need to include the concentration of the complex, [Arab:AraC]. The rate \(K\) is not a simple constant: it is given by the Hill function in the equations below.

Using Michaelis-Menten kinetics, we arrive at the equations:

\[\dfrac{d[mRNA]}{dt}=K_{max}\dfrac{[Arab:AraC]^{n}}{K_{half}^{n}+[Arab:AraC]^{n}}-\gamma_{1}[mRNA]\] \[\dfrac{d\left[P\right]}{dt}=\alpha\left[mRNA\right]-\gamma_{2}\left[P\right]\]

Where we define the symbols as:

Symbol	Definition	Initial Value/Literature Value	Fitted
\([Arab:AraC]\)	The concentration of associated Arabinose and AraC	\(0\)	-
\([mRNA]\)	The concentration of mRNA	\(0\)	-
\([P]\)	The concentration of our product	\(0\)	-
\(\alpha\)	Translation rate	\(15ntd\: s^{-1}\)/length of sequence [6]	\(6.6ntd\: s^{-1}\)/length of sequence
\(\gamma_{1}\)	Degradation rate of mRNA	\(5.13\times10^{-4}s^{-1}\) [5]	\(1.1\times10^{-2}s^{-1}\)
\(\gamma_{2}\)	Degradation rate of product	\(5.13\times10^{-4}s^{-1}\) [5]	\(1.1\times10^{-2}s^{-1}\)
\(K_{max}\)	Maximal transcription rate	\(50ntd\: s^{-1}\)/length of sequence [6]	\(47ntd\: s^{-1}\)/length of sequence
\(K_{half}\)	Concentration of [Arab:AraC] at which transcription is half-maximal	\(160\mu M\) [8]	\(100\mu M\)
\(n\)	Hill coefficient	\(2.65\) [3]	\(2.73\)

This table contains literature values for the parameters, found from a number of sources. We then measured GFP expression in E. coli to extract experimental values. Here are the details of our fitting function.

Fitting our gene expression data to the theoretical model

Results showing GFP concentration as a function of time, matched to our deterministic model. Errors are given to one standard deviation and an arbitrary scaling factor is included as a fitted parameter.
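As a short worked step, the limiting (steady-state) concentrations follow from setting both time derivatives in the equations above to zero. Here \(K\) denotes the value of the Hill term at a fixed inducer level, and the subscript \(ss\) is notation introduced only for this illustration:

\[[mRNA]_{ss}=\dfrac{K}{\gamma_{1}}\qquad[P]_{ss}=\dfrac{\alpha K}{\gamma_{1}\gamma_{2}}\]

Since \(\alpha\) and \(K_{max}\) are both quoted per length of sequence, this suggests that \([P]_{ss}\) falls off roughly as the inverse square of the sequence length, assuming the same length enters both rates.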

We can now calculate the limiting concentrations that our products will be expressed at. There are multiple products being expressed, each of a different sequence length. Here is a table showing the relevant proteins and sequence lengths:

Product	Sequence Length (/bp)
pBAD HisB DNase DsbA	621
pBAD HisB DspB YebF
pBAD HisB DspB
pBAD HisB MccS	414
pBAD HisB DspB Fla
pBAD HisB Art-175 DsbA	987
pBAD HisB Art-175 YebF	1284
pBAD HisB Art-E	632
pBAD HisB Art-175 Fla	1095
pBAD HisB Art-175	936
pBAD HisB DNase	570

We can now run our model of the system by solving the set of equations using the MATLAB equation solver ode15s. Below is a plot of the concentration of product against time for each protein expressed with this inducer-promoter pair, where the expression is induced by a step function:

Model data for each of the enzymes we plan to release, using the parameters we found from our experimental data. We found that our limiting concentrations were of order \(\mu M\).

The advantage of this method is that we have not had to directly measure expression data for all of our enzymes, which is a difficult process. We conclude that we should obtain enzyme expression of order \(\mu M\) within 500 minutes. However, the scaling factor we introduced in our fitting function is no substitute for a calibration curve to match GFP fluorescence with GFP concentration. For this reason, we conservatively estimate that our proteins are expressed at \(nM\) concentration.
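The step-induced model can be sketched in a few lines of Python. This is a minimal illustration, not the team's script: it swaps MATLAB's ode15s for a fixed-step fourth-order Runge-Kutta integrator, treats "per length of sequence" as division by the length in bp, and uses the fitted parameter values from the table as defaults. The function names and the choice of inducer level are assumptions made for the example, and the output is in the same arbitrary units as the fit.

```python
def hill(c, k_max, k_half, n):
    """Hill function: transcription rate at complex concentration c."""
    return k_max * c**n / (k_half**n + c**n)


def simulate(c_complex, length_bp, t_end_s, dt=1.0,
             alpha_ntd=6.6, k_max_ntd=47.0, k_half=100.0, n=2.73,
             gamma1=1.1e-2, gamma2=1.1e-2):
    """Integrate d[mRNA]/dt and d[P]/dt after a step in [Arab:AraC].

    c_complex is held constant from t = 0 (the step input), in the same
    units as k_half (uM). Rates are per second. Returns ([mRNA], [P])
    at t_end_s, starting from zero concentrations.
    """
    alpha = alpha_ntd / length_bp              # assumed per-length scaling
    k = hill(c_complex, k_max_ntd / length_bp, k_half, n)

    def rhs(m, p):
        # Right-hand sides of the two rate equations.
        return k - gamma1 * m, alpha * m - gamma2 * p

    m = p = 0.0
    for _ in range(int(round(t_end_s / dt))):
        k1m, k1p = rhs(m, p)
        k2m, k2p = rhs(m + 0.5 * dt * k1m, p + 0.5 * dt * k1p)
        k3m, k3p = rhs(m + 0.5 * dt * k2m, p + 0.5 * dt * k2p)
        k4m, k4p = rhs(m + dt * k3m, p + dt * k3p)
        m += dt * (k1m + 2 * k2m + 2 * k3m + k4m) / 6.0
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return m, p


# Example: the 621 bp construct, induced at the half-maximal level.
mrna_end, product_end = simulate(c_complex=100.0, length_bp=621, t_end_s=3000.0)
```

A fixed step of one second is small compared with the fitted decay time of roughly 90 seconds, so a non-stiff integrator is adequate here; a stiff solver would matter more if the rate constants were spread over several orders of magnitude.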

Delivery

With the information about the rates of production and concentrations of our products we can look at how the products behave once they leave the cell. Our enzymes are first secreted from the cells, and then diffuse through the beads to the biofilms they target. We can provide an estimate of the time scale that our project is working on and assess any need for optimisation of enzyme efficiency.

Dispersin B

Dispersin B is one of the anti-biofilm agents we are using in our project and will be the focus of this delivery section. As such we will assume that conclusions reached apply to all of our enzymes.

A concentration of Dispersin B of 60μg/ml is required to destroy a biofilm that has already formed on a surface. This equates to a concentration of 1.50μM. This is higher than the steady-state gene expression concentration we can expect from our cells, meaning that our system cannot rely solely on diffusion to transport our enzymes to the biofilm. We will therefore model these diffusion systems assuming that our cells are expressing at a 2μM concentration and later we will look at optimising the gene expression to this level.
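As a quick check of the unit conversion, the two quoted concentrations together imply a molar mass of about 40 kDa for Dispersin B. That figure is inferred here from the numbers above rather than stated in the text:

\[\dfrac{60\,\mu g\,ml^{-1}}{1.50\,\mu M}=\dfrac{0.060\,g\,l^{-1}}{1.50\times10^{-6}\,mol\,l^{-1}}=4.0\times10^{4}\,g\,mol^{-1}\]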

Beads

Diffusion

The bead delivery system consists of our cells being contained in alginate spheres. Water is passed through the container filled with the beads allowing our enzymes to diffuse from the alginate to the required concentration. More details about the design of the system can be found here.

To determine the convection mass transfer coefficient of Dispersin B from our gel spheres we looked at the diffusion data obtained from this experiment involving the diffusion of crystal violet from our beads. By analysing the system we can produce a theoretical form for the concentration of crystal violet in the bulk water as a function of time:

\[c_{f}=\dfrac{c_{bo}}{1+\frac{V_{f}}{V_{b}}}\left(1-\exp\left(\dfrac{-K_{m}A_{b}\left(1+\frac{V_{f}}{V_{b}}\right)t}{V_{f}}\right)\right)\]

Symbol	Definition	Value	Units
\(A_{b}\)	Total surface area of the beads	\(0.0238\)	\(m^{2}\)
\(V_{b}\)	Total volume of beads	\(1.3463\times10^{-5}\)	\(m^{3}\)
\(c_{bo}\)	Initial concentration in beads	\(0.02451107\)	\(M\)
\(V_{f}\)	Volume of fluid surrounding the beads	\(V_{f}=V_{fo}-\dfrac{1\times10^{-6}}{10}t\)	\(m^{3}\)
\(V_{fo}\)	Initial volume of fluid surrounding the beads	\(1\times10^{-4}\)	\(m^{3}\)
\(t\)	Time	\(-\)	\(min\)
\(c_{f}\)	Concentration of fluid surrounding beads	\(-\)	\(M\)
\(K_{m}\)	Convection mass transfer coefficient	To be fitted	\(m\,min^{-1}\)

The volume of fluid is also a function of time in order to account for the removal of 1ml of water every 10 minutes. The area and volume of the beads are those of 660 spheres with diameter 3.39mm.
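The tabulated bead totals can be checked from the bead count and diameter. The short Python sketch below assumes perfectly spherical, identical beads; the function name is hypothetical.

```python
import math


def bead_totals(n_beads, diameter_m):
    """Total surface area (m^2) and volume (m^3) of n identical spheres."""
    area = n_beads * math.pi * diameter_m**2
    volume = n_beads * math.pi * diameter_m**3 / 6.0
    return area, volume


# 660 beads of diameter 3.39 mm, as stated in the text.
total_area, total_volume = bead_totals(660, 3.39e-3)
print(f"A_b = {total_area:.4f} m^2, V_b = {total_volume:.4e} m^3")
```

The results agree with the values of \(A_{b}\) and \(V_{b}\) in the table to the precision quoted there.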

However, the number of beads is an estimate. Because of this, in order to fit the curve to the experimental data we must scale the experimental data by an unknown factor. We therefore preface our equation with an arbitrary scaling factor which, along with the convection mass transfer coefficient \(K_{m}\), is determined by our fitting function.

Our fitting script, detailed here, returned the value of \(K_{m} = 1.7265\times 10^{-5}\,m\,min^{-1}\).

Crystal violet dye diffuses out of alginate beads and concentration is measured. Errors are given to one standard deviation and data is fitted to a deterministic model to determine the mass transfer co-efficient. From this we can determine we would require \(100m^{3}\) of beads to reach the necessary concentration of our own enzymes.
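The theoretical curve for the crystal violet experiment can be written as a small Python function. This is a sketch of the model equation only, not of the fitting script. The constants are copied from the table, `scale` stands in for the arbitrary scaling factor (its fitted value is not given, so it defaults to 1), and the names are chosen for this example.

```python
import math

A_B = 0.0238                 # total bead surface area, m^2
V_B = 1.3463e-5              # total bead volume, m^3
C_B0 = 0.02451107            # initial concentration in the beads, M
V_F0 = 1e-4                  # initial fluid volume, m^3
SAMPLING_RATE = 1e-6 / 10    # fluid removed by sampling, m^3 per minute


def fluid_concentration(t_min, k_m, scale=1.0):
    """Concentration (M) in the bulk fluid after t_min minutes.

    k_m is the convection mass transfer coefficient in m/min.
    """
    v_f = V_F0 - SAMPLING_RATE * t_min      # fluid volume shrinks over time
    ratio = 1.0 + v_f / V_B
    equilibrium = C_B0 / ratio              # long-time limit of the curve
    rate = k_m * A_B * ratio / v_f
    return scale * equilibrium * (1.0 - math.exp(-rate * t_min))


# Evaluate the curve at one hour using the fitted coefficient.
c_one_hour = fluid_concentration(60.0, 1.7265e-5)
```

A fitting routine would adjust `k_m` and `scale` together to minimise the mismatch between this function and the measured concentrations.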

Dispersin B is a significantly larger molecule than crystal violet, so this mass transfer coefficient will not be close to that for Dispersin B. To correct for this we make use of similarity. More specifically, we take the Sherwood numbers of the two systems to be equal, therefore:

\[\left(\dfrac{K_{m}R}{D}\right)_{\text{crystal violet}} = \left(\dfrac{K_{m}R}{D}\right)_{\text{Dispersin B}}\]

Symbol	Definition	Value	Units
\(D_{crystal violet}\)	Mass diffusivity of crystal violet in water	\(2.8652\times10^{9}\)	\(\mu m^{2}s^{-1}\)
\(D_{Dispersin B}\)	Mass diffusivity of Dispersin B in water	\(100\)	\(\mu m^{2}s^{-1}\)
\(R\)	Radius of bead	\(1.695\)	\(mm\)

By rearranging this we arrive at \(\left(K_{m}\right)_{\text{Dispersin B}} = 6.03\times10^{-13}\,m\,min^{-1}\).
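As a worked check of this rearrangement, note that the bead radius \(R\) is the same in both systems and cancels, leaving only the ratio of diffusivities. Using the fitted crystal violet coefficient and the tabulated values:

\[\left(K_{m}\right)_{\text{Dispersin B}}=\left(K_{m}\right)_{\text{crystal violet}}\dfrac{D_{\text{Dispersin B}}}{D_{\text{crystal violet}}}=1.7265\times10^{-5}\times\dfrac{100}{2.8652\times10^{9}}\approx6.03\times10^{-13}\,m\,min^{-1}\]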

Mass Exchange

This result allows us to theorise a mass exchange system. As a first estimate we will assume that the flow through the beads is sufficiently slow to use the convection diffusion coefficient found above. It is also assumed that the gene expression happens on a faster time scale than the diffusion from the beads to the water, enabling us to assume the concentration of enzyme in the beads remains constant. This is supported by our gene expression models. We can now visualize how the concentrations of the fluid will vary with distance along the mass exchanger:

Visualisation of the concentrations of the fluid and the beads along our mass exchanger

The overall system can now be described with the equation:

\[J = K_mA\dfrac{c_{fo}-c_{fi}}{\ln\left(\dfrac{c_{B}-c_{fi}}{c_{B}-c_{fo}}\right)}\]

Therefore

\[A = J\dfrac{\ln\left(\dfrac{c_{B}-c_{fi}}{c_{B}-c_{fo}}\right)}{K_{m}\left(c_{fo}-c_{fi}\right)}\]

Where \(J=Q\left(c_{fo}-c_{fi}\right)\) and \(Q\) is the volume flow rate of water. We have chosen a flow rate range of 10-100ml/min as this is accepted as a safe artificial bladder fill rate. This range results in the following number of beads required to reach the desired concentration:

Relationship between the number of bacteria-containment beads required to reach a particular flow rate of our enzymes. These are the flow rates we require for practical use.

Therefore a volume of between \(20.3\) and \(203\,m^{3}\) of beads is required, assuming a packing efficiency of 64%.

However, as stated earlier, this estimate relies on the fluid flowing around the beads slowly enough to be approximated as stationary, meaning that mass transfer occurs by natural convection. Although there may be a very large volume of beads and a slow fluid flow rate, the area through which the fluid can flow is likely small enough that the velocity of the fluid is non-negligible.
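Substituting \(J=Q\left(c_{fo}-c_{fi}\right)\) into the expression for \(A\) cancels the concentration difference, so the required area depends only on the flow rate, the mass transfer coefficient and the log ratio of the driving concentration differences at inlet and outlet:

\[A=\dfrac{Q}{K_{m}}\ln\left(\dfrac{c_{B}-c_{fi}}{c_{B}-c_{fo}}\right)\]

This makes explicit that the area, and hence the number of beads, grows linearly with \(Q\), which is consistent with the tenfold span in bead volume across the 10-100ml/min range.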

References

Made with love by Oxford iGEM

@@ Line 114: / Line 114: @@
                              </p>
                              <p>
-                                 This table contains literature values for the parameters, found from a number of sources. We measured GFP expression in <em>E. coli</em> to create data to fit parameters to. This is how our fitting function works, and our data is given below.
+                                 This table contains literature values for the parameters, found from a number of sources. We then measured GFP expression in <em>E. coli</em> to extract experimental values. Here are the details of our fitting function.
                              </p>
+                            <div class="image image-full">
+                                <img src="https://static.igem.org/mediawiki/2015/d/de/OxiGEM_Gene_Fitter.png" alt="Fitting our gene expression data to the theoretical model" />
+                                <p>
+                                    Results showing GFP concentration as a function of time, matched to our <a class="definition" title="deterministic model" data-content="A deterministic model predicts a single outcome from a given set of circumstances.">deterministic model</a>. Errors are given to one standard deviation and an arbitrary scaling factor is included as a fitted parameter.
+                                </p>
+                            </div>
                              <p>
                                  We can now calculate the limiting concentrations that our products will be expressed at. There are mutliple products being expressed, each of different sequence lenghts. Here is a table showing the relevant proteins and sequence lengths:
@@ Line 224: / Line 230: @@
                              <div class="image image-full">
                                  <img src="https://static.igem.org/mediawiki/2015/f/f6/Ox_arab_induced_proteins.png"/>
+                                <p>
+                                    Model data for each of the enzymes we plan to release, using the parameters we found from our experimental data. We found our limiting concentrations where of order \(\muM\).
+                                </p>
                              </div>
                              <p>
-                                 To calculate if our project will work in real life, we can model the number of cells required to reach a sufficient <a class="definition" title="enzyme" data-content="A molecule which speeds up a chemical reaction - a biological catalyst. The reaction does not involve these molecules.">enzyme</a> secretion rate to stop a <a class="definition" title="biofilm" data-content="A community of bacteria (or other microorganisms) adhering to a surface and each other, held together by secreted slime-like polymers that create a more favorable environment for the bacteria and protect them from environmental stresses and attack from a host’s immune system (in the case of pathogenic bacteria).">biofilm</a> from forming. To do this, we need to measure the rate of secretion of enzymes from our cells.
+                                 The advantage of this method is that we have not had to directly measure expression data for all of our enzymes, which is a difficult process. We conclude that we <em>should</em> obtain enzyme expression of order \(\muM\) within 500 minutes. However, the scaling factor we introduced in our fitting function is no substitute for a calibration curve to match GFP fluorescence with GFP concentration. For this reason, we conservatively estimate that our proteins are expressed at \(nM\) concentration.
-                            </p>
-                            <p>
-                                This can be difficult to do when the protein itself is mixed in with the fluid our cells grow in. GFP is much easier to measure in the lab, so we induced its expression and measured the fluorescence. From this we can find the parameters given in our first table, and then apply the model to a longer - or shorter - <a class="definition" title="gene" data-content="A section of DNA which codes for a protein.">gene</a> and predict expression.
-                            </p>
-                            <p>
-                                Using the graph below, we can fit our expression data to the model presented in the tutorial, where we have included an arbitrary scaling factor (instead of a calibration curve) to match GFP fluorescence to molecule concentrations.
-                            </p>
-                            <p>
-                                <div class="image image-full">
-                                    <img src="https://static.igem.org/mediawiki/2015/d/de/OxiGEM_Gene_Fitter.png" alt="Fitting our gene expression data to the theoretical model" />
-                                    <p>
-                                        Graph depicting results of GFP concentration as a function of time, matched to our <a class="definition" title="deterministic model" data-content="A deterministic model predicts a single outcome from a given set of circumstances.">deterministic model</a>. Errors are given to one standard deviation and an arbitrary scaling factor is employed.
-                                    </p>
-                                </div>
-                            </p>
-                            <p>
-                                By comparing the base-pair length of the GFP gene and of our enzymes, we can expect expression concentrations of the order of \(nM\). This results feeds into our modelling of a mass-exchange system which delivers the enzymes from our cells out to the fluid they're needed in. That model will give us the number of cells required.
                              </p>
                          </div>
@@ Line 251: / Line 243: @@
                          <h2>Delivery</h2>
                          <p>
-                             With the information about the rates of production and concentrations of our products we can look at how the products behave once they leave the cell. This involves modelling the diffusion of the products in different topologies, each associated with a potential physical design of the <a class="definition" title="catheter" data-content="A small, flexible tube inserted into the body to remove fluid. Urinary tract infections are a common side effect of using these.">catheter</a>. With this information we can provide a better estimate of the time scale that our project is working on and assess any need for optimisation.
+                             With the information about the rates of production and concentrations of our products we can look at how the products behave once they leave the cell. Our enzymes are first secreted from the cells, and then through the beads to the biofilms they target. We can provide an estimate of the time scale that our project is working on and assess any need for optimisation of enzyme efficiency.
                          </p>
                          <div id="delivery-dispersin">