Team:Oxford/Test/Modeling2

Modeling

Introduction

Modeling allows us to observe what happens to a system when you change the parameters that describe it. By quantifying the system’s response to these variables, we can aid the wet lab team in their design of their biological system. By repeating the modeling, data-fitting and refining schedule we approach a sufficient description of our required design.

Our modeling is made up of three pieces: gene expression, diffusion and biofilm modeling. By modeling gene expression, we can inform the lab team which parameters they can choose freely - and which ones they must pick carefully - to maximise the output of our product. The gene copy number is one example of these parameters. By modeling diffusion, we can incorporate into our previous model the effect of our protein being separated from the biofilm it is trying to interact with. If this is a large effect, the lab team must change the design of their system. Finally, by modeling the formation of a biofilm and its interaction with our product, we can decide upon the required concentrations of product we need to clear the urinary tract infection. In all cases we use both deterministic and stochastic models to investigate our system.

Gene Expression

Deterministic

A deterministic model evolves according to ordinary or partial differential equations. We model the following system:

\[\overset{\alpha_{1}}{\rightarrow}Arab\quad\overset{\alpha_{5}}{\rightarrow}AraC\] \[Arab+AraC\mathrel{\mathop{\rightleftharpoons}^{\mathrm{\alpha_{2}}}_{\mathrm{\alpha_{3}}}}Arab:AraC\] \[\overset{K}{\rightarrow}mRNA\overset{\alpha_{4}}{\rightarrow}P\] \[Arab\overset{\gamma_{1}}{\rightarrow}\phi\quad AraC\overset{\gamma_{2}}{\rightarrow}\phi\quad mRNA\overset{\gamma_{3}}{\rightarrow}\phi\quad P\overset{\gamma_{4}}{\rightarrow}\phi\]

Using the following deterministic equations [4]:

\[\dfrac{d\left[Arab\right]}{dt}=\alpha_{1}+\alpha_{2}\left[Arab\right]\left[AraC\right]-\alpha_{3}\left[Arab:AraC\right]-\gamma_{1}\left[Arab\right]\] \[\dfrac{d\left[Arab:AraC\right]}{dt}=\alpha_{3}\left[Arab:AraC\right]-\alpha_{2}\left[Arab\right]\left[AraC\right]\] \[\dfrac{d\left[AraC\right]}{dt}=\alpha_{5}+\alpha_{2}\left[Arab\right]\left[AraC\right]-\alpha_{3}\left[Arab:AraC\right]-\gamma_{2}\left[AraC\right]\] \[\dfrac{d[mRNA]}{dt}=K_{max}\dfrac{[Arab:AraC]^{n}}{K_{half}^{n}+[Arab:AraC]^{n}}-\gamma_{3}[mRNA]\] \[\dfrac{d\left[P\right]}{dt}=\alpha_{4}\left[mRNA\right]-\gamma_{4}\left[P\right]\]

Where we define the symbols

Symbol Definition Initial Value/Literature Value Fitted
\([Arab]\) The concentration of Arabinose \(1\times10^{-4}M\) -
\([AraC]\) The concentration of AraC \(1\times10^{-5}M\) -
\([Arab:AraC]\) The concentration of associated Arabinose and AraC \(0\) -
\([mRNA]\) The concentration of mRNA \(0\) -
\([P]\) The concentration of our product \(0\) -
\(\alpha_{1}\) Basal production of Arabinose ??? ?
\(\alpha_{2}\) Association constant \(2.8\times10^{7}s^{-1}\) [7] ?
\(\alpha_{3}\) Dissociation constant \(0.022s^{-1}\) [7] ?
\(\alpha_{4}\) Translation rate \(15ntd\: s^{-1}\)/length of sequence [6] ?
\(\alpha_{5}\) Basal production of AraC ??? ?
\(\gamma_{1}\) Degradation rate of Arabinose \(5.13\times10^{-4}s^{-1}\) [5] ?
\(\gamma_{2}\) Degradation rate of AraC \(5.13\times10^{-4}s^{-1}\) [5] ?
\(\gamma_{3}\) Degradation rate of mRNA \(5.13\times10^{-4}s^{-1}\) [5] ?
\(\gamma_{4}\) Degradation rate of product \(5.13\times10^{-4}s^{-1}\) [5] ?
\(K_{max}\) Maximal transcription rate \(50ntd\: s^{-1}\)/length of sequence [6] ?
\(K_{half}\) Half-maximal transcription rate \(160\mu M\) [8] ?
\(n\) Hill coefficient \(2.65\) [3] ?

To evaluate these parameters, we will fit the data created by the wet lab team. We allow a tolerance of a factor of ten either side of our initial guess for each parameter. The fitted parameters will be given in the fourth column and will be found using the MATLAB function fmincon. To solve the equations numerically we use the MATLAB equation solver ode15s. We plot three solutions for various parameters below.

Stochastic

A stochastic model is anything which includes some random element, and is useful for modeling small systems or those with some uncertainty. We produced a video tutorial on how to model gene expression networks stochastically using Gillespie's algorithm.[2, 4].

Gillespie's Algorithm

  1. Find random numbers \(r_{1}\) and \(r_{2}\) which are uniformly distributed between 0 and 1.
  2. Find propensities \(\alpha_{i}\) of \(N\) possible reactions. Find \(\alpha_{0}=\sum\limits_{i=1}^N \alpha_{i}\).
  3. Compute time \(\tau=\dfrac{1}{\alpha_{0}}\ln\left(\dfrac{1}{r_{1}}\right)\).
  4. Find the reaction \(j\) such that \(\dfrac{1}{\alpha_{0}}\sum\limits_{i=1}^{j-1} \alpha_{i}\leq r_{2}<\dfrac{1}{\alpha_{0}}\sum\limits_{i=1}^{j} \alpha_{i}\).
  5. Implement reaction \(j\) and increment time \(t\) to \(t+\tau\).
  6. Repeat steps 1-5 until a sufficient time or computation budget has been exhausted.

Turning the reaction rates in our model into propensities is simple - for example, the propensity for the association reaction \(Arab+AraC\rightarrow Arab:AraC\) is \(\alpha_{3}\left[Arab\right]\left[AraC\right]\).

Diffusion

Deterministic

The standard from for the diffusion equation,

\[\dfrac{d[P]}{dt}=D\dfrac{d^{2}[P]}{dx^{2}}\]

where \([P]\) is the concentration of our product, \(t\) is time and \(x\) is distance and \(D\) is the diffusion constant, can be solved analytically in any number of dimensions:

\[[P]\left(x,t\right)=\dfrac{1}{\left(4\pi Dt\right)^{\frac{N}{2}}}\exp\left(\dfrac{-\sum\limits_{i=1}^N x_{i}^{2}}{4Dt}\right)\]

where \(x_{i}\) is a generalised position co-ordinate and \(N\) is the dimensionality of the system. This solution only applies to a source of [P] at the position \(x=0\). We can also solve for a grid of 1 or 2D points using finite difference models given in the tutorial section.

Stochastic

We have the option of using three stochastic models for diffusion - random walk, random velocity and compartmental Gillespie.

Random Walk

We initially take a collection of particles and, for each one, compute the number

\[dr=\sqrt{2Ddt}\xi_{r}\]

where \(dt\) is the size of our timestep, \(dr\) is the absolute distance moved in the timestep and \(\xi_{r}\) is a normally distributed random number with mean 0 and variance 1. To calculate the distance that particles have actually moved depends on what dimensionality we are operating in. We work in polar or spherical poar co-ordinates in 2D and 3D respectively, and so sample angles \(\phi\) and \(\theta\) and uniformly distributed in the ranges [0,\(2\pi\)] and [0,\(\pi\)].

Random Velocity

We initially take a collection of particles at position \(\mathbf{r_{0}}\). We give them random velocities uniformly distributed in the range \(\left[-2\mathbf{v},2\mathbf{v}\right]\) (where \(v\) is determined by thermal fluctuations) and decide that the direction of these velocities should change direction (or not) with some probability after every timestep. We summarise this process in the algorithm below.

  1. For each particle, generate a random speed \(s\) in a random direction.
  2. Compute the position of the particle after a timestep \(\Delta t\).
  3. Generate a random number \(r\) uniformly distributed between 0 and 1.
  4. If \(r<\dfrac{s^{2}}{2D}\Delta t\), calculate a new random direction of velocity for the particle.
  5. Repeat steps 2-5 for each particle until a certain time or computational budget is exhausted.

Compartmental diffusion; Gillespie

We can divide our region into a system of compartments

\[A_{1} \mathrel{\mathop{\rightleftharpoons}^{\mathrm{k}}_{\mathrm{k}}} A_{2} \mathrel{\mathop{\rightleftharpoons}^{\mathrm{k}}_{\mathrm{k}}} ... \mathrel{\mathop{\rightleftharpoons}^{\mathrm{k}}_{\mathrm{k}}} A_{N-1} \mathrel{\mathop{\rightleftharpoons}^{\mathrm{k}}_{\mathrm{k}}} A_{N}\]

such that the only difference in their propensities is related to the number of particles contained in each compartment. This will produce a correct model of diffusion if the proportionality constant \(k\) is chosen as

\[k=\dfrac{D}{(dx)^{2}}\]

where \(dx\) is the physical size of each compartment [2]. We now run the Gillespie algorithm but with \(2(N-1)\) reactions to consider.

Comparison of methods

All of the methods provide similar levels of accuracy (depending on the size of the compartments used in the Gillespie model) and so we choose based on the simplicity of implementation. The random walk method is generally the fastest and easiest to code, so we choose this one.

The challenge

Since we are modelling diffusion in the catheter, the topology of the system is a ring, neglecting edge effects of the cylinder our bacteria will be placed in. To do this we use a random walk model in two dimensions. We confine our particles to a ring of a fixed radius and initially coat the ring with particles.

Modeling Tutorial

Deterministic

Our goal is to model a system of reactions such as the reversible reaction

\[A \mathrel{\mathop{\rightleftharpoons}^{\mathrm{k_{+}}}_{\mathrm{k_{-}}}} B\]

Where \(A\) and \(B\) are our two species and \(k_{+}\) and \(k_{-}\) will be determined shortly. To do this, we make the intuitive assumption that the rate at which something reacts is proportional to how much of it we've got. That is, the rate of change of \(B\) with time is

\[\dfrac{dB}{dt}=k_{+}A\]

where \(k_{+}\) is simply a proportionality constant. This assumption is called the law of mass action. This of course is not the complete picture, as we forgot to include the fact that \(B\) can react back to form \(A\). So the complete system of equations to solve is

\[\dfrac{dA}{dt}=k_{-}B-k_{+}A\] \[\dfrac{dB}{dt}=k_{+}A-k_{-}B\]

At this point, you can type these into your favourite computing program and solve them.

Stochastic

The simplest way to add an element of 'randomness' into an equation is to add a random term into the differential equation. This turns it into a stochastic differential equation, such as:

\[\dfrac{dx}{dt}=c+r\]

which determines the evolution of a particle travelling at a speed \(c\), with a random number \(r\) added into the mix just for fun. We choose \(r\) to be normally distributed with a mean value of 0 and a variance of, say, 1. We evaluate the position of the particle at multiple timesteps (with some small time \(dt\) separating them) and plot our solution in the Figure below.

We can do better than this when dealing with systems such as our example in the previous section. We use the Gillespie algorithm to find:

  1. The time taken between reactions
  2. The reaction that took place in that time.

as neither of these could be determined by our initial method [2, 4].

1) To find the time taken, we make an assumption about the form of the probability distribution that the time between reactions holds. It makes sense to think of it as a falling exponential, such that reactions more often happen quickly rather than slowly, and that the more probable a reaction is, the shorter the time between subsequent reactions. This naturally leads to the form

\[P(t)=e^{-\alpha_{0}t}\]

where \(P(t)\) is the probability of any reaction occuring in time \(t\), and \(\alpha_{0}\) tells us something about the likelihood of the reaction. We plot this in the Figure. In fact we call \(\alpha_{0}\) the sum of the propensities \(\alpha_{i}\) (where the dummy variable \(i\) runs from 1 to the number of possible reactions, \(N\), that could occur). In the case of our example system, the propensity for the reaction \(A\) to \(B\) would be \(k_{+}A\) and the propensity for the reaction \(B\) to \(A\) is \(k_{-}B\) so in this case \(\alpha_{0}=k_{+}A+k_{-}B\).

To decide the time it took for a reaction to occur, we sample a random number \(r_{1}\) in the range [0,1] and stick that number on the y axis of the graph. We read off from our graph a corresponding time and we're done - we now know when the next reaction took place.

2) To decide which reaction took place, we need the probabilities of each reaction occuring being proportional to the ratio of their propensities. We split up the domain of 0 to 1 into the normalised ratios of the propensities of our problem, recalling the propensities we calculated. We could have, in this case, selected the backwards reaction B to A. We can generalise this to the form given in the Gillespie Algorithm above.

This is a caption. Let's see what it looks like.

Finite Difference Models

For finite difference models, we set up a grid of positions that we want to solve our equations at, separated by some fixed difference \(dx\) and at time steps separated by time \(dt\). Subscripts of our concentration P represent the spacial co-ordinates of the grid-point, while superscript represents the location in time. We define a constant \(r=\dfrac{dt}{(dx)^{2}}\) which must be less than 0.5 for the scheme to be convergent.

The 1D formula to iterate over for the finite difference model of diffusion is

\[P_{j}^{n+1}=(1-2r)P_{j}^{n}+r(P_{j-1}^{n}+P_{j+1}^{n})\]

The 2D formula to iterate over is

\[P_{i,j}^{n+1}=(1-4r)P_{i,j}^{n}+r(P_{i,j-1}^{n}+P_{i,j+1}^{n}+P_{i-1,j}^{n}+P_{i+1,j}^{n})\]

Similarly, for 3D:

\[P_{i,j,k}^{n+1}=(1-6r)P_{i,j,k}^{n}+r(P_{i,j-1,k}^{n}+P_{i,j+1,k}^{n}+P_{i-1,j,k}^{n}+P_{i+1,j,k}^{n}+P_{i,j,k-1}^{n}+P_{i,j,k+1}^{n})\]

we (arbitrarily) choose our boundary conditions to be continuous at the boundary.

References

  1. Synthetic Biology - A Primer, Paul S. Freemont, Richard I Kenne et al., 2012 Imperial College Press
  2. R. Erban, S. J. Chapman, and P. Maini. A practical guide to stochastic simulations of reaction- diffusion processes, 2007. Available here
  3. Salto R, Delgado A, Michán C, Marqués S, Ramos JL. Modulation of the function of the signal receptor domain of XylR, a member of a family of prokaryotic enhancer-like positive regulators. J Bacteriol. 1998 Feb180(3):600-4. p.601 right column
  4. Brian Ingalls,Mathematical Modeling in Systems Biology: An Introduction, 2012 University of Waterloo
  5. Liang ST, Ehrenberg M, Dennis P, Bremer H. Decay of rplN and lacZ mRNA in Escherichia coli. J Mol Biol. 1999 May 14 288(4):521-38. p.524 right column bottom paragraph
  6. Proshkin S, Rahmouni AR, Mironov A, Nudler E. Cooperation between translating ribosomes and RNA polymerase in transcription elongation. Science. 2010 Apr 23 328(5977):504-8. p.505 table 1
  7. Nelson HC, Sauer RT. Lambda repressor mutations that increase the affinity and specificity of operator binding.Cell. 1985 Sep42(2):549-58. p.552 table 3
  8. Sourjik V, Berg HC. Functional interactions between receptors in bacterial chemotaxis. Nature. 2004 Mar 25 428(6981):437-41.p.439 left column top paragraph