Team:Freiburg/Results/Modeling

""

Modeling

Life as Complex Biochemical Networks

Nowadays, more and more information on biochemical networks is collected, and a lot is already known yet. Understanding biochemical networks is in general important for understanding signaling pathways within organisms, especially regarding medical health issues. Furthermore, it is needed for building new networks from scratch using synthetic biology.
However, the more components a network involves, the harder it gets to estimate how a complex network reacts to changes from both the inside and the outside. This is especially a problem, if entities change both in time and space. Experimentally studying all different components of a network for different possible conditions often is not only time-consumptive but also very expensive and and prove often to be impossible at the desired level.
Mathematical modeling of biological networks therefore is a powerful tool to support experimental research, predicting the behavior of networks both concerning time and space. These predictions can then be validated experimentally.

Two crucial biological processes within the DiaCHIP are cell-free expression and the diffusion of proteins to the specific chemical surface. Cell-free expression limits the DiaCHIP concerning amount and production rate . Protein diffusion limits the system regarding space thereby determining the maximum protein spot density one can still resolve. Therefore, we decided to model both cell-free expression and protein diffusion to predict the behavior of protein concentration in time and space.

With a model like this, it is possible to find the minimum time needed to get a sufficiently high concentration of protein for antibody detection. This reduces the time needed for diagnosis using the DiaCHIP. Furthermore, not only protein, but also other molecules' concentrations can be simulated. Therefore, bottlenecks within the biochemical system as well as limiting substrates can be predicted. Specifically increasing the concentration of such limiting substrates as well as substances involved in bottleneck processes may increase not only the overall protein yield but also speeds up synthesis.
Moreover, a cell-free expression model has huge potential for application on all systems dealing with protein expression in general. Although designed for cell-free systems, it represents the central dogma of biology and therefore mathematically describes a lot of processes involved in many biochemical networks.

Overview

Our System consists of the three biochemical and biophysical processes cell-free transcription, cell-free translation and diffusion of the protein to the chemical surface, as shown in figure 1. With our model we aim to simulate cell-free expression to predict the amount of protein synthesized during the course of time. We set up a system of ordinary differential equations (ODE System) based on the law of mass action, describing the kinetics of the single processes occurring during transcription and translation. Starting from a detailed description including 95 different entities and 100 different parameters, we defined additional assumptions to simplify our system to a total of 65 entities and 44 parameters. Kinetic constants and all other parameter values needed were researched in literature. If no values were found, appropriate values were suggested. As the final model can not be solved analytically, it was solved numerically using Mathematica. [REFERENCE]

Figure 1: Overview of the DIA-Chip as biochemical system.

Model Decription

1. Transcription

Transcription describes the processes of mRNA synthesis from the DNA template, the first part of the central dogma of biology, as shown in figure 2. Unspecific binding is also displayed here. The detailed system inherits 33 different entities described by their corresponding ODEs and 31 parameters. Using a set of XX assumptions, the system was simplified to 14 entities and 10 parameters.

Detailed System

Figure 2: Reactions and processes inherited in the biochemical network of transcription. Unspecific binding is displayed. The nomenclature of the single entities' and parameters' abbreviations shown is given in the section 'nomenclature'.

[UNDERLYING ASSUMPTION] - DONE
[NOMENCLATURE] [ODE SYSTEM] - DONE

Simplified System

[UNDERLYING ASSUMPTION] - DONE
[NOMENCLATURE]
[ODE SYSTEM] - DONE

2. Translation

Translation describes the processes of protein synthesis from the mRNA template, the second part of the central dogma of biology, as shown in figure 3. Unspecific binding is neither displayed in the figure not included into the mathematical description due to the sheer amount of entities and parameters inherited. The detailed system consists of 62 different entities described by their corresponding ODEs and 69 parameters. Using a set of XX assumptions, the system was simplified to 51 entities and 34 parameters.

Detailed System

[PICTURE] - DONE

Figure 3: Reactions and processes inherited in the biochemical network of translation. Unspecific binding is not displayed. The nomenclature of the single entities' and parameters' abbreviations shown is given in the section 'nomenclature'.

[UNDERLYING ASSUMPTION] - DONE
[NOMENCLATURE] [ODE SYSTEM] - DONE

Simplified System

[UNDERLYING ASSUMPTION] - DONE
[NOMENCLATURE] [ODE SYSTEM] - DONE

3. Diffusion

3.1 Introduction and Motivation

In the final step of cellfree expression proteins being produced are diffusing inside the microfluidic chamber. We modeled an ideal case to provide a tool:
On the PDMS slide spots of bound DNA produce proteins with steadily decreasing production rate. The product is distributed homogeneously on the spot and starts diffusing freely in the cell-free mix. Furthermore besides convection through gravitation any interaction is assumed to be negligibly small. The coated iRIf glass is expected to be an ideal sink; any proteins reaching the slide are bound and therefore do not contribute to diffusion anymore.

What knowledge did we want to gain by modeling?
- Time optimization: When is the most efficient time to stop the expression?
- Product optimization: How much of the totally produced proteins does bind to the surface?
- Spot distance optimization: How is the bound protein distributed on the glass slide?

In order to achieve this we constructed the following system.

3.2 Model System

Effective modeling depends on wisely chosen assumptions and a minmal complex system. The main assumptions we considered can be seen in chapter 3.4 and will be referenced by A1,A2, ...

Assume the following setup: A microfluidic system is given consisting of a spotted PDMS slide (multiple spots) and the chemically treated iRIf slide. Yet instead of observing this system of multiple spots we reduce our problem to a cylindrical area around one spot. This area has a radius of L and still the same width W as the whole system. Ideally the spot is a perfect circle which allows us to reduce the problem to a twodimensional one: No matter from which angle the cylinder is observed the identical behaviour is seen. Therefore a crosssection through it provides enough information.

(PIC: Freiburg_MOD_IM_01_Simplification) Illustrated above is a crosssection of the system around one spot (A1). Due to assumption A2 the number of geometrical degrees of freedom can be reduced to 2.

While the expression is done the system lies as in the picture shown above, parallel to the ground. Therefore we included in our model system possible movements due to gravitation. The analytical equation describing this process is the diffusion convection equation:

(Formel: Freiburg_MOD_01_PDE)

It is an inhomogeneous parabolic partial differential equation (PDE). The diffusion constant D depends on the media and materials involved in the system, the inhomogenity describes sources of the system. The velocity vector v is assumed to be a possible sedimentation velocity (A4) and therefore a constant. The initial and boundary conditions are the following: The PDMS slide as well as the iRIf slide are closed surfaces while the left and right edges of the system are not. On the PDMS slide proteins are produced by a logistic function (A5). The solution to this system is only numerically solvable. Therefore it has to be prepared mathematically by discretizing the whole setup. Finally we used the Crank-Nicholson approach on a finite-differences-method to solve this system.

3.3 Numerical approach

Let us start by describing the crosssection in terms of mathematics: It is a subset M of the set of twodimensional real vectors. Of this subset we define two more subsets S and B describing the DNA spot and the glass slide respectively.

(For_18)

Two time &\delta; is the spot width, its thickness is ε. The thickness of the binding area is ξ yet we choose them to be the same (A6). Next we have to simplify our PDE: It consists of three terms on the right side which can be (in our case) shortened to the following components:

(For: Freiburg_MOD_25_Diff_Simp)

The next step is the change from a continuous to a discrete system: First we describe our concentration function as one depending on three integers (places x and z, time t):

(For: Freiburg_MOD_09_Disc )

Also, instead of handling every component individually we define a concentration vector containing all concentration values of M at a static time step:

(For: Freiburg_MOD_13_Conc_Vec )

We do the same for our source by defining a production vector at the time t:

(For: Freiburg_MOD_26_F_Vec )

Why is this done? Because we can split our grid into a connected line of rows. The first row of the grid then would be given by the first components of the vector, followed by the second row and so on.

(Pic: Freiburg_IM_02_Mat_To_Vec )

Next we discretize the derivations of the PDE by exchanging the differentials by finite differences. In this case differentials can be seen as the rate of change in one (x-,z- or) t-step (secants so to say). For a first and second derivation this would be written as:

(For: Freiburg_MOD_06_Taylor )

Using the integer multiplicated step values for our PDE components this results in:

(For: Freiburg_MOD_10_Diff_Expan )

Logically this method is called "finite differences method". For conveniences sake two parameters λ and μ are defined and used in the upcoming simplifications:

(For: Freiburg_MOD_11_Lambda_Mu )

There are different approaches with the finite differences method to approximately solve the PDE. We decided to use an ansatz by Crank and Nicholson which has the following form: A functions change in one time step is determined by the average of momentary and following time step of the right hand side of the PDE. Or, in mathematical terms:

(For: Freiburg_MOD_07_Crank_Nicholson )

Combining the PDE parts and splitting the concentration values into following and momentary time steps the problem can be solved by solving a linear system of coupled equations in place components of the concentration:

(For: Freiburg_MOD_12_CN_Scheme )

This is an iterative method: The values of the upcoming time step are calculated by the ones from the former time step. The equations cannot be solved independently from one another. In a grid of NW rows and NL columns this a system of NW times NL equations that have to be solved every step.

What can be seen in the scheme above is the underlying geometrical ansatz: Diffusion and convection are done only with direct grid neighbours (two vertical and two horizontal ones). Logically this holds true only for non-border grid points. For the systems borders our boundary conditions have to be considered which manifest in the following conditions:

(For: Freiburg_MOD_19_Boundary )

Inserting those in the main Crank-Nicholson-scheme leads to the border descriptions:

(For: Freiburg_MOD_20_Boundary_Eqs )

Combining two border conditions leads to corner conditions which have to be considered too. They follow the same calculation yielding:

(For: Freiburg_MOD_24_Corner_Boundaries )

Being a linear system of equations this problem can be replaced by a short form using matrix-vector multiplication (hence the definition of the vectors from before):

(For: Freiburg_MOD_17_CN_System )

Where the two matrices A and B consist of the factors accompanying the concentration values. They are rather large and are sketched below:

(For: Freiburg_MOD_21_Matrix_A )

(For: Freiburg_MOD_22_Matrix_B )

This concludes the theoretical part of the diffusion. The next step was to implement this system using python 3.4.2 including numpy and matplotlib. Source codes can be obtained here:
LINK -> Python file (Data) LINK -> Python file (Animation)
The first file produces two output files. One of them contains the protein distribution on the glass slide at the finishing time. The other one consists of the total bound protein at the surface of the slide. The second file returns pictures of the ongoing diffusion giving an insight of the diffusion process in the crosssection system.

3.3 Results: For once we made an animation of the diffusion using exemplary values. You can see a video of the diffusion process down below:
VIDEO
Also we checked the protein binding for different spot widths: 20%, 40%, 60% and 80% of the total crosssection width. While the binding distribution returned gaussian profiles the total binding like the production

3.4 Assumptions

A1: The spots are equally distributed and distance to the borders of the microfluidic chamber is at least one spot-to-spot distance.

A2: The spots are perfect circles and the binding site at the iRIf glass is homohegeneous -> Around one spot cylindrical symmetry is given.

A3: The area of produced proteins is as thick as the one of bound ones -> χ = ε

A4: Movement by virtue of gravitation is considered to already be in a stationary status (constant movement).

A5: Protein production is described by a logistic function with a minimal value close to zero and maximum value of 100 (percent).

A6: ε = ξ = Δ x = Δ z

Results

1. Transcription and Translation

We wanted to test our model's ability to predict the concentrations of the cell-free expression biochemical network entities and to adapt to different conditions. Given this properties, our model has the potential to reveal limiting resources and bottleneck reactions and to test optimized parameter settings for the resulting effects on the biochemical network behavior.

We run the model using the default starting concentrations and parameter values as given in the following, tracking the concentration time course of free RNA polymerase, free 30s and 50s ribosomal subunits, free 70s ribosome, free amino acids, free tRNA, free initiation factors, free elongation factors, free nucleotides, free mRNA and free protein, to examine the network's predicted behavior. Afterwards, we tracked the protein and mRNA concentrations in three more simulations, changing one entity starting concentration each, to examine the network's predicted reaction and compare it to the reaction expected.

Network Behavior Under Default Parameter Values

Network Reaction To Changed Parameter Values

expected: low RNAP: slower high RNAP: faster observed: expected: low 70s: slower high70s: faster observed: expected: low GTP: slower and lower high GTP: faster and higher observed:

Summary and Model Optimizations

Our model describing cell-free expression, including 65 entites and 44 parameters, showed the ability of both predicting network behavior and network reaction to changed conditions. Although the model's predicted output does not fit to real cell-free expression yet, there is a huge capability of making the model more specific. Using the set of assumptions defined, the model can be extended step by step, bringing it closer to the cell-free expression biochemical network.