Team:SYSU-Software/Modeling

SYSU-SOFTWARE IGEM 2015

Modeling Overview

Our whole model is mainly considering about the integrated system providing a set of rules and principles to help modify the ODEs of those devices we quote in the software, which we will make further explanations in the latter paragraphs. To make more reasonable and flexible rules, we have studied several papers and nearly 30 ODE models from the iGEM team participants in the past few years, and notice that they have many structures and constructing rules in common. We choose some meaningful devices of those models and adjust their formulae forms in order to systematically apply them to the numerical simulation and circuits building, hence users can invoke them from our database, combine them and accomplish more complex functions. Any codes of the model we used can be found in this Github Page, and details of adjusted formulae of present devices (approximately 50) in database can be seen in attachments(PDF).
Furthermore, users of CORE can build their own formulae in provided form, enter in their own data as parameters-thus a new device of his or her own can be easily created. We also provide users with the automatic function of circuit building and numerical simulation of his new device, as we mentioned in the former sections, if it really works in sense of synthetic biology. In this way, users can easily get reach of the comparison of simulation and his experiment data when parameters change or foresee the circuit production before experiments, which will help avoid the unnecessary steps and testing.

Model Formation

General Rules Applied

Theorem

We’ve studied several papers discussing ODE applications in synthetic biology and dozens of former models of iGEM teams, found that Hill equation, the three forms of which is shown below(cited from Wikipedia[1]), plays the central part in those models:

Parameters and signals are shown below:

This equation describes “the fraction of the macromolecule saturated by ligand as a function of the ligand concentration”, and is used in determining the degree of cooperativeness of the ligand binding to the enzyme or receptor. As our model is mainly aiming at the mathematical simulation of the production of circuits, Hill equation can be a cornerstone, which accords with the former ODE models of circuits of other teams.

Limitations and Assumptions

In our ODE model of certain circuits we only consider about the transcription of mRNA, the translation of protein and the inner promotion or inhibition of other chemicals. We also make the assumption that when a coding sequence promotes or inhibits the production of other substances, it takes its effect through the translation of corresponding proteins.

We choose not to count the influences as below into our models:

·Leakage
For some operons, leakage does exist. For lac operons, repressor protein LacI can bind to the operator of lac operons and repress the transcription of lac operons in the absence of lactose. However, chances are that the concentration of LacI deviates from its normal one and no LacI binds to the operator. Under such circumstance the repressor protein LacI cannot repress the transcription from lac promoter.
However, we can neglect the rate of leakage for the following two reasons. First, the rate of leakage is very low compared with the strength of LacI binding to lac operator.
· The distribution of RNA polymerases, ribosomes and chemicals
The molecular weight of RNA polymerases, ribosomes and chemicals is small compared to the size of a cell, and the number of RNA polymerases, ribosomes and chemicals is limited. So the distribution of these materials in a cell is uneven. The unevenness might affect the binding efficiency of protein and materials, the dissociation efficiency, and the local concentration of one particular materials.
However, metabolism of the cell can reduce or eliminate the unevenness of distribution in cells. Based on the theory of flux balanced analysis, for a particular metabolism pathway the rate of consumption of inputs and the production of outputs is equal. So the concentration of materials in the cells remains (nearly) unchanged. Further, the inner mass of a living cells is under constant movement; this kind of moving can reduce or eliminate the unevenness of distribution in cells.
So we assume that the distribution of RNA polymerases, ribosomes and chemicals are even.
·The growth of cells
Cells will grow in size, and the concentration of materials will change with the growing cells. However, in iGEM teams usually use bacteria as chassis, and E. coli divides every 20 minutes. So the time with which E. coli grows is short, and the growth of cells can be neglected.
·The division of cells
Cell might divide into two, and the concentration of materials will reduce to half of its original. However, the distribution of materials into two new daughter cells is not determined. For example, when the yeast cells divides, the new cells will be much smaller than their mother cells, and obviously the number of different kind of materials in mother cells and daughter cells is different. (Although the concentration can be thought of as remain the same.) Because the division of cells into two cells, no matter whether the mother cells and the daughter cells are of the same size or different sizes, the concentration of these materials remain roughly the same, we can neglect the impact of division of cells on the concentration of two daughter cells.

We also need to point out the limitations of modeling:

· We does not consider the impact of cell types and cell strains on the constant of formulae. As we all know, different kinds of cells contains different concentration of the same materials. So if we consider the impact of cell types and cell strains on the constant of formulae, the formulae will be more accurate. However, due to the limited time we spent on the formulae, it is impossible for the formulae to take this into considerations.
· In our formulae we have considered the impact of binding efficiency and dissociation efficiency of two materials (like, a protein and its ligands). However, as a software team, it is difficult for us to obtain the exact values of the binding constants.
· In the formulae we adopt the Hill functions as the blueprints of our formulae. Although many other kinds of formulae can be adopted, we adopt the Hill functions because they are mostly used by teams of previous years. So, in order to construct a mathematical model that is of universal use, we choose to adopt the Hill equation as the basis of our equations.

Basic Form of ODEs

Formulae Form concerning genes and proteins

We set the basic Concentration-Time ODEs form of a certain protein α as below:

in which ХPromoter is an Eigen function of the promoter of the circuit lining with α. In other words, when the needed promoter exists, the function value equals to 1; otherwise equals to 0. Depending on the judging mechanism, we can write codes to decide the function mode in different situations. Here we only raise one example of Eigen functions, but actually it is widely used in our detailed models for circuit devices, serving for other judging rules. kα represents the production or translation rate of αprotein, and d represents the degradation rate of αprotein. [αgene], [α] mean the concentration of αgene and αprotein. In the basic form we only consider the production and degradation of protein.

Formulae Form of Promotion and Inhibition Relationships

Promotion and inhibition happen in nearly all the circuits, and if we want to design more complex functions of certain circuits, we usually add specific proteins or chemicals, or build up some inter-inhibited structures to control the protein production as expected. When these human-added control are demonstrated in the formulae, they will be acting as the changes of parameters or the concentration of some variables. This kind of promotion and inhibition are shown in fixed form of terms in formulae, mainly based on the Hill equation form. Examples are given as below:
(The complex LuxU promotes the production of LuxUp, chosen from the model of Device: Receptor)

In the relation we set the parameters β as the production rate of LuxUp Protein, as the degradation rate of LuxUp Protein. The inhibition effect from LuxU to LuxUp is defined in the term , in which kcu represents the repression coefficient of LuxU. Obviously, when the concentration of LuxU increases, the amount of LuxUp will correspondingly increase. (The exterior IPTG inhibits the function of LacI protein, chosen from the model of Device: Biomaker)

In the relation we named LacIF to represent the effective portion of LacI protein, by which we in fact lower the concentration of LacI protein. The concentration of IPTG, which is signaled as [IPTG], is set in the denominator to express its effect of inhibition. The parameter kIPTG is used as standardization, to better measure the IPTG effect on LacI Protein. Obviously, the Hill coefficient here is set as 1.
Hereby we clearly state the easiest form of promotion and inhibition relationships between two parts. But sometimes they will be shown more complex but in structured way, when a chemical is influenced by more than one substance, for example:
(The production of CI λ is inhibited simultaneously by FAdP and Rhamnose, chosen from the model of Device: Kill Switch)

Likewise, we use the form of Hill equation to indicate the relationship between FAdP and CI λ, Rhamnose and CI λ. When FAdP does not exist, the term of FAdP will equals to 1, which does not make influence on the whole function, so does Rhamnose. The multi-one relationship is demonstrated as the product of terms of each factor. If there exists chain reactions in devices (actually it is very common), our measures is to make simultaneous equations to reach the combined solution, which is an important method when dealing with the multi-relationships in devices. Further can be described in details in the example subpage.

We have to mention the situation when a promoter is promoted or inhibited by chemicals, as it actually occupies a significant part of our formulae of devices. We acquiesce that the influence will be forced on the translation and transcription of coding sequences that link with the promoter, by lowering the effective gene amount in functions. Also, as we mentioned before, there will be an Eigen function of the promoter, in order to reach the automatic recognition of it.

Explanation for our work

Our major work in modeling is to rewrite the models of present devices under our mechanism, which includes nearly 50 device models and its related formulae. All the models can be seen in the attachments (PDF). The formulae can be divided into two parts:ones concerning only two parts, while others expressing the variation of the concentration of one central variable and all other linked factors. Each formula may get its distortions considering the multi-solutions whether those influencing factors exist, all of which are calculated and stored in our database. With the former ones our software can automatically draw the circuit graphs in design field, and the latter ones are widely used when users are making numerical simulation of the circuits. We also take into consideration that the devices are put together and users may need to add some extra relations between devices. As we stated before, we already have independent formulae systems for each device, so the software will recognize and put all the related equations together, then finding the combined solutions using SciPy. (More details about the recognition process, see the algorithm in Back-end page.) If users want to add their new relations to the circuits, they need to enter the accurate equations and parameters with their values and units in the form we required, so that the front end can add new lines to the circuit graph, and the back end will store the new equations if succeed, which is also a part of crowdsourcing in the software. Due to the limited time, we have not finish the function to show the specific properties of analytic solutions of those simultaneous ODE equations; we only give their numerical solutions but that is enough to forecast the possible trend of concentration of certain parts. Turn to the example page

Simulation Function

Here we briefly introduce the numerical simulation function in modeling field of our software. Below is a demo of the simulation of a given model, in which concentration-time curves of chosen parts are shown in the same graph. In this graph, CI, UVR8-TetR, GFP, TetR, YFP and UVB curves are drawn in the selected time period.

In the same operating area, users can define the simulation time and initial quantities of the necessary variables in equations. If there is no inputs, all the conditions will be set as fixed values in the software.

Searching Function of Equations

Users can find all the possible equations now existing in the database through the searching function in design area, if it does have. The choosing buttons and answer demo are shown below:

Self-defined Function of Equations

In the design area users can add their self-defined equations, if there is no existing ones, which will be stored in the shared database so that all the users can invoke them. NOTE that the inputs should meet the requirements (rules) in the guidance of Help. Correspondingly, the circuit graph in design function will be updated along with the new relationship lines.