Team:Tianjin/Modeling

内页

HOME
PROJECT
- DESCRIPTION
- RESULTS
- DESIGN
- PROTOCOLS
- FUTURE WORK
PARTS
MODELING
COLLABORATIONS
SAFETY
HUMAN PRACTICE
NOTEBOOK
TEAM

Overview

This year, Tianjin-iGEM makes a good combination of experiment design and modeling and solves all kinds of problems we meet.

What’s the best condition of the two-phase extraction experiment? Do not need compare experiments results in different conditions any more. Using only few data, we can calculate the exactly best condition through computer programs and give great help to our experiments.

How hydrophobin self-assembles on the surface? You really do not have to use electron microscope. See how we propose a physical model using mathematical method and how the model fits well with the experiment data.

Why the application of hydrophobin accelerates the cutinase catalyzed PET hydrolysis? The classic model won’t work. Novel kinetic models are made and it is amazing that we make it visible using computer software.

Mathematical model and computer stimulation give us powerful method to predict, improve and direct our experiments.

Adsorption

We decided to create a model of Janus adsorption for two main reasons:

Describing the self-assembly process of Janus on the solid substrate or air/water interface

To support and complement enzyme kinetic modeling as the basic part of heterogeneous kinetics

Model Formation

As literature review, we know that Class I Janus are able to self-assemble at an air/water or water/oil interface, forming a robust, electron-microscopically identifiable monolayer characterized by a rodlet structure. Yu et al. studied the formation of rodlets of the inJanus from Grifola frondosa. They used a Langmuir trough and found that rodlets formed at the air-water interface during compression of a surface film of inJanus. de Vocht et al. repeated the experiment to have an image of rodlet structure after drying an aqueous solution with scanning force microscope. As for Class II hydrophobins, HFBI, HFBII and HFBIII from T. reesei have been shown to form films with a self-assembled hexagonally ordered structure (Paananen et al. 2003; Kisko et al. 2005; Kisko et al. 2007). Multilayer Langmuir-Blodgett films, HFBI and HFBII proteins formed hexagonal crystallites (Kisko et al. 2005.) Self-assembled hexagonally ordered films of HFBIII on an air/water interface and on a silicon substrate were also studied (Kaisa Kisko et al. 2007)

As evidence given above, the pattern of two classes Janus are totally different.

Class I Janus

A key difference between class I and class II Janus is that only class I Janus form amyloid-like rodlets. Ingrid Macindoe et al. compared three types of EAS: wild-type EAS, EAS_Δ15(a truncated version of EAS that lacks 15 residues from the disordered Cys3–Cys4 loop) and F72G EAS_Δ15(F72 mutate to G) by nuclear magnetic resonance(NMR) spectroscopy. They also carried out molecular docking simulations using the data-driven docking program HADDOCK. Different sets of intermolecular hydrogen bond restraints between neighboring monomers over residues S71-I75 in both parallel and anti-parallel arrangements were tested. The width of the rodlet generated in this way is approximately 65 Å, which agrees very well with the measured width of approximately 61 Å for EAS_Δ15 rodlets(Ann H Kwan,2008)(Figure 1)

Figure 1. Ribbon diagram of EAS_Δ15hexamer in gray showing packing of side chains in the β-spine core and location of charged residues. Side chains of F72, L73, and I74 are shown as purple sticks, N67 as green sticks and T66 and T68 as orange sticks. Red and blue spheres denote positions of Cγ and Cε atoms of Asp an Lys side chains, respectively. Horizontal line illustrates that the majority of charges are located on the “bottom” half of the structure (corresponding to the hydrophilic face).

So, as the data obtained from MD simulation above, the diameter of monomer is 27 Å.

Figure 2. Model of the rodlet unit by overview. Two circles simplify hydrophobins as regular spheres. The connection between dimer is the β-barrel unit.

To figure out the stoichiometric relationship between Janus and surface site, we calculate area of Janus and a unit respectively. We assumed the size of a Janus is far larger than a single surface bound site.

If regarding dimer as a unit (Figure 2), the area of two Janus(S_H)= 1145 Å², and the area of rectangle unit(S_A) = 1755 Å².

So, the elementary reaction equation is simplified as,

Parameters	Meaning
H	Hydrophobin(Janus)
A_s	Active site on the solid surface
k₁	Associated constant
k-₁	Dissociated constant
θ	fractional occupancy of the adsorption sites
τ/τ^∞	Surface concentration of adsorption/ Maximal surface concentration

Hypothesis:

(1) Each site can hold at most one molecule of Janus(mono-layer coverage only)

(2) The surface containing the adsorbing sites is perfectly flat plane with no corrugations (assume the surface is homogeneous)

(3) The adsorption process is reversible

(4) There is an H-bond interaction in the β-barrel forcing to form rodlets.

When adsorption rate equals to desorption rate,

,

simplify it and solve θ, , where

The adsorption isotherm equation is,

Class II Janus

The high resolution of the AFM images from both the air-facing side and the water-facing side enabled comparison of the repeating structures in the AFM images with the dimensions of an HFBI monomer.

Figure 3. Model of the structure of HFBI film. The film viewed from the hydrophobic side (A) and from the hydrophilic side (B). (A) and (B) are processed AFM images showing the average repeating structures in a single crystalline area. In the middle panel and expanding to both sides is a model of the monomer arrangement in the film(Géza R. Szilvay,2007)The model in figure suggests that three monomers forming a trimer are arranged to form a hexagonal pattern.

Figure 4. Simplifying model of the structure unit of class II hydrophobin film. Full line in the circle is valid hydrophobin in one unit and the hexagon is a cell of crystal monolayer film.

To figure out the stoichiometric relationship between hydrophobin and surface site ,we regarded hexagonal cell as a unit. (Figure 4)The area of six effective hydrophobins (S_H)= , and the area of hexagonal cell (S_A) = R.(R is the radius of ideal hydrophobin sphere)

The elementary reaction is as follow,

Parameters	Meaning
H	Hydrophobin(Janus)
A_s	Active site on the solid surface
k₁	Associated constant
k-₁	Dissociated constant
θ	fractional occupancy of the adsorption sites
τ/τ^∞	Surface concentration of adsorption/ Maximal surface concentration

Hypothesis：

(1) Each site can hold at most one molecule of Janus (mono-layer coverage only)

(2) The surface containing the adsorbing sites is perfectly flat plane with no corrugations (assume the surface is homogeneous)

(3) The adsorption process is reversible

(4) There is an interaction between adjacent Janus forming hexagonal crystalline structure.

When adsorption rate equals to desorption rate,

simplify it and solve θ ,

The adsorption isotherm equation is,

Parameter Finding

We turned to the literature to find parameters for our model, given in the Table below. In the past few years, M. Linder el at. were doing well in the field of adsorption of Janus. We looked for parameter values that had been measured in the work M. Linder el at. did, but the substrates they used were alkylated gold surface, salinized surface, polystyrene surface, etc. while what we use is PET(polyethylene terephthalate), which has different surface property. Therefore, we estimated parameters according to the data in their assays on fusion protein. Such aggregating parameters is somewhat uncertain endeavor; thereby predicting a range of the parameters and7 preparing it with the practical experiments. An explanation for how we arrived at each parameter is given in the table.

ADSORPTION PARAMETERS

	Parameter	Value	Description	Source/Rationale
1	τ∞,HFBI-EGIc	0.28μmol•m-2(1.33μg• cm-2)	Maximal surface concentration of sJanus-EGIc fusion protein	Determined base on fitting data into the first-order Langmuir isotherm, as reported in [7].It is the binding isotherm of EGIc- sJanus fusion protein to silanized glass.
1	K_d,HFBI-EGIc	21.8μg•ml-1	Desorption equilibrium constant of sJanus -EGIc fusion protein
2	T∞,HFBI	0.21μg• cm-2	Maximal surface concentration of sJanus	Estimated according to the mass relationship between sJanus and fusion protein sJanus -EGIc, which is based on the assumption that the adsorbing and desorbing ability of sJanus keep same as fusion protein.
2	Kd,HFBI	21.8μg•ml-1	Desorption equilibrium constant of sJanus
3	_T∞,HGFI	0.224μg• cm-2	Maximal surface concentration of sJanus	Estimated according to an assay[8],with series of pH resulting in different adsorbance; thereby there is a range of desorption constant
3	K_d,HGFI	49~280μg•ml^-1	Desorption equilibrium constant of sJanus

As an attachment, we read almost all the assays about sJanus and inJanus adsorption in the paper published and we found diverse conditions in every assay, such as different substrates: alkylated gold, polystyrene, silanized glass, 1-hexanethiol etc. Hence, it is not easy to estimate the certain parameters when the Janus self-assemble on the PET. NO.1 parameters in the table is in the condition where the fusion protein adsorbs on the silanized quartz surface attained from the paper and we estimate the parameters when the protein is pure Janus by ignoring the conformation change during the protein fusion. Besides, theoretically it only applies to silanized surface but we initially think it applies to all hydrophobic surfaces. Further, we used another assay[8] data to estimate the parameters when inJanus adsorb on PET although in this case they used polystyrene as the substrate.

Initial Model Result

Using these estimated parameters, we simulated our model of the adsorption process. Below are two Janus, sJanus and inJanus, have different adsorbance with different concentration of free Janus. On the left we simulated a red line and a blue line, a range of the adsorption isotherm, which corresponds to pH=4 and pH=10, respectively. On the right is sJanus adsorption isotherm. It is convinced that no significant effect of temperature is noted; hence we do not make any note about temperature on the line graph.

In the initial model, it is indicated that two classes of Janus have similar ability of self-assembly. They both have strong hydrophobic area stuck to hydrophobic substrate.

Result in Lab

We set up this model to predict the quantity of absorption wild Janus and fused Janus. We have done the PET hydrolysis experiment with Janus or fused Janus existing, however, we did not have enough time to accomplish further discussion and parameter fitting which we planned to study with QCM(Quartz crystal microbalance) and SPR(Surface plasmon resonance). Despite that, we are working to make it more clear and specific in the future.

Reference

[1]Wang, X., Graveland Bikker, J., Kruif, C. & Robillard, G. Oligomerization of hydrophobin SC3 in solution: From soluble state to self-assembly. Protein Sci. 13, 810–821 (2004).

[2]Sunde, M., Kwan, A., Templeton, M., Beever, R. & Mackay, J. Structural analysis of hydrophobins. Micron 39, 773–84 (2007).

[3]Géza R. Szilvay. Self-assembly of hydrophobin proteins from the fungus Trichoderma reesei.(2007)

[4]Kisko, K. et al. Self-assembled films of hydrophobin protein HFBIII from Trichoderma reesei. J Appl Crystallogr(2007). doi:10.1107/S0021889807001331

[5]Kwan et al. Structural basis for rodlet assembly in fungal hydrophobins. Proc. Natl. Acad. Sci. U.S.A. 103, 3621–3626 (2006).

[6]Kwan, A. H. et al. The Cys3-Cys4 loop of the hydrophobin EAS is not required for rodlet formation and surface activity. J. Mol. Biol. 382, 708–20 (2008).

[7]Linder, M., Szilvay, G., Nakari Setälä, T.Söderlund, H. & Penttilä, M. Surface adhesion of fusion proteins containing the hydrophobins HFBI and HFBII from Trichoderma reesei. Protein Science 11, 2257–2266 (2002).

[8]Wang, Z. et al. Hydrophilic modification of polystyrene with hydrophobin for time-resolved immunofluorometric assay. Biosens Bioelectron 26, 1074–9 (2010).

[9]Takatsuji, Y. et al. Solid-support immobilization of a ‘swing’ fusion protein for enhanced glucose oxidase catalytic activity. Colloids Surf B Biointerfaces 112, 186–91 (2013).

[10]Takahashi,T.et al.Ionic interaction of positive amino acid residues of fungal hydrophobin RolA with acidic amino acid residues of cutinase CutL1. Molecular Microbiology 96,14-27(2015)

PET Degradation

0. Abstract

Modeling of PET degradation consists of three parts.

First, a heterogeneous kinetic model is built to better describe the enzyme catalyzed hydrolysis of PET by combing the Michaelis-Menten equation with the Langmuir equation. Compared with the basic Michaelis-Menten model, we take a adsorption process into account noticing the PET is generally insoluble in the system (a heterogeneous system) and the enzyme should first combine with the enzymatic site then catalyze the hydrolysis.

Second, we propose novel models to describe the effect of the application of Janus into the hydrolysis system based on the hypothetical elementary reactions using the result of self-assembling model and the novel kinetic model in the first part. There are three different ways to apply Janus-before the addition of the cutinase, mixed with the cutinase and fused with the cutinase, so we propose three different models with similar forms so that we can make a comparison easily. The model show that fusion of cutinase and Janus has the best catalyze efficiency.

Third, to visualize the result of our modelling, we wrote a program using MATLAB which simulate the degradation process of the hydrolysis. We use Stochastic Simulation to describe the collision of molecules and set different rates to represent the adsorption percentage. The program can used in different situations by adjustment of the parameters. The program can stimulate as well as predict and instruct the experiment well.

In conclusion, we use math, physical and program models to describe, predict and instruct our project.

Here are the details of the modeling

1 Kinetic model

1.1 classic Michaelis–Menten model

In biochemistry, Michaelis–Menten kinetics is one of the best-known models of enzyme kinetics. The model takes the form of an equation describing the rate of enzymatic reactions, by relating reaction rate v to [S], the concentration of a substrate S.

The reaction are assumed as

Where E the enzyme, S substrate, ES the enzyme-substrate complex and P the reaction product. K1 k-1 k2 are all reaction constant

We assume that the reaction reaches an equilibrium state soon and the ES concentration do not change in the reaction

If the enzyme concentration is E0 when the reaction begins

The rate of the reaction is

Where called Michaelis-Menten constant.

Biochemical reactions involving a single substrate are often assumed to follow Michaelis-Menten kinetics, without regard to the model's underlying assumptions. However, Michaelis-Menten equation predicted results are quite different from the experimental results.

1.2 heterogeneous kinetic model

Why is there such a phenomenon? After a literature review we know the reason. We use slices of PET in our experiment, which is seen insoluble. So the reaction system contains two phase, the solid PET, and the solution, which is called a heterogeneous system. So the reaction process is different from what we call a homogeneous system. We make an assumption that the enzyme Hydrolytic enzymes can only catalyze surface PET. So there are two process, first absorption and then catalyze

where is the surface of the insoluble substrate.

The equilibrium adsorption reaction involves a solid substrate and can be expressed as

where As is the substrate surface area, is the fraction of substrate surface occupied by the ES complex, (1- ) is the “free” surface fraction, and E0 is the initial enzyme concentration.

The theta parameter is derived:

where K= k1/k-1 is the adsorption equilibrium constant.

For the hydrolysis reaction, the proposed model gives the following rate equation:

Whose linear form is

It can be used to easily derive adsorption (K) and hydrolysis rate (k₂) constants from the slope and intercept of the straight line fit to the experimental rate results.

In fact, there is another model ^[1] .It was assumed that the adsorption of the enzyme onto the surface of the substrate obeyed a Freundlich isotherm of the form

Where K is the Freundlich adsorption constant. For adsorption of enzyme from a solution phase onto a two-dimensional phase, n was predicted to be 2/3. Because the rate of the degradation reaction is directly proportional to [EAs] the velocity law can be written as

Here k’ is the velocity constant.

The first model actually uses the same derivation process as Langmuir adsorption model. Inherent within this model, the following assumptions are valid specifically for the simplest case: the adsorption of a single adsorbate onto a series of equivalent sites on the surface of the solid.

1. The surface containing the adsorbing sites is perfectly flat plane with no corrugations (assume the surface is homogeneous).

2. The adsorbing gas adsorbs into an immobile state.

3. All sites are equivalent.

4. Each site can hold at most one molecule of A (mono-layer coverage only).

5. There are no interactions between adsorbate molecules on adjacent sites.

The Langmuir adsorption model is widely used in all kinds of adsorption process.

The second model uses Freundlich adsorption model. As this relationship is entirely empirical, in the case where adsorption behavior can be properly fit by isotherms with a theoretical basis, it is usually appropriate to use such isotherms instead (see for example the Langmuir and BET adsorption theories). So in our experiment we choose the first model.

1.3 parameters finding and model fitting

We turned to the literature to find parameters for our model, given in the Table below.

enzyme	Parameter	Value	Description	Source/Rationale
FsC	[s0]	13miu M	Surface concentration	Determined for PET hydrolysis at 40 degree^[2]
	K	0.41miu M ^-1	Adsorption constant
	k2	0.09miu mol/cm2/h	Reaction rate constant
Thc-Cut1	KM	1.5+-0.1mM	Michaelis-Menten constant	Determined for PNPA(soluble) hydrolysis ^[3]
	Kcat	436s^-1
	KM/Kcat	291
LC	The amount of PET degraded by LC-cutinase at pH 8.0 and 50°C was1.45 mg after incubation for 24 h.^[4]

In our experiment, we use three kinds of cutinase, unfortunately we can find few data fit for our model.(cutinase catalyzed hydrolysis of PET is a new area and there are not so many experiment carried out on it/ many experimenter just uses traditional M-M equation to fit their experiment data )

Parameters of FsC is well-documented and can be used in our model

Here are the result of our model for FsC

2 Janus applying model

2.1 apply Janus before the addition of the cutinase

First, we add hydrophobin(Janus) into the PET system, and the reaction is

With H the hydrophobin, As the surface of the PET

And the hydrophobin will self-assemble on the surface of hydrophobin

From the self-assembling model we can know that different kinds of hydrophobin has different kinds of equations

Take class II hydrophobin as an example,

The surface concentration of hydrophobin is

With the max surface concentration of absorbed hydrophobin，kd the dissociation constant

Then we add cutinase into the system , the reaction can be concluded as

the cutinase will have some ionic interaction with hydrophobin and the reaction is stronger than that PET do^[5]

so we assume that most enzyme will be absorbed by hydrophobin, and then react with PET

in the equilibrium state, the concentration of is steady, which means

if the initial concentration of cutinase is E0

then

Where the K=K-1+K2/K1

so the rate equation is

in fact , the ratiocination process is similar to that of Michaelis–Menten equation

for class one hydrophobin, the rate equation is

for class two hydrophobin, the rate equation is

2.2 hydrophobin fused with cutinase

First, we fuse hydrophobin with cutinase and we get (HE) .we will use X to refer to it

Then, we add it into the system, the reaction process is concluded as follows

Here we can see x as a new enzyme, which means it has the same function as cutinase but with different characteristics

Unlike the first model, here we use the heterogeneous model to describe the process

That is because PET is a solid phase without a concentration

Where k’=k1’/k-1’

2.3 parameters discussion and model fitting

parameter	discussion
K₁ k_-1K	hydrophobin self-assemble on the surface of the PET and cutinase will adsorb on the surface more easily. So k1 will be greater and k-1 will be smaller than those of non-hydrophobin system. K of a non-hydrophobin system is 0.41miu M-1( Fsc) and K here will be greater than that The kd between CutL-1 and hydrophobin RolA is 6.5nM[5]
K₂ k₂'	Since the core reaction is the same—cutinase catalyzed hydrolysis of PET , so we think the k value doesn’t change
K'	It is a main reflection of the characteristic of hydrophobin, so we think k is the same with that of hydrophobin

3. Visualization and programming

3.1 description

To make our result more direct. We use MATLAB to visualize it.

The main idea is to simulate the degradation process of the hydrolysis. To simplify our model, we use a two-dimensional plane with certain area to stimulate PET surface. Besides, we look enzyme and Janus as a point since they are so small compared to the size of PET. In our program, the adsorption is thought to be collision. We use Stochastic Simulation to describe the collision of molecules and set different rates to represent the adsorption percentage. By introducing time dimension we can predict the degradation percentage with plots and pictures.

Here are some basic assumptions about our model

1. The PET surface is treated as a 2D grid (of size n x n)

2. Each grid may be occupied by at most one cutinase or Janus

3. Randomly generating grids means collision of molecules, a ratio will determine how many percentage of these grids will be adsorbed

4. If cutinase is added after Janus, there will be another ratio determine how many cutinase will be absorbed by Janus

5. If cutinase is fused with Janus

What’s more, the program allow users set different rates including the adsorption rate, the reaction rate and the lifetime of enzyme so the program will have a wider application range. The program can be used in different situations by adjustment of the parameters. The program can stimulate as well as predict and instruct the experiment well.

3.2 code

Here are some basic function in our program written by Tsingua-A iGEM.

They have written detailed note to make our program readable.

function dots_e=rdm(n,number)

% rdm produces binary coordinate randomly

% (number,n),where number is the number of coordinates

% you want and n is the n-by-n square surface

%

% Calling sequence:

% array=rdm(number)

% Define variables:

% number --Number of the dots that is absorbed with cutinase

% n --N-by-n surface

% dots_e --The chosen random points that will react with cutinase

disp('The coordinates of degradable points on the simulated surface is as follow')

dots_e=randi(n,number,2);

function combine_dots_bycut=filter_dots(number_h,number_h_cut,dots_h)

% Filter_dots is to filtate a certain array into smaller one randomly,but

% with a certain filtration ratio.

% Calling sequence:

% new_array=combin_cut(number)

% Define variables:

% number_h ---The number of locuses adsorbed by hydrophobins

% number_h_cut ---The number of locuses adsorbed by hydrophobins and

% cutinase

% ratio_h_bind_e ---The ratio that enzyme binds to the hydrophobin of all

% dots_h ---The coordinates of locuses adsorbed by hydrophobins

% i ---The number of locuses adsorbed by hydrophobins and

% cutinase after filtration

% temp_number ---The number array generated randomly in the range of number_h

%

temp_number=randperm(number_h,number_h_cut);

i=1;j=1;

for i=1:length(temp_number)

for j=1:2

combine_dots_bycut(i,j)=dots_h(temp_number(i),j);

end

And here are our main program:

function start_no_h

% Start_no_h is to guide customer to generate a random simulation

% We simulate a n-by-n surface with which the ideal enzyme particles collide.

% This process simulates the behavior of the adsorption of the PET cuticase

% on the PET surface.

% By setting the parameters (the size of surface,the ratio of enzymes

% collision and probabability of chemical reaction)we get the number of

% degradable locuses and each locus's coordinates.

% Calling sequence

% start_no_h

% Define variables:

% n --The n-by-n active locus surface

% ratio_e_pet --The ratio that cutinase binds to locus on pet

% number --The number of locuses reacting with cutinase

% dots_e --The coordinates of points able to react with cutinase

n=input('Please input a n-by-n square surface(n=?):');

ratio_e_pet=input('Please set the ratio of locus binded by cutinase on surface:([0,1])');

number=n*ratio_e_pet;

dots_e=rdm(n,number)

function start_h_e

% Start_h_e is to guide customer to generate a random simulation

% We simulate a n-by-n surface with which the ideal hydrophobin particles

% collide and cutinase bind hydrophobin then degrade PET.

% This process simulates the behavior of the adsorption of the hydrophobin

% on the PET surface and cutinase interact with hydrophobin anchoring on

% the PETsurface and degrading it.

% By setting the parameters (the size of surface,the ratio of hydrophobins

% collision, ratio of cutinase effectively anchored on hydrophobin and

% probabability of chemical reaction).We get the number of degradable

% locuses and each locus's coordinates

% Calling sequence:

% start_h_e

% Define variables:

% n --The n-by-n active locus surface

% ratio_h_pet --The ratio of hydrophobin adsorbing on pet

% ratio_h_cut --The ratio of cutinase binding on the hydrophobin when

% collision occurs

% ratio_degrade_pet --The ratio of effective degration reaction happens when

% cutinase has bound on the hydrophobin

% number_h --The number of locuses absorbed by hydrophobin

% number_h_cut --the number of locuses absorbed by hydrophobin bound

% with cutinase

% number_degrade --The number of locuses degraded by cutinase

% dots_e --The coordinates of points able to react with cutinase

% combine_dots_btcut --The coordinate of the locuses adsorbed hydrophobin

% and binded cutinase

% degrade_dots_bycut --The coordinate of the locuses degraded by cutinase

n=input('Please input a n-by-n square surface(n=?):');

ratio_h_pet=input('Please set the ratio of locus binded by hydrophobin on surface:([0,1])');

ratio_h_cut=input('Please set the ratio of combination hydrophobin with cutinase:([0,1])');

ratio_degrade_pet=input('Please set the ratio of locus degraded by cutinase:([0,1])');

number_h=n*ratio_h_pet;

number_h_cut=number_h*ratio_h_cut;

number_degrade=number_h_cut*ratio_degrade_pet;

dots_h=rdm(n,number_h);

combine_dots_bycut=filter_dots(number_h,number_h_cut,dots_h);

degrade_dots_bycut=filter_dots(number_h_cut,number_degrade,combine_dots_bycut);

fprintf('The number of locuses degraded by cutinase is %1d\n',number_degrade);

disp('The coordinates of degraded locuses is as follow');

for i=1:number_degrade

for j=1:2

fprintf('%2d ',degrade_dots_bycut(i,j));

end

fprintf('\n');

end

function start_h_fused_e

% Start_h_e is to guide customer to generate a random simulation.

% We simulate a n-by-n surface to which the ideal hydrophobin particles are

% absorbed.

% This process simulates the behavior of the absorption of the hydrophobins

% fused with cutinase on the PET surface.

% By setting the parameters (the size of surface,the ratio of enzymes

% collision and probabability of chemical reaction)we get the number of

% degradable locuses and each locus's coordinates.

% Calling sequence

% start_h_fused_e

% Define variables:

% n --The n-by-n active locus surface

% ratio_e_pet --The ratio that cutinase binds to locus on pet

% ratio_degrade_pet --The ratio that cutinase degrades the pet

% number_degrade --The number of locuses degraded by cutinase

% number_h --The number of locuses adsorbed by hydrophobins fused with

% cutinase

% dots_h_fused_e --The coordinates of locuses adsorbed by hydrpophobin

% fused with cutinase

% degrade_dots_bycut --The coordinates of locuses degraded by cutinase

%To prompt client to input parameters

n=input('Please input a n-by-n square surface(n=?):'); % Length of side of n*n square

ratio_h_pet=input('Please set the ratio of locus adsorbed by hydrophobin on surface:([0,1])');

ratio_degrade_pet=input('Please set the ratio of locus degraded by cutinase:([0,1])');

number_h=n*ratio_h_pet;

number_degrade=number_h*ratio_degrade_pet;

dots_h_fused_e=rdm(n,number_h);

degrade_dots_bycut=filter_dots(number_h,number_degrade,dots_h_fused_e);

fprintf('The number of locuses degraded by cutinase is %1d\n',number_degrade);

disp('The coordinates of degraded locuses is as follow');

for i=1:number_degrade

for j=1:2

fprintf('%2d ',degrade_dots_bycut(i,j));

end

fprintf('\n');

end

References

[1] McLaren, A. D. Enzyme reactions in structurally restricted systems. IV. The digestion of insoluble substrates by hydrolytic enzymes. Enzimologia 26, 237.(1963)

[2] A ˚ sa M. Ronkvist, Wenchun Xie, Wenhua Lu, and Richard A. Gross Cutinase-Catalyzed Hydrolysis of Poly(ethylene terephthalate). Macromolecules 42, 51285138 (2009).

[3] Acero, E. et al. Surface engineering of a cutinase from Thermobifida cellulosilytica for improved polyester hydrolysis. Biotechnol. Bioeng. 110, 2581–2590 (2013).

[4] Sulaiman, S. et al. Isolation of a novel cutinase homolog with polyethylene terephthalate-degrading activity from leaf-branch compost by using a metagenomic approach. Appl. Environ. Microbiol. 78, 1556–62 (2011).

[5] Toru Takahashi. et al. Ionic interaction of positive amino acid residues of fungal hydrophobin RolA with acidic amino acid residues of cutinase CutL1. Molecular Microbiology. 96(1), 14–27(2015)

[6] Figeroa, Y., Hinks, D. & Montero, G. A Heterogeneous Kinetic Model for the Cutinase-Catalyzed Hydrolysis of Cyclo-tris-ethylene Terephthalate. Biotechnol. Prog. 22, 1209–1214 (2006)

[7] Wang, Z., Lienemann, M., Qiau, M. & Linder, M. Mechanisms of protein adhesion on surface films of hydrophobin. Langmuir 26, 8491–6 (2010).

[8] Mariastella Scandola, Maria Letizia Focarete, and Giovanna FrisoniSimple. Kinetic Model for the Heterogeneous Enzymatic Hydrolysis of Natural Poly(3-hydroxybutyrate). Macromolecules31, 3846-3851(1998)

Protein Extraction

Introduction

The purpose of this part of modelling is trying to found out optimize process conditions for the effective partitioning of our fusion protein and a way to proof whether or not the novel two-step extraction method is applicable to any other kinds of fusion proteins constructed by other target protein of interest and hydrophobin. Moreover, we’d like to put a mighty-possible way to find the feasible experimental condition for any other target protein.

Part 1. Optimize process conditions modelling

Given the fact that our independent variables might to be concerned (Triton molecular weights and concentrations, temperature, phase volume ratios R, pH, fusion protein’s surface hydrophobicity, charge, structure, and molecular weight) do have a lot interaction between each other, we can see it that a high level of non-linearity exists in our system, complicating the study of the behavior in the aqueous two-phase extraction. What’s more, the common method of experimental design and optimization where operating conditions are optimized by changing one variable at a time, gives no guarantee for the optimal parameter determination. In order to simplify our problem, we decided to use the response surface methodology (RSM) in our model, which is based on the statistical theory. In this case, our model is less theoretical, nonetheless effective in our situation.

Following we present you our optimization model.

1. Response surface methodology (RSM)

RSM consists of a group of mathematical and statistical techniques for seeking the optimum conditions in a system which had multi-variable and complex interactions existing between the parameters. It can be used to define the relationships between the response variables (in our project are: recovery percentage and the purity of protein) and the independent variables (in our project are: the molecular weight、the concentration and the Grand average of hydropathicity of fusion protein (GRAVY) and phase volume ratios R) and also generates a mathematical model that describes the process.

The formula below is the basic equation used in RSM, in which x_ix_j represents the interaction between two parameters, β is the equation coefficients. As for why we choose x_i, x_i², x_ix_j to regress? Here is the reason: take Taylor's formula as an instance, when the second-order can meet the accuracy requirements, than we shall set the polynomial as a second-order one, cause third-order is not necessary. The same reason, the following second-order model can meet our requirements so we do not set it as a third-order one.

2. Experiment data

To find out the statistical results of the model, we designed several experiments to gain our statistics by using the central composite rotatable design, considering four independent variables: the molecular weight、the concentration and the Grand average of hydropathicity of fusion protein and phase volume ratios R. The range of variables was chosen according to pre-experiment, during which we studied respective individual parameter. After the experiments were carried out, a regression analysis was carried out to extract equation coefficient in the model by using the software SPSS (Statistical Product and Service Solutions).

3. Supplementary Analysis on the Model

Principal Component Analysis (PCA)

In order to simplify our model, PCA is used to reduce the variables. Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. After the transformation is done, we minimize the variables and carry on the work.

4. Data Analysis and Model Validation

Due to the fact that by the time we submit the wiki, our experiments are still undergoing, so we can only process the few data we already got. Thanks to the useful software SPSS, which is a tiny but powerful tool speaking of statistical problems. We solved the basic statistical analysis by SPSS, here’s the results.

Another thing to mention is that we define a response variable as purity ratio, which equals the purity of target protein after extraction divide by purity of target protein before extraction.

protein	GRAVY(x₁)	Phase ratio(x₂)	Purity before	Purity after	Purity ratio(y)
GFP	-0.338	0.02	59.8	88.7	1.483277592
GFP	-0.338	0.05	94.5	100	1.058201058
GFP	-0.338	0.05	86.7	90.1	1.039215686
GFP	-0.338	0.05	26.4	74.6	2.825757576
BFP	-0.303	0.20	27.5	69	2.509090909
RFP	-0.563	0.02	19.5	78.6	4.030769231
RFP	-0.563	0.05	19.5	60.9	3.123076923

The experiment data

And , and has the confident of 73.0%

And here are some figures:

These three figures illustrate the purify ratio from different point of view.

5. Sensitivity Analysis of the Model

To examine the influence on the response variables, we write the sensitivity analysis’s program, however, we do not get enough data so we did not do this part right now.

6. Conclusions

We can see that when we do the regression, the hydropathicity (x₁) has been neglect, we may say that the target protein may be of any kind because the hydropathicity is not taken into account (Notice that here I only use the word “may”).

7. Strengths and Weaknesses

Strengths

Get rid of the complexity of the cases and draw a convincing conclusion despite of the complicate situation.

The result is applied to all at a wide confidence interval and easy to get once you have limited statistics.

Above all, the model is easy to establish and be used during research.

Weaknesses

The results can only be reached after you have some basic experiments (statistics), and cannot be made if you have nothing in hand. In other word, the model can only be used to forecast by practically rather than theoretically.

The burden of the computer increases as the data increases rapidly, in this case the model may not be suitable for occasions where there are too many variables (by saying too many, we mean more than 10 or so).

8. Where to improve

Take our project into consideration, it is still a problem that we do not have enough time to conduct so many experiments, there’s two way to solve the problem: the first is that although it is the best to conduct 31 experiments designed above, but consider there is only 15 coefficients for us to determined which gives the equations freedom, it is ok if we reduce the experiments down to 20 or so. However if you still think that is too much, then we can only solve the problem through the second method, that is we should find another way to modelling, maybe a more theoretical way that we do not have to depend the data too much.

Part 2. Feasible experimental condition modelling

Raise questions

In the process of modeling, we use the regression method to find the best experimental conditions. In the gradient experiment, the experimental conditions of the three kinds of proteins were obtained, which were 60, 70 and 80, take the best effect as 100 for example. Then we used the experimental data to do RSM analysis. The experimental conditions were independent variables, the extraction effect was variable. After the analysis the regression equation was obtained, and by using the value of derivative is zero we can get the best experimental conditions when the maximum effect value is 100 points.

And then, using the experimental data we have, what can we do if we want to use from three kinds of protein, with its separate properties (such as molecular weight and isoelectric point, hydrophilic and hydrophobic) as decision variables, to predict the acceptable conditions of any other proteins (regard the effect greater than 70 points as acceptable), whose properties were known, and how to generalize?

Analysis problem

The difficulty of the problem lies in, because the mechanism of extraction system is unclear, so the establishment of the mechanism model is difficult to achieve; and return to the regression which can get accurate results without knowing the mechanism well, it has always been a large sum of data to do regression analysis to predict, never have I heard small number of data can do regression to predict well. First of all, with little data may get no results (thinking that when the number of coefficients is a lot, then the data need more too to determine the coefficients’ value), not to mention the accuracy, which makes regression become meaningless. If you really have to make predictions using less data, you can consider interpolation method.

Interpolation and regression

Both interpolation and regression are widely used for prediction and forecasting, however, there are some differences between them. Normally there is a regression function defined in terms of a finite number of unknown parameters that are estimated from the data on the basis of minimize the sum of residuals, so the regression function do not have to fit the data exactly. The effect of regression often depends to some extent on making assumptions about the process, what’s more, these assumptions are required to have a sufficient quantity of data to observe, otherwise regression can easily give misleading results. Interpolation is more used for a small number of numerical processing, through the known data points, to construct new data points or a resulting function smoothly fit within the range of a discrete set of known data points to perform the function of prediction and forecasting. In the process of interpolation, we do not need to know what the function is like, complicated or simple, whatever, we only need to do some calculation. Errors exist as well as the regression analysis, however, considering the situation where we do not have much data to use, and the resulting function is fitted by the data, we may say that interpolation is a better way compared to regression. And the same thing between interpolation and regression is that, because the dimension of our prediction function depends on the number of data points, so with small sum of data points, the dimension of our prediction function is not high, the accuracy may not be too high.

The problem of few data has temporarily been solved by choosing the interpolation instead of the regression analysis, and the selection of the interpolation also lead to new problems. First of all, we deal with the first-order interpolation more often, but considering the characteristics of the protein data, there are three variables, and therefore we have to use multidimensional interpolation; second, there are more than one interpolation method, which is the best choice? The answer to this question is: we don’t know either. There is no conclusion or rule for interpolation method we use, we can only say that, try as much method as possible, then you may find the best answer.

Problem solving

As we discussed before, the selection of interpolation method is varied, so we start with the most simple one, multivariate polynomial interpolation in the Lagrange type interpolation (the most simple one should be multivariate linear interpolation, but taking the accuracy into account we start from polynomial interpolation rather than linear one). Choose the selected protein’s molecular weight, isoelectric point and hydrophilic as three independent variables, and experimental conditions as dependent variables for multiple times interpolation. Method is as follows:

Three-dimensional space Lagrange interpolation formula

Suppose a rectangular ,

Given interval [a, b] of a division ;

Given interval [c, d] of a division ;

Given interval [s, t] of a division ;

Then constitute a division point set of T.

Set the function ,

Where equals the value at each point of the division point set, and is a polynomial of n-ordered x, m-ordered y, and l-ordered z.

Let be , then it is easy to know that:,

And is the Lagrange interpolation polynomial for the division point set.

And the Lagrange interpolation basis function should meet the requirements below:

,

Let be, based on the Lagrange interpolation basis function of simple function we can know:

So, the interpolation polynomial of on the division point set ∆ is:

In our project, we just have to put the experimental data into the formula, and the last work is to calculate.

Appendix

Here is the program we used in modelling.

No.1

clc

clear

load pz.txt

%pz.txt is the text where you store your data,put the independent variables

%on the left rows,and the responce variables on the right rows.

n=4;m=2;

%n is the number of the independent variables

%m is the number of the responce variables

x0=pz(:,1:n);

y0=pz(:,n+1:end);

x=x0(:,3);

y=x0(:,4);

z=y0(:,1);

[X,Y]=meshgrid(min(x):0.01:max(x),min(y):0.001:max(y));

Z=griddata(x,y,z,X,Y,'v4');

figure(1)

[c,h]=contour3(X,Y,Z);hold on

colorbar

hold off

meshc(X,Y,Z);

figure(2)

pcolor(X,Y,Z);shading interp

colorbar

figure(3)

[c,h]=contour(X,Y,Z);

set(h,'ShowText','on');

colorbar

No.2

function [ output_args ] = RSM( A0,A1,A2,A3,A4,A11,A22,A33,A44,...

A12,A13,A14,A23,A24,A34 )

x1=32;

x2=16;

[x3,x4]=meshgrid(4:0.1:8,0.05:0.01:0.45);

f=A0+A1*x1+A2*x2+A3*x3+A4*x4+A11*x1*x1+A22*x2*x2+A33*x3.*x3+A44*x4.*x4+...

A12*x1*x2+A13*x1*x3+A14*x1*x4+A23*x2*x3+A24*x2*x4+A34*x3.*x4;

Z=f;

figure(4)

surf(x3,x4,Z)

figure(5)

[c,h]=contour(x3,x4,Z);

set(h,'ShowText','on');

colorbar

end

No.3

function [ ] = maxRSM( A0,A1,A2,A3,A4,A11,A22,A33,A44,...

A12,A13,A14,A23,A24,A34 )

syms x1 x2 x3 x4 y;

f=A0+A1*x1+A2*x2+A3*x3+A4*x4+A11*x1.*x1+A22*x2.*x2+A33*x3.*x3+A44*x4.*x4+...

A12*x1.*x2+A13*x1.*x3+A14*x1.*x4+A23*x2.*x3+A24*x2.*x4+A34*x3.*x4;

fx=y-(A0+A1*x1+A2*x2+A3*x3+A4*x4+A11*x1.*x1+A22*x2.*x2+A33*x3.*x3+A44*x4.*x4+... A12*x1.*x2+A13*x1.*x3+A14*x1.*x4+A23*x2.*x3+A24*x2.*x4+A34*x3.*x4);

eq1=diff(f,'x1');

eq2=diff(f,'x2');

eq3=diff(f,'x3');

eq4=diff(f,'x4');

[X1 X2 X3 X4]=solve(eq1,eq2,eq3,eq4);

X1=vpa(X1,5)

X2=vpa(X2,5)

X3=vpa(X3,5)

X4=vpa(X4,4)

dx1=diff(solve(fx,x1),'y');

dx2=diff(solve(fx,x2),'y');

dx3=diff(solve(fx,x3),'y');

dx4=diff(solve(fx,x4),'y');

maxf=subs(f,{x1,x2,x3,x4},{X1,X2,X3,X4})

r1=vpa(maxf./X1./subs(dx1,{x1,x2,x3,x4,y},{X1,X2,X3,X4,maxf}),5)

r2=vpa(maxf./X2./subs(dx2,{x1,x2,x3,x4,y},{X1,X2,X3,X4,maxf}),5)

r3=vpa(maxf./X3./subs(dx3,{x1,x2,x3,x4,y},{X1,X2,X3,X4,maxf}),5)

r4=vpa(maxf./X4./subs(dx4,{x1,x2,x3,x4,y},{X1,X2,X3,X4,maxf}),5)