Team:UFSCar-Brasil/part2.html

Protein Solubilization toolkit

Where are the singular effects?

Simulation and analysis

In this modeling step, we study how efficiently different chaperone arrangements assist protein folding, using our protein of interest, limonene synthase, as an example. Because we could not assemble the complete system of this toolkit, no experimental data are available to support our findings. Nevertheless, the modeling in this section is relevant and could be useful to other teams facing similar problems. We therefore built the model on a simulated dataset, which real data can simply replace in the future.

First, the experiment to measure the efficiency of each chaperone and of each arrangement was designed as follows:

All chaperones (IbpA+B, ClpB and DnaK) would be tested alone, in pairs and all together. The chaperones would be encoded in plasmid pSB1C3 under the control of a constitutive promoter, together with our gene of interest under the same promoter. The protein of interest would be produced for a few hours, and the cells would then be harvested and lysed by sonication. The cell debris would be pelleted by centrifugation, and the soluble and insoluble fractions would be recovered. The band of interest would then be quantified by optical densitometry on an SDS-PAGE gel. The chaperone efficiency coefficient would be given as the ratio of the soluble band to the insoluble band.

Simulated data following this procedure are given in the table below. A control group without any chaperones was included for comparison, and all measurements were made in triplicate. The table shows the fractions of well-folded (soluble) and misfolded (insoluble) protein.

Table 1: Simulated data using different chaperone sets and the solubility after the treatment.

| Chaperone arrangement | Optical density of insoluble band | Optical density of soluble band |
|---|---|---|
| control group without chaperones | 800 | 0 |
| control group without chaperones | 865 | 0 |
| control group without chaperones | 855 | 0 |
| Ibp | 600 | 124 |
| Ibp | 658 | 165 |
| Ibp | 659 | 188 |
| ClpB | 649 | 133 |
| ClpB | 655 | 132 |
| ClpB | 689 | 160 |
| dnaK | 699 | 144 |
| dnaK | 699 | 150 |
| dnaK | 689 | 161 |
| Ibp_ClpB | 400 | 250 |
| Ibp_ClpB | 458 | 266 |
| Ibp_ClpB | 423 | 267 |
| Ibp_dnaK | 465 | 200 |
| Ibp_dnaK | 465 | 220 |
| Ibp_dnaK | 462 | 280 |
| dnaK_clpB | 765 | 180 |
| dnaK_clpB | 733 | 199 |
| dnaK_clpB | 756 | 187 |
| Ibp_Clpb_dnak | 300 | 300 |
| Ibp_Clpb_dnak | 356 | 310 |
| Ibp_Clpb_dnak | 346 | 330 |

The chaperone efficiency coefficient is the ratio of correctly folded to incorrectly folded protein. At 100% solubility this coefficient tends to infinity, because the insoluble fraction tends to zero, which makes it impractical to work with. To avoid this, we decided to work with protein yield instead. The protein yield coefficient is expressed as a percentage (0-100) and calculated as follows:

$$Y = 100*[Soluble/(Soluble+Insoluble)] \tag{1}$$
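As a short worked step, the yield can be rewritten in terms of the efficiency coefficient. The symbols \(S\), \(I\) and \(E\) are introduced here for illustration only (soluble band, insoluble band and their ratio) and assume \(I > 0\):

$$Y = 100\,\frac{S}{S+I} = 100\,\frac{E}{1+E}, \qquad E = \frac{S}{I}$$

Written this way, \(Y\) stays bounded: it approaches 100 as \(E\) grows without limit, and \(E = 1\) corresponds to \(Y = 50\).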

Table 2: Chaperone efficiency and protein yield coefficients for each chaperone arrangement.

| Chaperone arrangement | Soluble/Insoluble | Yield (%) |
|---|---|---|
| control group without chaperones | 0 | 0 |
| control group without chaperones | 0 | 0 |
| control group without chaperones | 0 | 0 |
| Ibp | 0.20666667 | 17.12707 |
| Ibp | 0.25075988 | 20.0486 |
| Ibp | 0.28528073 | 22.19599 |
| ClpB | 0.20493066 | 17.00767 |
| ClpB | 0.20152672 | 16.77255 |
| ClpB | 0.23222061 | 18.8457 |
| dnaK | 0.20600858 | 17.08185 |
| dnaK | 0.21459227 | 17.66784 |
| dnaK | 0.23367199 | 18.94118 |
| Ibp_ClpB | 0.625 | 38.46154 |
| Ibp_ClpB | 0.58078603 | 36.74033 |
| Ibp_ClpB | 0.63120567 | 38.69565 |
| Ibp_dnaK | 0.43010753 | 30.07519 |
| Ibp_dnaK | 0.47311828 | 32.11679 |
| Ibp_dnaK | 0.60606061 | 37.73585 |
| dnaK_clpB | 0.23529412 | 19.04762 |
| dnaK_clpB | 0.27148704 | 21.35193 |
| dnaK_clpB | 0.2473545 | 19.83033 |
| Ibp_Clpb_dnak | 1 | 50 |
| Ibp_Clpb_dnak | 0.87078652 | 46.54655 |
| Ibp_Clpb_dnak | 0.95375723 | 48.81657 |
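The two coefficients in Table 2 can be recomputed from the optical densities in Table 1. The following is a minimal sketch, not the team's original script: the function and variable names (`efficiency`, `protein_yield`, `table1`, `results`) are hypothetical, and the shortened key `control` stands for the control group without chaperones.

```python
# Optical densities from Table 1 as (insoluble, soluble), one tuple per replicate.
table1 = {
    "control": [(800, 0), (865, 0), (855, 0)],
    "Ibp": [(600, 124), (658, 165), (659, 188)],
    "ClpB": [(649, 133), (655, 132), (689, 160)],
    "dnaK": [(699, 144), (699, 150), (689, 161)],
    "Ibp_ClpB": [(400, 250), (458, 266), (423, 267)],
    "Ibp_dnaK": [(465, 200), (465, 220), (462, 280)],
    "dnaK_clpB": [(765, 180), (733, 199), (756, 187)],
    "Ibp_Clpb_dnak": [(300, 300), (356, 310), (346, 330)],
}


def efficiency(insoluble, soluble):
    """Soluble/insoluble ratio; infinite when nothing is insoluble."""
    return float("inf") if insoluble == 0 else soluble / insoluble


def protein_yield(insoluble, soluble):
    """Equation (1): percentage of the band found in the soluble fraction."""
    return 100 * soluble / (soluble + insoluble)


# One (efficiency, yield) pair per replicate, grouped by arrangement.
results = {
    name: [(efficiency(i, s), protein_yield(i, s)) for i, s in replicates]
    for name, replicates in table1.items()
}

for name, rows in results.items():
    mean_yield = sum(y for _, y in rows) / len(rows)
    print(f"{name:15s} mean yield = {mean_yield:6.2f} %")
```

Replacing the entries of `table1` with measured densitometry values would update both coefficients without any other change.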

To see how this large set of combinations relate to each other in terms of yield, we used a hierarchical clustering analysis available on a free web server, DendroUPGMA. Its calculations were used to create the dendrogram shown in Figure 1. In essence, this analysis groups data that are mathematically close and separates data that are further apart. As we can see, the values obtained with all chaperones acting together, here named "all", are the most isolated from the other combinations: being the most efficient treatment, this group is the most 'distant' from the others. Similarly, the pairs "Ibp+ClpB" and "Ibp+DnaK" are grouped in the same node, so we conclude that their values are too close to be separated.

Figure 1: Dendrogram of distances for the simulated dataset, showing a complex clustering behaviour.

The cophenetic correlation coefficient, which indicates how faithfully the dendrogram represents the original distances, was estimated as 0.84134; values higher than 0.7 are considered a good fit. The combination of "all" chaperones lies farthest from the others, as expected, since its yield reached the highest value. The arrangements "Ibp+DnaK" and "Ibp+ClpB" sit together on one branch, indicating that their yield values are closer to each other than to the rest. This branch is also near the branch of the full combination, indicating that their yield is not as high as that of all chaperones together, but is reasonably close. This suggests that "Ibp+DnaK" and "Ibp+ClpB" could be good substitutes for the three-chaperone arrangement when the complete arrangement is not possible.
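The clustering idea can be illustrated with a small average-linkage (UPGMA) sketch in plain Python. It rests on assumptions that are not stated in the text: each arrangement is represented by the mean of its three yields from Table 2, and the dissimilarity between two arrangements is the absolute difference of those means. The settings used on the web server are not known, so the output is not expected to reproduce Figure 1 or the reported coefficient exactly. All function names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Yield (%) triplicates copied from Table 2.
yields = {
    "control": [0.0, 0.0, 0.0],
    "Ibp": [17.12707, 20.0486, 22.19599],
    "ClpB": [17.00767, 16.77255, 18.8457],
    "dnaK": [17.08185, 17.66784, 18.94118],
    "Ibp_ClpB": [38.46154, 36.74033, 38.69565],
    "Ibp_dnaK": [30.07519, 32.11679, 37.73585],
    "dnaK_clpB": [19.04762, 21.35193, 19.83033],
    "Ibp_Clpb_dnak": [50.0, 46.54655, 48.81657],
}
centers = {name: mean(values) for name, values in yields.items()}


def distance(a, b):
    # ASSUMPTION: absolute difference of mean yields as the dissimilarity.
    return abs(centers[a] - centers[b])


def upgma(labels):
    """Average-linkage clustering.

    Returns the list of merges and, for every pair of labels, the height
    at which the two labels first share a cluster (cophenetic distance).
    """
    active = [(label,) for label in labels]
    merges, cophenetic = [], {}

    def linkage(c1, c2):
        # UPGMA: mean of all pairwise distances between cluster members.
        return mean(distance(a, b) for a in c1 for b in c2)

    while len(active) > 1:
        c1, c2 = min(combinations(active, 2), key=lambda pair: linkage(*pair))
        height = linkage(c1, c2)
        for a in c1:
            for b in c2:
                cophenetic[frozenset((a, b))] = height
        merges.append((c1, c2, height))
        active = [c for c in active if c not in (c1, c2)] + [c1 + c2]
    return merges, cophenetic


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den


merges, cophenetic = upgma(list(centers))
pairs = list(combinations(centers, 2))
# Cophenetic correlation: original distances versus dendrogram distances.
ccc = pearson(
    [distance(a, b) for a, b in pairs],
    [cophenetic[frozenset(pair)] for pair in pairs],
)

for c1, c2, height in merges:
    print(f"{height:7.3f}  {' + '.join(c1)}  |  {' + '.join(c2)}")
print(f"cophenetic correlation = {ccc:.3f}")
```

Each printed line is one merge, listed from the closest pair to the final join, so the order of the lines mirrors the dendrogram read from the leaves to the root.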

As can be observed, the protein yield obtained with chaperone arrangements is not trivial: cross-interactions make the joint behaviour hard to predict from the single-chaperone values. To capture the interference between chaperone systems, we wrote equations relating the yields of all possible combinations and multiplied each one by a mathematical factor \(x_i\). The goal is to interpret the values of these constants in order to compare the influence that each chaperone component, taken in isolation, has on the set of combinations.

Each index \(x_i\) represents one arrangement of the studied chaperones, as follows: \(x_1 \rightarrow Ibp\); \(x_2\rightarrow ClpB\); \(x_3\rightarrow DnaK\); \(x_4\rightarrow Ibp+ClpB\); \(x_5\rightarrow Ibp+DnaK\); \(x_6 \rightarrow ClpB+DnaK\); \(x_7 \rightarrow Ibp+DnaK+ClpB\).

The results obtained for this system were:

$$x_1=(471/124)x_7-(251/248)x_6 \tag{2}$$
$$x_2= -(314/71)x_7 +(502/213)x_6 \tag{3}$$
$$x_3=(471/109)x_7 - (251/218)x_6 \tag{4}$$
$$x_4=(251/612)x_6 \tag{5}$$
$$x_5=(1884/503)x_7-(502/503)x_6 \tag{6}$$
$$x_6=x_6 \tag{7}$$
$$x_7=x_7 \tag{8}$$

In this way, all parameters can be described in terms of just two values, \(x_6\) and \(x_7\). Every arrangement is related to these two combinations (all chaperones together and "DnaK+ClpB"). Fixing one of them makes it possible to study the behaviour of the others graphically (Figures 2 and 3).

Figure 2: Solutions for \(x_i\) as a function of \(x_6\), with \(x_7\) fixed at 1.
Figure 3: Solutions for \(x_i\) as a function of \(x_7\), with \(x_6\) fixed at 1.
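The curves behind these two figures can be recomputed from Equations (2) to (6). The sketch below evaluates them with exact fractions for chosen values of the two free parameters. The helper names `solve` and `slopes` are hypothetical, and the sweep range is an arbitrary choice for illustration.

```python
from fractions import Fraction as F


def solve(x6, x7):
    """Evaluate Equations (2)-(8) for chosen values of the free parameters."""
    x6, x7 = F(x6), F(x7)
    return {
        "x1": F(471, 124) * x7 - F(251, 248) * x6,
        "x2": -F(314, 71) * x7 + F(502, 213) * x6,
        "x3": F(471, 109) * x7 - F(251, 218) * x6,
        "x4": F(251, 612) * x6,
        "x5": F(1884, 503) * x7 - F(502, 503) * x6,
        "x6": x6,
        "x7": x7,
    }


def slopes(vary):
    """Slope of every x_i with respect to one free parameter.

    The solution is linear in x6 and x7, so a unit step from the origin
    gives the slope directly.
    """
    base = solve(0, 0)
    step = solve(1, 0) if vary == "x6" else solve(0, 1)
    return {name: step[name] - base[name] for name in step}


# Sweep x6 with x7 fixed at 1, as in Figure 2 (range chosen arbitrarily).
for x6 in range(0, 5):
    row = solve(x6, 1)
    print(x6, {name: round(float(value), 3) for name, value in row.items()})

print("slopes in x6:", {k: round(float(v), 3) for k, v in slopes("x6").items()})
print("slopes in x7:", {k: round(float(v), 3) for k, v in slopes("x7").items()})
```

Calling `solve(1, x7)` over a range of `x7` values gives the corresponding data for Figure 3, and the two slope dictionaries make the sign and steepness of each line explicit.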

We can see that there is no single solution to our problem, and there would probably be none even with real values. Nevertheless, the graphs reveal some dependency relationships between the coefficients. For example, with \(x_7\) fixed, \(x_2\) increases with \(x_6\) with a steeper slope than that of \(x_1\), which decreases as \(x_6\) increases. This indicates that the influence of ClpB on the chaperone combinations is greater than that of Ibp in these arrangements. This trend is reversed when \(x_6\) is fixed and \(x_7\) increases. As we could not obtain experimental data to validate our discussion and predict outcomes, we leave this modeling as a theoretical model that can be deployed and tested in future projects by our team or by any other iGEM team that wishes to use it.

References

S. Garcia-Vallve, J. Palau and A. Romeu (1999) Horizontal gene transfer in glycosyl hydrolases inferred from codon usage in Escherichia coli and Bacillus subtilis. Molecular Biology and Evolution 16(9):1125-1134.
