Team:Queens Canada/Modeling

MODELING: INTRODUCTION

How many times have you put your heart and soul into something, only to find out it hasn't worked, with hours of laborious effort gone for nothing? Us too. This year, we set out to avoid this very dilemma, or at least to minimize its effects on our project.

Modeling was used to understand what to expect from the wet lab work. The principle behind the design process was to troubleshoot and optimize the engineered components through simulation, identifying mistakes in the theoretical space before spending time and resources in the lab.

This preliminary design can be divided into two components: modeling a circular AFP and scaffold design.

MODELING A CIRCULAR AFP

Icefinity

Figure 1. Crystal structure of the Type III AFP from ocean pout. This protein is represented by the PDB file 1AME. The distance between the two termini was measured using PyMOL [1].

After researching different antifreeze proteins, we decided to work with the Type III AFP from the ocean pout. Its relatively high antifreeze activity makes it a good candidate for industrial applications. Furthermore, its termini are only 19.8 angstroms apart, making it easier to circularize with a short linker (Figure 1).
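The termini separation can also be checked outside PyMOL. The sketch below is a minimal, standard-library illustration and not the team's procedure: it assumes the distance of interest is between the backbone N of the first residue and the backbone C of the last residue, and the function names `parse_atoms` and `termini_distance` are hypothetical.

```python
import math

def parse_atoms(pdb_text):
    """Return (atom_name, residue_number, (x, y, z)) for each ATOM record.

    Uses the fixed column layout of the PDB format.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            name = line[12:16].strip()
            resnum = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((name, resnum, xyz))
    return atoms

def termini_distance(pdb_text):
    """Distance in angstroms between the first residue's N and the last residue's C.

    Assumption: a single chain with increasing residue numbers.
    """
    atoms = parse_atoms(pdb_text)
    first = min(res for _, res, _ in atoms)
    last = max(res for _, res, _ in atoms)
    n_term = next(xyz for name, res, xyz in atoms if res == first and name == "N")
    c_term = next(xyz for name, res, xyz in atoms if res == last and name == "C")
    return math.dist(n_term, c_term)
```

Passing the text of a downloaded structure file (for example, a local copy of 1AME) to `termini_distance` gives a value to compare against the PyMOL measurement. The result depends on which atoms are chosen as the termini.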

The modeling process for the circularization of an antifreeze protein can be explained by a three-step approach to protein design (seen below).

Linker Design & Spatial Fitting

Approach 1: This project was intended as a continuation of the work of team Heidelberg 2014 and a validation of their parts. Using their intein BioBrick, we wanted to circularize our own protein, a Type III AFP. We ran our protein through their CRAUT software. After fixing syntax errors, the following linkers were suggested:

    • Sequence 1: GGAEAAAKAARGKCWE
    • Sequence 2: GGXXX RGKCWE

    The sequences in red represent the extein sequences (scars that will be left after the intein reaction), and the sequences in purple represent a secondary structure (alpha-helix). Preliminary spatial evaluation using PyMOL suggested these were not ideal for linking the termini of our protein of interest. For the purpose of progression, we adapted the second sequence and opted for running simulations on our own linker designs.

    Approach 2: Using the extein sequence tested by Heidelberg 2014, linkers were built within PyMOL and Coot to join the N- and C-termini. Flexible linkers were chosen so that the termini can arrange themselves while the protein remains in its functional conformation.

    • Linker 1: RGKCWEAA
    • Linker 2: RGKCWEGAA
    • Linker 3: RGKCWEGGAA
    Modeling Methods & Stability Analysis

    The linkers selected in Approach 2 were then run through molecular dynamics (MD) simulations using the GROMACS package. Following energy minimization, the stability of the protein under physiological conditions was determined. Protein files were edited to circularize the backbone and solvated in water at 298 K to mimic functional conditions (Figure 2).

    Figure 2: Circularized antifreeze protein structure after MD simulations. These images represent the final configuration after energy minimization and dynamics simulation: A shows only the protein backbone and B shows the amino acid side chains. The upper loop shows the AFP's termini linked by the extein sequence and GAA linker (Linker 2). The linked termini are found opposite the ice-binding surface of the AFP, which is at the lower region of the images.


    Statistical analysis was used to compare the stability of the protein circularized with each of the 3 linkers from Approach 2. Root mean square (RMS) deviation and fluctuation plots are shown in Figures 3 and 4 below.


    Figure 3: RMS deviation plot of the protein core. RMSD data was gathered over time for residues 3-63 of the antifreeze protein. The stability of each AFP+linker construct can be analyzed and compared relative to the stability of the wild-type AFP.
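For readers unfamiliar with the quantity plotted in Figure 3, the sketch below shows how an RMSD time series can be computed. It is an illustration rather than the GROMACS implementation: it assumes each frame has already been superimposed on the reference structure, and the function names are hypothetical.

```python
import math

def rmsd(frame, reference):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the frame is already superimposed on the reference.
    """
    if len(frame) != len(reference):
        raise ValueError("coordinate sets must have the same number of atoms")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(total / len(frame))

def rmsd_series(trajectory, reference):
    """RMSD of every frame in a trajectory against one reference structure."""
    return [rmsd(frame, reference) for frame in trajectory]
```

Restricting the coordinate lists to the atoms of residues 3-63 would correspond to the core selection described in the caption.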

    Figure 4: RMS fluctuation of the AFP. Fluctuations of individual atoms were compared between the wild type and each linker tested to determine their stability. Of greatest interest was the fluctuation of atoms at the ice-binding sites of the protein (identified in blue) [2, 3].
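The per-atom fluctuation in Figure 4 can be illustrated in the same spirit. This is a minimal sketch, assuming a trajectory of pre-aligned frames held in memory as lists of coordinates. The function name `rmsf` is hypothetical and the code is not the GROMACS tool itself.

```python
import math

def rmsf(trajectory):
    """Per-atom root mean square fluctuation about each atom's mean position.

    trajectory: list of frames, each a list of (x, y, z) tuples with the
    same atom ordering in every frame.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i
        mean_pos = tuple(
            sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)
        )
        # Mean squared displacement of atom i from that average
        msf = sum(math.dist(frame[i], mean_pos) ** 2 for frame in trajectory) / n_frames
        result.append(math.sqrt(msf))
    return result
```

Comparing the values for the ice-binding residues between the wild type and each circularized construct gives the kind of comparison summarized in Table 1.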

    Conclusions

    Table 1: Summarized results of the circularized AFP simulations.
    Linker | RMSD Core Comparison to Wild Type | Individual Atom Fluctuation Comparison | Conclusion
    RGKCWEAA | Decreased stability | Less stable at non-ice-binding sites | Too much strain on protein
    RGKCWEGAA | Similar stability | Increased stability at ice-binding sites | Linker most stable of those tested
    RGKCWEGGAA | Slightly less stable | Slightly less stable at ice-binding sites | Linker too long


    Based on the molecular simulations run, Linker 2 theoretically creates the most stable configuration of our circularized AFP construct. These results serve as the basis for the design of the sequences required for experimental testing. This is not to say that this linker is the optimal choice for circularization; it is simply the best of the options modeled.


    AFP-SCAFFOLD DESIGN

    The Ice Queen

    Figure 5. Theoretical model of T3-10 scaffold. This 12-mer self-assembling scaffold is represented by the PDB file 4EGG.

    Scaffold

    A self-assembling protein scaffold provides a platform onto which different subunits can be attached. Baker and Yeates have successfully created a number of such scaffolds with verified crystal structures [4].

    After examination of the different constructs available, we have chosen to use the T3-10 model (Figure 5). This single subunit 12-mer has a less complicated assembly procedure as compared to a 2-unit scaffold such as T33-21. The position of the C-terminus points away from the assembled structure enabling the attachment of an engineered part. Expression and assembly requirements are easily met at room temperature from a single vector. With 12 identical subunits, the scaffold could potentially house 12 antifreeze proteins.






    Antifreeze Protein

    The Type III antifreeze protein from ocean pout, the same protein used in the circularization project, was selected for the scaffold system as well (see Figure 1). This is a well-characterized protein whose sequence was easily accessible to QGEM this year.

    The E/K-coil system

    We sought a strong, highly efficient means of connecting AFPs to the self-assembling scaffold. After considering our options, we opted for the E/K coil method used by Calgary iGEM 2013. These non-covalent interactions are highly specific and should enable selective binding between AFPs and the scaffold subunits. Read more about coiled-coils on our Background page.

    The coil sequences to be used:

    • K3 (K-coil): KIAALKEKIAALKEKIAALKE
    • E3 (E-coil): EIAALEKEIAALEKEIAALEK

    Upon examination of their work and the sequences used, we noted a discrepancy in their data. According to the PDB file and the NMR structure solved in 2004 [5], the coils interact in a parallel fashion, but the registry pages for the K coil and E coil incorrectly identified the interaction as antiparallel. We have updated these entries to indicate the correct interaction according to the PDB file 1U0I (see the experience pages for both the E coil and K coil to read about our changes).

    Figure 6. Theoretical AFP-Scaffold Complex. Parallel coiled coil interactions selectively attach the AFP to the 12-mer scaffold. The blue ends of the coils represent the N-terminus, and the red the C-terminus.

    To generate a model of our complex, the E-coil was fused to the Type III AFP and the K-coil was fused to the scaffold subunits. The expected interaction is a parallel alignment, as depicted in Figure 6. A flexible region is introduced between the C-terminus of the scaffold subunit and the coil to give the coil enough freedom of movement to interact without steric hindrance from the protein units.

    Coiled-Coil Stability Simulations

    Before using these coils, we set out to test their stability and mode of interaction in their native form. The coils selected for our project are represented by the PDB file 1U0I, an NMR solution structure. These coils are described as interacting in a parallel fashion (Figure 7a). To simulate this interaction, the individual coils were solvated and run through MD simulations in GROMACS.

    To simulate an antiparallel coiled-coil interaction, the K-coil's amino acid sequence was inverted such that it sat antiparallel to the E-coil. The file was then run through PyRosetta docking programs to undergo energy minimization, generating the most stable conformation. This file, with separated coil subunits, was then also run through GROMACS simulations (Figure 7b).
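As a small illustration of the inversion step, the sketch below reverses the K3 sequence. It assumes that "inverted" means reading the residues in the opposite order. The actual model building was done on structure files, so this shows only the sequence-level idea, and `reverse_sequence` is a hypothetical helper.

```python
K3 = "KIAALKEKIAALKEKIAALKE"  # K-coil sequence given above
E3 = "EIAALEKEIAALEKEIAALEK"  # E-coil sequence given above

def reverse_sequence(seq):
    """Return the residues in reverse order (C-terminus to N-terminus)."""
    return seq[::-1]

K3_REVERSED = reverse_sequence(K3)
print("K3 forward: ", K3)
print("K3 reversed:", K3_REVERSED)
```

Reversing the order keeps the composition and length of the coil unchanged while flipping its direction relative to the E-coil.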

    Figure 7. Pairs of coiled-coil interactions. E and K coils can interact in either (A) a parallel formation or (B) an antiparallel formation. Blue represents the N-terminus of each subunit and red the C-terminus.

    Stability Analysis

    The parallel and antiparallel coiled-coil interactions were compared using both the Lennard-Jones and Coulomb potential energies calculated by GROMACS. The Lennard-Jones potential is a mathematical model that approximates the intermolecular forces between two molecules [6], and the Coulomb potential describes the interaction between point charges. Statistical analysis was carried out on the energy output for each file and is summarized in Table 2 below.

    Table 2. Energy comparison of parallel and antiparallel coiled coils. Values are understood to be comparable energy units and a more negative energy is indicative of a more stable interaction.
    Orientation | Potential | Mean | Median
    Parallel Coils | Coulomb | -210.6366881 | -209.408646
    Parallel Coils | Lennard-Jones | -131.8912492 | -132.159454
    Antiparallel Coils | Coulomb | -200.6834072 | -205.490387
    Antiparallel Coils | Lennard-Jones | -147.8004057 | -149.036652
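Summary statistics like those in Table 2 can be obtained from an energy time series in a few lines. The sketch below assumes the energies were exported as an .xvg text file, in which lines starting with `#` or `@` are comments or plot directives and the first data column is time. The function names and the column index are illustrative assumptions.

```python
import statistics

def read_xvg_column(xvg_text, column=1):
    """Extract one numeric column from .xvg-style text.

    Lines starting with '#' or '@' are skipped. Column 0 is assumed to be time.
    """
    values = []
    for line in xvg_text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        values.append(float(line.split()[column]))
    return values

def summarize(values):
    """Mean and median of an energy time series."""
    return {"mean": statistics.mean(values), "median": statistics.median(values)}
```

Running `summarize` on the Coulomb and Lennard-Jones columns of each simulation would give one row of the table per orientation and potential.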

    By the Lennard-Jones potential, the antiparallel orientation appears more stable, while the Coulomb potential suggests the parallel orientation is slightly more stable. This discrepancy leaves the theoretically preferred orientation of these engineered coils uncertain. For this project, we presume that the NMR structure determined by Lindhout et al. represents the preferred orientation, and thus that the coils interact in a parallel orientation.

    PYROSETTA DOCKING & LINKER TESTING

    AFP & Scaffold Docking

    To test the self-assembly of the AFPs and scaffold proteins with E/K coils, docking simulations were run. These were used to assess the energetic stability of the coiled-coil interaction and to determine the orientation of the ice-binding surface of the AFP. This was done in PyRosetta, following the standard procedure of initial low-resolution docking followed by high-resolution docking on favourable protein structures. Low-energy dockings were sorted, with consideration given to the proximity of the E/K coils, to choose a final selection of structures for refinement and scoring. The general procedure is outlined in Figure 8 and the final docked structure is shown in Figure 9.
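The selection step described above (sort by energy, then prefer structures whose coils sit close together) can be sketched without any docking library. Everything below is hypothetical illustration: the `Decoy` record, the distance cutoff and the number of structures kept are assumptions, not values from the actual protocol.

```python
from dataclasses import dataclass

@dataclass
class Decoy:
    name: str
    score: float          # docking score, lower is more favourable
    coil_distance: float  # angstroms between E-coil and K-coil (hypothetical measure)

def select_for_refinement(decoys, max_coil_distance=10.0, keep=5):
    """Keep decoys whose coils are close together, then rank them by score.

    max_coil_distance and keep are illustrative defaults only.
    """
    close = [d for d in decoys if d.coil_distance <= max_coil_distance]
    return sorted(close, key=lambda d: d.score)[:keep]
```

Filtering on coil proximity before ranking prevents a low-scoring structure with separated coils from crowding out structures in which the intended E/K interaction is actually formed.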

    Figure 8. Flow chart of the general protocol used to determine the configuration of the docked AFP/scaffold complex. The flow chart starts with the PDB files and structures of the proteins of interest (T3-10 scaffold and Type III AFP) and involves isolating the most stable conformations for further analysis.
    Figure 9. Final alignment of the AFP and scaffold coiled-coil interaction after docking and refinement. Scaffold subunits are shown in yellow with a rainbow K-coil to demonstrate directionality. An AFP is shown in grey and the ice-binding residues are highlighted in orange.

    Circularization Program

    In order to run molecular dynamics simulations of the circularized AFP, each linker was fitted together with the scar sequence so that the terminal nitrogen and carbon atoms sat within bond-length proximity (approximately 1.3 angstroms). To save time and effort in spatially testing multiple linkers, a script was written that uses PyRosetta to circularize the protein for easy visual assessment of atom positioning. The script uses the small mover in PyRosetta so that each movement made while fitting the linker can be checked to ensure the dihedral angles remain within acceptable ranges. The script for this program can be located here.
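The two geometric checks mentioned above, bond-length proximity of the termini and dihedral angles, can be illustrated in plain Python. This is a sketch and not the team's PyRosetta script: the 1.33 angstrom target is a typical peptide C-N bond length, while the tolerance and the function names are assumptions.

```python
import math

PEPTIDE_BOND_LENGTH = 1.33  # angstroms, typical C-N peptide bond (text targets ~1.3)

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, in the range -180 to 180."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def termini_closable(n_atom, c_atom, target=PEPTIDE_BOND_LENGTH, tolerance=0.2):
    """True if the terminal N and C are within tolerance of a peptide bond length."""
    return abs(math.dist(n_atom, c_atom) - target) <= tolerance
```

A fitting loop could call `termini_closable` after each small move and use `dihedral` on the backbone atoms of the linker to reject conformations whose angles fall outside the accepted range.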


    REFERENCES

    1. The PyMOL Molecular Graphics System, Version 1.5.0.4 Schrödinger, LLC.

    2. Chao et al. (1994). "Structure-function relationship in the globular type III antifreeze protein: Identification of a cluster of surface residues required for binding to ice". Protein Science. 3(10):1760-1769.

    3. Sonnichsen et al. (1996). "Refined solution structure of type III antifreeze protein: hydrophobic groups may be involved in the energetics of the protein-ice interaction". Structure. 4(11):1325-1337.

    4. King et al. (2012). "Computational Design of Self-Assembling Protein Nanomaterials with Atomic Level Accuracy". Science. 336:1171-1174.

    5. Lindhout et al. (2004). "NMR solution structure of a highly stable de novo heterodimeric coiled-coil". Biopolymers. 75:367-375.

    6. Atkins and de Paula (2006). "Atkins' Physical Chemistry". 8th edn, W.H. Freeman. p. 637.