Difference between revisions of "Team:BostonU/Modeling"


Revision as of 17:42, 13 September 2015

Modeling


One of the main parts of our project was developing a model to help us predict the best places to split a protein. Billy Law had previously developed a model in Matlab, and we built on it in our project. The overall goal of our model was to find split sites that yield two inert halves which regain robust activity when they are brought back together through induction.

Proteins are composed of long chains of amino acids. In theory, a protein that is n amino acids long has n-1 possible split sites, since it can be split between any two adjacent amino acids. Testing every one of these sites would be infeasible and too time-consuming, so we used two major criteria to narrow down the candidates.

Our first criterion was to split in the hydrophilic regions of the protein. Proteins generally have a hydrophobic core and a hydrophilic surface, and we hypothesized that splitting a protein through its core could interfere with its folding, activity, and function. We therefore avoided hydrophobic regions and targeted hydrophilic ones.

We used the Janin hydrophobicity scale, which assigns each amino acid a number based on how hydrophobic it is (the higher the number, the more hydrophobic the amino acid). In our model, we took a running average of the hydrophobicity over windows of 11 consecutive amino acids to create a hydrophobicity curve for the entire protein.
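The running average described above can be sketched in a few lines. This is a minimal Python illustration, not the team's Matlab code. The function name, the centered window, and the choice to report only full windows are assumptions, and the scale values are transcribed from the commonly tabulated Janin (1979) scale and should be checked against a primary source before use.

```python
# Janin hydrophobicity values as commonly tabulated (assumption: verify
# against a primary source). Higher values mean more hydrophobic.
JANIN = {
    "A": 0.3, "R": -1.4, "N": -0.5, "D": -0.6, "C": 0.9,
    "Q": -0.7, "E": -0.7, "G": 0.3, "H": -0.1, "I": 0.7,
    "L": 0.5, "K": -1.8, "M": 0.4, "F": 0.5, "P": -0.3,
    "S": -0.1, "T": -0.2, "W": 0.3, "Y": -0.4, "V": 0.6,
}


def hydrophobicity_profile(sequence, window=11):
    """Return (center_position, mean_score) pairs for each full window.

    Positions are 1-based. Residues closer than window // 2 to either end
    get no value, because only complete windows are averaged (an assumed
    edge-handling choice).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    scores = [JANIN[aa] for aa in sequence.upper()]
    half = window // 2
    profile = []
    for center in range(half, len(scores) - half):
        chunk = scores[center - half:center + half + 1]
        profile.append((center + 1, sum(chunk) / window))
    return profile
```

Low points on the resulting curve mark hydrophilic stretches, which are the regions the first criterion favors.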

Our second criterion was to avoid the secondary structures in the protein: the alpha helices and beta sheets. We hypothesized that splitting through these structures could also disrupt folding, activity, and function. We used an online tool, JPred (http://www.compbio.dundee.ac.uk/jpred/), to predict where alpha helices and beta sheets would occur in the protein. The tool takes the primary structure (amino acid sequence) of the protein as input and returns a secondary structure prediction.
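As a rough illustration of how a secondary structure prediction can be turned into a list of allowed split sites, the Python sketch below assumes the prediction has already been obtained and is supplied as a string with one character per residue: H for helix, E for strand, and - for everything else. The function name and the rule that both residues flanking a split must be unstructured are assumptions made for this example.

```python
def unstructured_split_sites(ss_prediction):
    """Return split sites that fall outside predicted helices and strands.

    Site i (1-based) is the split between residue i and residue i + 1.
    A site is kept only if both flanking residues are predicted as '-'.
    """
    sites = []
    for i in range(len(ss_prediction) - 1):
        if ss_prediction[i] == "-" and ss_prediction[i + 1] == "-":
            sites.append(i + 1)
    return sites
```

For a 10-residue prediction there are 9 possible sites, and only those with unstructured residues on both sides are returned.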

Additionally, we wanted to avoid splitting at catalytic residues in our protein. Previous literature attributes specific functions of the protein to these amino acids, so we avoided them to prevent any interference with activity.
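One way to combine the three criteria is to filter the n-1 possible sites and rank what remains by hydrophobicity. The Python sketch below is a hypothetical illustration, not the team's Matlab model. It assumes the windowed hydrophobicity averages are supplied as a dictionary keyed by split site, the secondary structure prediction as a string of H, E, and - characters, and the catalytic residues as a set of 1-based positions. The cutoff of 0.0 is an arbitrary placeholder.

```python
def rank_split_sites(window_means, ss_prediction, catalytic_residues,
                     max_hydrophobicity=0.0):
    """Return allowed split sites, most hydrophilic first.

    Site i (1-based) is the split between residue i and residue i + 1.
    window_means maps a site to its running-average hydrophobicity;
    sites without an entry are skipped.
    """
    candidates = []
    for site in range(1, len(ss_prediction)):
        left, right = site, site + 1  # 1-based flanking residues
        mean = window_means.get(site)
        # Criterion 1: the surrounding window must be hydrophilic enough.
        if mean is None or mean > max_hydrophobicity:
            continue
        # Criterion 2: neither flanking residue is in a helix or strand.
        if ss_prediction[left - 1] != "-" or ss_prediction[right - 1] != "-":
            continue
        # Additional rule: do not split next to a catalytic residue.
        if left in catalytic_residues or right in catalytic_residues:
            continue
        candidates.append((mean, site))
    return [site for mean, site in sorted(candidates)]
```

Ranking by the lowest average is only one possible choice; the sites could equally be inspected by eye on the hydrophobicity curve.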

Using these criteria, we built and adjusted a model in Matlab that helps us predict the best places to split our proteins. Below are the graphs produced by our model, with the split sites that we chose shown in black. We realized that these criteria do not account for a protein’s 3-D structure. As a result, our model does not consider the role of the loops and turns between alpha helices and beta sheets. Loops and turns can contribute heavily to protein function, for example by forming binding sites, and they can be identified in a protein’s 3-D structure.

However, not all proteins have known 3-D structures. We therefore conclude that our model is useful when the 3-D structure is unknown, but that examining the 3-D structure, when one is available, is the better way to identify the most viable split sites.

Matlab Code Split Site Identification