Difference between revisions of "Team:UCSC/Software"


Revision as of 22:03, 18 September 2015

Software at UCSC

F.O.C.U.S


Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.

sequenceAnalysis


1.1 What is it for?

The sequenceAnalysis program is a tool for efficient, large-scale analysis of amino acid sequences. It works much like the ProtParam tool on ExPASy. Unlike ProtParam, however, it can read multiple proteins at once and compute the physical and chemical properties of each.

1.2 Program Specifications

This program was written in Python 3.4.3, the most recent version of the Python programming language at the time of writing. It is designed to work as a module, which can be imported into any program to access all or only the specific functions that the user requires. An added benefit of the modular design is that it can be easily edited to provide further functionality. At this point, the module is made up of two classes: ProteinParam and FastAreader.

Class ProteinParam

This class was written by UCSC iGEM team members Cristian Camacho, Jairo Navarro, and Raymond Bryan. It was developed in the course BME 160: Research Programming in the Life Sciences, and serves as the backbone for two other programs that are being submitted: CodonBiasGenerator and FOCUS.

2.1 Attributes

  • aa2mw : A dictionary of the molecular weights of all 20 amino acids
  • mwH20 : A float value corresponding to the molecular weight of water
  • aa2abs280 : Dictionary of the absorbance values of Tyrosine, Tryptophan and Cysteine at a wavelength of 280 nm.
  • aa2chargePos : Dictionary of the positive charge values of Lysine, Arginine and Histidine
  • aa2chargeNeg : Dictionary of the negative charge value of Aspartic Acid, Glutamic Acid, Cysteine and Tyrosine.
  • aaNterm : Float value used for the N-terminus in the charge calculation
  • aaCterm : Float value used for the C-terminus in the charge calculation
  • validAA : An empty dictionary which will contain the counts of valid amino acids in a specific protein sequence.
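
As a rough illustration of how lookup attributes of this kind can be used, the sketch below counts valid residues and sums their weights, subtracting one water per peptide bond. The dictionary values are commonly tabulated approximate masses and are assumptions here, as are the names AA2MW, MW_H2O, composition and molecular_weight; the module's own tables and method bodies may differ.

```python
# Illustrative stand-ins for the aa2mw and water-weight attributes.
# Values are assumed for this sketch, not copied from sequenceAnalysis.
AA2MW = {
    'A': 89.093,  'G': 75.067,  'M': 149.211, 'S': 105.093, 'C': 121.158,
    'H': 155.155, 'N': 132.118, 'T': 119.119, 'D': 133.103, 'I': 131.173,
    'P': 115.131, 'V': 117.146, 'E': 147.129, 'K': 146.188, 'Q': 146.145,
    'W': 204.225, 'F': 165.189, 'L': 131.173, 'R': 174.201, 'Y': 181.189,
}
MW_H2O = 18.015


def composition(sequence):
    """Count each valid amino acid; any other character is ignored."""
    counts = {aa: 0 for aa in AA2MW}
    for ch in sequence.upper():
        if ch in counts:
            counts[ch] += 1
    return counts


def molecular_weight(sequence):
    """Sum the residue weights and remove one water per peptide bond."""
    counts = composition(sequence)
    n_residues = sum(counts.values())
    if n_residues == 0:
        return 0.0
    total = sum(AA2MW[aa] * count for aa, count in counts.items())
    return total - MW_H2O * (n_residues - 1)
```

A chain of n valid residues has n - 1 peptide bonds, which is why n - 1 waters are subtracted.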

2.2 Methods

  • aaCount( ) : Iterates through the amino acid sequence and returns a single integer count of valid amino acid characters found.
  • aaComposition ( ) : Returns the validAA dictionary with the valid amino acids and their counts for a specific protein.
  • pI ( ) : Estimates the theoretical isoelectric point of a protein by iterating over pH values until it finds the one that results in a net charge closest to zero.
  • charge ( ) : Calculates the net charge at a particular pH, using the pKa of each charged amino acid and of the N-terminus and C-terminus.
  • molarExtinction ( ) : Estimates the molar extinction coefficient based on the number and extinction coefficients of tyrosines, tryptophans, and cysteines.
  • massExtinction ( ) : Calculates the mass extinction by dividing the molar extinction value by the molecular weight of the corresponding protein.
  • molecularWeight ( ) : Calculates a protein's molecular weight by summing the weights of the individual amino acids and excluding the waters that are released with peptide bond formation.
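
The charge and isoelectric point calculations described above can be sketched as follows. This is a minimal sketch, not the module's code: the pKa values, the 0.01 pH step, and the names net_charge and estimate_pI are all assumptions chosen for illustration.

```python
# Assumed pKa values for illustration; the module's own tables may differ.
PKA_POS = {'K': 10.5, 'R': 12.4, 'H': 6.0}
PKA_NEG = {'D': 3.86, 'E': 4.25, 'C': 8.33, 'Y': 10.0}
PKA_NTERM = 9.69
PKA_CTERM = 2.34


def net_charge(sequence, pH):
    """Net charge of the sequence at a given pH."""
    seq = sequence.upper()
    # Fraction of each basic group that is protonated (positively charged).
    positive = sum(seq.count(aa) * 10 ** pka / (10 ** pka + 10 ** pH)
                   for aa, pka in PKA_POS.items())
    positive += 10 ** PKA_NTERM / (10 ** PKA_NTERM + 10 ** pH)
    # Fraction of each acidic group that is deprotonated (negatively charged).
    negative = sum(seq.count(aa) * 10 ** pH / (10 ** pka + 10 ** pH)
                   for aa, pka in PKA_NEG.items())
    negative += 10 ** pH / (10 ** PKA_CTERM + 10 ** pH)
    return positive - negative


def estimate_pI(sequence, step=0.01):
    """Scan pH 0 to 14 and return the pH whose net charge is closest to zero."""
    best_pH, best_abs = 0.0, float('inf')
    for i in range(int(round(14 / step)) + 1):
        pH = i * step
        magnitude = abs(net_charge(sequence, pH))
        if magnitude < best_abs:
            best_pH, best_abs = pH, magnitude
    return round(best_pH, 2)
```

Because the net charge only decreases as pH rises, a binary search would find the same point in fewer steps; the linear scan is kept here because it mirrors the description of pI ( ).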

Class FastAreader

This class was developed by Professor David Bernick of UC Santa Cruz for the upper-division course BME 160: Research Programming in the Life Sciences. It is what allows the sequenceAnalysis module to read and calculate the characteristics of multiple protein sequences at the same time, as long as they are in the FASTA format.

3.1 Attributes

  • fname : The initial file name to be read by FastAreader.

3.2 Methods

  • doOpen ( ) : Checks whether a file name was given to FastAreader and, if not, reads from standard input (sys.stdin) instead. This function provides command line usability.
  • readFasta ( ) : Using the file name given at initialization, returns each FASTA record as two strings: header and sequence. If a file name is not provided, standard input (sys.stdin) is used. Whitespace is removed, but no other adjustment is made to the sequence contents. The initial '>' is removed from the header.
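
The behaviour described for doOpen ( ) and readFasta ( ) can be sketched as a single generator function. This is an independent illustration, not Professor Bernick's code; the name read_fasta and its details are assumptions.

```python
import sys


def read_fasta(fname=None):
    """Yield (header, sequence) pairs from a FASTA file.

    If no file name is given, records are read from sys.stdin instead.
    """
    handle = open(fname) if fname else sys.stdin
    try:
        header, chunks = None, []
        for line in handle:
            line = line.strip()
            if line.startswith('>'):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, ''.join(chunks)
                header, chunks = line[1:], []   # drop the leading '>'
            elif header is not None:
                # Remove all whitespace; leave the sequence otherwise untouched.
                chunks.append(''.join(line.split()))
        if header is not None:
            yield header, ''.join(chunks)
    finally:
        if handle is not sys.stdin:
            handle.close()
```

Yielding one record at a time keeps memory use low when a file holds many sequences.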








Using the sequenceAnalysis module is fairly simple. Follow these steps to make use of it in your programs:

  1. Download the file named sequenceAnalysis from either the UCSC iGEM 2015 wiki or the 2015 iGEM GitHub.
  2. Important: save the file in the same directory as the script that you are writing.
  3. Make sure to include the line “import sequenceAnalysis” before writing any code for your new program.
  4. In order to use a function from either the ProteinParam class or the FastAreader class, you must create an object of that class.
  5. You can then use that object to access any of the available functions of that class.
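
The steps above can be sketched as a short function. This is a hedged sketch rather than the team's program: it assumes that FastAreader is constructed with a file name, that readFasta ( ) yields header and sequence pairs, and that ProteinParam is constructed with the sequence string. The name find_pIs is hypothetical.

```python
def find_pIs(fname):
    """Return a list of (header, pI) pairs for every record in a FASTA file."""
    # Step 3: import the module, which must sit in the same directory.
    # The import is placed inside the function only so that this sketch loads
    # without the module present; the top of a real script is the usual place.
    import sequenceAnalysis

    # Step 4: create an object of the class whose functions are needed.
    reader = sequenceAnalysis.FastAreader(fname)

    results = []
    for header, sequence in reader.readFasta():
        # Assumed constructor: ProteinParam takes the protein sequence.
        protein = sequenceAnalysis.ProteinParam(sequence)
        # Step 5: use the object to call one of the class's functions.
        results.append((header, protein.pI()))
    return results
```

A script could then print each pair, for example with `for header, pI in find_pIs("sequenceAnalysisTest.txt"): print(header, pI)`.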

Below is an example program which illustrates the above steps and explains how to access one or more of the functions from ProteinParam.

4.1 Ex) pIFinder.py

Figure 1: This program is known as pIFinder, which specifically utilizes the pI ( ) method of the class ProteinParam and the FastAreader class to calculate and print the isoelectric point of given protein sequences with their respective headers.

Notice that in the red box there is the line “import sequenceAnalysis”, which signifies that the capabilities of sequenceAnalysis are now available to your program. Also, notice that the lines the two arrows point to are responsible for creating a FastAreader object and a ProteinParam object.

 

In order to test whether your program is working correctly, we have provided a test file that includes eleven FASTA-formatted protein sequences. This test file can be found on the 2015 UCSC iGEM wiki under the name “sequenceAnalysisTest.txt”. Make sure that this file is saved in the same directory as sequenceAnalysis and the program that you are writing. Whether you hardcode the file name into your code or supply it via the command line, also make sure that it matches the name exactly as written.
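
One way to handle both cases, a hardcoded name and a name given on the command line, is sketched below. The function name choose_input_file and the error message are assumptions made for illustration; only the test file name comes from the text above.

```python
import os
import sys

DEFAULT_TEST_FILE = "sequenceAnalysisTest.txt"


def choose_input_file(argv, default=DEFAULT_TEST_FILE):
    """Use the first command line argument if present, else the default name."""
    fname = argv[1] if len(argv) > 1 else default
    # Fail early with a clear message if the name does not match a real file.
    if not os.path.isfile(fname):
        raise FileNotFoundError(
            "Cannot find {!r}; check the spelling and the directory.".format(fname))
    return fname

# Typical call from a script: fname = choose_input_file(sys.argv)
```

Checking the name before any reading starts makes a misspelled or misplaced file easier to diagnose.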


Figure 2: The results for the test file.

Codon Bias Generator


Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.