Difference between revisions of "Team:Toronto/Software"

Revision as of 23:04, 18 September 2015

will fix format of references when it's completely done

Brief Description:

Bacteria generally occur in communities, whether in dirt, in water, in the air, on skin, or in the gut. In microbial communities, the survival of each species depends on the others: the biochemical and behavioral activities of one species provide the necessary metabolites and living environment for another [1]. Many approaches have been developed for predicting flux distributions in the metabolic network of a single species using flux balance analysis (FBA), typically optimizing for biomass or product formation [2,3]. FBA has been used for a variety of applications, including drug target identification by evaluation of gene essentiality, knowledge-gap filling of metabolic models, and metabolic engineering of E. coli for lycopene synthesis [4-6]. However, algorithms that perform FBA at the community level have been rare and complex, often relying on non-linear programming that is difficult for solvers to handle. In community FBA (cFBA), the exchange of metabolites between species and the biomass, relative fitness, and competitive ability of each species all affect metabolic flux within the community as well as within each individual species [4-6]. MetaFlux, a web tool developed by the Toronto iGEM Team, carries out cFBA on a user-selected set of bacterial species using a linear-programming algorithm and displays the results in an interactive, easy-to-understand node-edge visualization.

Web Application: Framework of the MetaFlux Interface

MetaFlux was developed using D3.js, a JavaScript library for data-driven visualization, used here to create interactive networks of nodes and edges. The web tool aims to visualize and manipulate community-level metabolic networks in its “extracellular view”, to visualize species-specific cytoplasmic networks in its “cytoplasmic view”, to present flux distributions in each of these views, and to present the changes in flux distributions that result from any alterations made to the metabolic network. Asynchronous calls to the backend optimize the network using an iGEM Toronto Python script, which in turn uses COBRApy, a constraint-based modeling package, to build metabolic networks from metabolic models supplied as SBML (Systems Biology Markup Language) XML data.

Data Collection

The SBML (Systems Biology Markup Language) and XML (Extensible Markup Language) files were obtained from bioinformatics websites such as Model SEED^[10] and EMBL-EBI^[11-13], as well as directly from research articles^[14-19]. All of the collected SBML and XML files were used in the FBA (flux balance analysis) calculations in MetaFlux. When searching for SBML and XML files for our selected species, we looked for specific compartments such as extracellular ('compartment id="e"' or 'compartment id="e0"') or intracellular ('compartment id="i"').
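
The compartment check described above could be automated along the following lines. This is a minimal sketch rather than the script the team used: the function names and the toy SBML string are hypothetical, and only the compartment ids "e" and "e0" come from the text.

```python
import xml.etree.ElementTree as ET

# Extracellular compartment ids named in the text above.
EXTRACELLULAR_IDS = {"e", "e0"}


def compartment_ids(sbml_text):
    """Return the set of compartment ids declared in an SBML document."""
    root = ET.fromstring(sbml_text)
    ids = set()
    for element in root.iter():
        # Drop the "{namespace}" prefix so any SBML level or version matches.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "compartment" and "id" in element.attrib:
            ids.add(element.attrib["id"])
    return ids


def has_extracellular_compartment(sbml_text):
    """True if the model declares at least one extracellular compartment."""
    return bool(compartment_ids(sbml_text) & EXTRACELLULAR_IDS)


# Hypothetical toy model, used only to exercise the functions.
EXAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="c"/>
      <compartment id="e"/>
    </listOfCompartments>
  </model>
</sbml>"""

print(sorted(compartment_ids(EXAMPLE_SBML)))
print(has_extracellular_compartment(EXAMPLE_SBML))
```

Matching on the local tag name instead of a fixed namespace keeps the check usable across files from different sources, which may declare different SBML levels.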

Algorithm

First, we create a metabolic model in SBML format for each species present in the community. Each model is tailored to include the external metabolites and reactions contributed by other species in the community. Thus, the extracellular space of each metabolic model is unique to its species, even though all members belong to the same community. After creating the individual models, we use COBRApy to optimize each model's biomass objective function and store the solutions in text files. Using these solutions, we calculate the average and standard deviation of the flux through every reaction shared within the community and store them in a new text file. We then set the upper and lower bounds of each shared reaction in every species’ model to the average flux plus two standard deviations and the average flux minus two standard deviations, respectively. With these new constraints on shared reactions, we again perform flux balance analysis on each model in turn with COBRApy, optimizing for each model's biomass objective function. We store the flux value returned by the objective function for each species in a new text file and calculate standardized z-scores by comparing these values across species. A fractional biomass coefficient is then calculated for each species by dividing that species' z-score by the sum of the z-scores for all species; these values will also be stored in a new text file. (The fractional biomass coefficients should sum to one.) Lastly, a community metabolic model will be created in which species are treated as if they were the metabolic compartments of a community 'organism'.
Further, the constraints in the model and/or the variables in the objective function of the community model will be weighted by their respective fractional biomass coefficients, depending on which species the constraint or variable belongs to. Constraints and/or variables for reactions that are shared between species will be weighted by the sum of the fractional biomass coefficients of the species involved. In our final step, COBRApy will be used to optimize the community biomass objective function, defined as the weighted sum of the biomass objective functions of all species. The resulting vector of fluxes is expected to be representative of real-world experimental data.
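
The bound-tightening step (average flux plus or minus two standard deviations on shared reactions) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the team's script: the function name, the dictionary layout, and the example fluxes are hypothetical; a reaction is assumed to be "shared" when at least two species carry it; and the sample standard deviation is assumed, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def shared_reaction_bounds(solutions, n_sd=2.0):
    """Map each shared reaction id to (lower, upper) = mean -/+ n_sd * SD.

    `solutions` maps species name -> {reaction id: optimal flux}.
    Reactions found in only one species are skipped.
    """
    fluxes_by_reaction = {}
    for fluxes in solutions.values():
        for reaction_id, flux in fluxes.items():
            fluxes_by_reaction.setdefault(reaction_id, []).append(flux)

    bounds = {}
    for reaction_id, values in fluxes_by_reaction.items():
        if len(values) < 2:
            continue  # not shared, so its original bounds stay untouched
        centre = mean(values)
        spread = n_sd * stdev(values)  # assumption: sample standard deviation
        bounds[reaction_id] = (centre - spread, centre + spread)
    return bounds


# Hypothetical single-species FBA solutions (reaction id -> flux).
example_solutions = {
    "species_a": {"EX_glc": -10.0, "EX_ac": 2.0, "BIOMASS_a": 0.8},
    "species_b": {"EX_glc": -6.0, "EX_ac": 4.0},
    "species_c": {"EX_glc": -8.0},
}

example_bounds = shared_reaction_bounds(example_solutions)
for reaction_id, (lower, upper) in sorted(example_bounds.items()):
    print(reaction_id, round(lower, 3), round(upper, 3))
```

In a COBRApy workflow, the resulting pairs would then be written to each model's reaction bounds before the second round of optimization.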

Web Application: User Interaction

With MetaFlux, users can choose to view the extracellular metabolic network of one species, the extracellular metabolic network of multiple species, the cytoplasmic-periplasmic metabolic network of a single species, or a specific metabolic pathway within one species. The visualization will include small circular nodes that represent metabolites, hexagonal nodes that represent reactions, large circular nodes that represent species, and arrows that define a particular pathway between the nodes. Users can add pathways (metabolites and reactions) and remove pathways, drawing either on data provided by our web tool or on outside sources of experimental data. FBA calculations and optimizations will occur on the backend, and results will be displayed on the network as visually discernible changes in the thickness of the arrows. Users can also zoom into the network and reposition the network and its subparts via mouse-dragging. Finally, users will have the option of storing the experiment and the settings for each optimization for future use.
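
One way the backend could turn flux values into arrow thicknesses is shown below. This is a sketch only: the text says that thickness reflects flux but not how, so the linear scale, the width range, the function names, and the edge record fields are all assumptions made for illustration.

```python
def edge_width(flux, max_abs_flux, min_width=1.0, max_width=10.0):
    """Linearly map |flux| in [0, max_abs_flux] to a stroke width."""
    if max_abs_flux <= 0:
        return min_width  # no flux anywhere, so draw every arrow at minimum width
    fraction = min(abs(flux) / max_abs_flux, 1.0)
    return min_width + fraction * (max_width - min_width)


def edges_with_widths(fluxes):
    """Turn {(source, target): flux} into edge records for a node-edge view."""
    max_abs_flux = max((abs(value) for value in fluxes.values()), default=0.0)
    return [
        {
            "source": source,
            "target": target,
            "flux": value,
            "width": edge_width(value, max_abs_flux),
        }
        for (source, target), value in fluxes.items()
    ]


# Hypothetical fluxes between a metabolite node and two reaction nodes.
example_edges = edges_with_widths({
    ("glucose", "uptake_reaction"): -10.0,
    ("uptake_reaction", "acetate"): 5.0,
})
for edge in example_edges:
    print(edge["source"], "->", edge["target"], edge["width"])
```

Using the absolute value keeps the width independent of reaction direction; the sign of the flux remains in the record so the frontend can orient the arrow.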

References

Zelezniak, Aleksej et al. “Metabolic Dependencies Drive Species Co-Occurrence in Diverse Microbial Communities.” Proceedings of the National Academy of Sciences of the United States of America 112.20 (2015): 6449–6454. PMC. Web. 9 Sept. 2015. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4443341/
Radhakrishnan Mahadevan, Jeremy S. Edwards, Francis J. Doyle III, Dynamic Flux Balance Analysis of Diauxic Growth in Escherichia coli, Biophysical Journal, Volume 83, Issue 3, September 2002, Pages 1331-1340, ISSN 0006-3495, http://dx.doi.org/10.1016/S0006-3495(02)73903-9. (http://www.sciencedirect.com/science/article/pii/S0006349502739039)
Jong Min Lee, Erwin P. Gianchandani, and Jason A. Papin Flux balance analysis in the era of metabolomics Brief Bioinform 2006 7: 140-150.
Raman K, Rajagopalan P, Chandra N. Flux balance analysis of mycolic acid pathway: targets for anti-tubercular drugs. PLoS Comput Biol 2005;1:e46.
Oberhardt MA, Puchalka J, Fryer KE, et al. Genome-scale metabolic network analysis of the opportunistic pathogen Pseudomonas aeruginosa PAO1. J Bacteriol 2008;190:2790-803.
Alper H, Jin Y-S, Moxley JF, et al. Identifying gene targets for the metabolic engineering of lycopene biosynthesis in Escherichia coli. Metab Eng 2005;7:155-64.
Khandelwal RA, Olivier BG, Röling WFM, Teusink B, Bruggeman FJ (2013) Community Flux Balance Analysis for Microbial Consortia at Balanced Growth. PLoS ONE 8(5): e64567. doi:10.1371/journal.pone.0064567
http://journal.frontiersin.org/article/10.3389/fmicb.2014.00125/full
http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002363
Henry, C.S., et al. High-throughput Generation and Optimization of Genome-scale Metabolic Models. Nature Biotechnology 28, 977-982 (2010)
Li C., Courtot, M., Novere N.L., and Laibe C. BioModels.net Web Services, a free integrated toolkit for computational modeling software. Brief Bioinformatics, 11 (3), 270-277 (2010)
Buchel F., et al. Path2Models: large-scale generation of computational models for biochemical pathway maps. BMC Systems Biology 7 (116), doi:10.1186/1752-0509-7-116 (2013)
Novere, N.L., et al. BioModels Database: a free, centralized database of curated, published quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. 34 (Database issue), D689-D691 doi:10.1093/nar/gkj092 (2006)
Orth J.D., et al. A comprehensive genome-scale reconstruction of Escherichia coli metabolism – 2011. Mol Syst Biol. 7 (535), doi: 10.1038/msb.2011.65 (2011)
Sohn S. B., Kim T. Y., Park J. M. and Lee S. Y. In silico genome-scale metabolism analysis of Pseudomonas putida KT2440 for polyhydroxyalkanoate synthesis, degradation of aromatics and anaerobic survival. Biotechnology Journal. 5 (7), doi: 10.1002/biot.2010000124 (2010)
Imam S., et al. iRsp1095: A genome-scale reconstruction of the Rhodobacter sphaeroides metabolic network. BMC Syst Biol. 5 (116), doi: 10.1186/1752-0509-5-116 (2011)
Senger R. S., and Papoutsakis E. T. Genome-Scale Model for Clostridium acetobutylicum: Part I. Metabolic Network Resolution and Analysis. Biotechnol Bioeng. 101 (5), 1036-1052 (2008)
Nogales J., et al. Detailing the optimality of photosynthesis in cyanobacteria through systems biology analysis. Proc Natl Acad Sci USA. 109 (7), 2678-2683 (2012)
Gonnerman M. C., et al. Genomically and biochemically accurate metabolic reconstruction of Methanosarcina barkeri Fusaro, iMG746. Biotechnology Journal. 8 (9), 1070-1079 (2013)

@@ Line 42: / Line 42: @@
 <div class="content">
 <ul>
-<li>will fix format of  references when it&#39;s completely done</li>
+<li>will fix format of  references when it&#39;s completely done
+<img src="" alt="hi"></li>
 </ul>
 <h3 id="brief-description-">Brief Description:</h3>
-<p>Whether they be in dirt, in water, in the air, on skin or in the gut, bacteria
+<p>Bacteria generally occur in communities, whether they be in dirt, in water, in the air, on skin or in the gut. In microbial communities, the survival of all species are interdependent; the biochemical and behavioral activities of
-generally occur in communities. In microbial communities, the survival of all
+one species provides the necessary metabolites and living environment for
-species are interdependent due to the biochemical and behavioral activities of
-one species that provide the necessary metabolites and living environment for
 another [1]. Many approaches have been developed for predicting flux
 distributions in the metabolic network of one species using flux balance
@@ Line 55: / Line 54: @@
 target identification by evaluation of gene essentiality, knowledge-gap filling
 of metabolic models and metabolic engineering of E. coli for lycopene synthesis
-[4-6]. However, algorithms to perform FBA at a community level have been few and
+[4-6]. However, algorithms to perform FBA at a community level have been rare and
-complicated (often using non-linear programming and very difficult to solve);
+complex (often using non-linear programming and presenting high difficulty for solvers).
-since in community FBA (cFBA), the exchange of metabolites between species, the
+In community FBA (cFBA), the exchange of metabolites between species, the
-biomass, relative fitness and competitive ability of each species affect
+biomass, relative fitness, and competitive ability of each species affects
-metabolic flux within the community and within each individual species [4-6].
+metabolic flux within the community as well as within each individual species [4-6].
 MetaFlux, a web tool developed by the Toronto iGEM Team, carries out cFBA
-between user custom-chosen bacterial species with a linear-programming algorithm
+between bacterial species custom-chosen by the user via a <strong>linear-programming algorithm</strong>,
-and displays the results by an interactive and easily-understandable node-edge
+and displays the results through an interactive and easily-understandable node-edge
 visualization.</p>
 <h3 id="web-application-framework-of-the-metaflux-interface">Web Application: Framework of the MetaFlux Interface</h3>
@@ Line 68: / Line 67: @@
 interactive networks using nodes and edges. The object of the MetaFlux web tool
 is to visualize and manipulate community level metabolic networks in the tool’s
-“extracellular view” in addition to visualizing species-specific cytoplasmic
+<em>“extracellular view”</em> in addition to visualizing species-specific cytoplasmic
-networks in the “cytoplasmic view”, to see flux distributions in each of these
+networks in the <em>“cytoplasmic view”</em>, to present flux distributions in each of these
-views, and to see changes in flux distributions occurring from any alterations
+views, and to present changes in flux distributions occurring from any alterations
 made to the metabolic network. Asynchronous calls to the backend optimize the
-network using an iGEM Toronto Python script which in turn, uses COBRApy, a
+network using an iGEM Toronto Python script, which in turn, uses COBRApy, and a
 constraint-based modeling package was used to model metabolic networks from
 metabolic models in the form of SBML (Synthetic Biology Markup Language) XML
 data.</p>
+<h3 id="data-collection">Data Collection</h3>
+<p>The SBML (Systems Biology Markup Language) and XML (Extension Markup Language) were gathered and obtained from bioinformatics websites such as Model SEED<sup>[10]</sup> and EMBL-EBI<sup>[11-13]</sup> as well as directly from some research articles<sup>[14-19]</sup>. All SBML and XML files collected were used to contribute and help calculate the FBA (Flux Balance Analysis) in MetaFlux. What we looked for when searching for SBML and XML files of our selected species were certain specific compartments such as extracellular (&#39;compartment id=”e”&#39; or &#39;compartment id=”e0”&#39;) or intracellular (&#39;compartment id=”i”&#39;).</p>
 <h3 id="algorithm">Algorithm</h3>
-<p>Firstly, we create metabolic models, in SBML file format, for each individual
+<p>Firstly, we create metabolic models in SBML file format for each individual
-species that are present in the community. Each metabolic model is tailored to
+species that are present in the community. Each metabolic model is tailored for
-contain extra external metabolites and reactions that are contributed from other
+content, and include external metabolites and reactions contributed from other
 species in the community. Thus, the extracellular space of each metabolic model
-is unique to each individual species despite the fact that all members belong to
+is unique to each individual species, despite the fact that all members belong to
 the same community. After the creation of individual metabolic models, we use
-COBRApy to optimize for each model’s biomass objective function and subsequently
+COBRApy to optimize for each model&#39;s biomass objective function, and subsequently
 store the solutions in text files. Using the solutions, we calculate and store
 the averages and standard deviations for all shared reactions in the community
-in a new text file. With this, we change the upper and lower bounds, of each
+in a new text file. Following this, we change the upper bound and lower bound of each
-reaction of all the species’ models, to the average flux value plus two standard
+reaction of all the species’ models to the average flux value plus two standard
-deviations and the average flux value minus two standard deviations
+deviations and the average flux value minus two standard deviations,
-respectively. With new constraints on shared reactions, we perform flux balance
+respectively. With our new constraints on shared reactions, we perform flux balance
-analysis again iteratively for each model with COBRApy, again optimizing for
+analysis iteratively for each model with COBRApy once again, optimizing for
-each respective biomass objective function. We then store each flux value
+each respective biomass objective function. We then store each of the flux values
-returned by the objective function of each species in another new text file. We
+returned by the objective function for each species in a new text file. We
-take the flux values and calculate z-scores compared to each other. Fractional
+take the flux values and calculate standardized z-scores by comparing all values. Fractional
-biomass coefficients will be calculated for each species by taking their
+biomass coefficients are then calculated for each species by taking each species&#39;
-respective z-score and diving over the sum of z-scores for all species and will
+respective z-score and diving by the sum of the z-scores for all species; these values
-be stored in another text file. The sum of all fractional biomass coefficient
+will also be stored in a new text file. (The sum of all fractional biomass coefficients
-should equal to one. Lastly, a community metabolic model will be created where
+should be equal to one.) Lastly, a community metabolic model will be created where
-species are treated as just additional compartments. However, the constraints in
+species are treated as if they were the metabolic compartments of a community &#39;organism&#39;.
-the model and/or variables in the objective function for this community model
+Further, the constraints in the model and/or variables in the objective function for the
-will be weighed by their respective fractional biomass coefficients depending on
+community model will be weighed by their respective fractional biomass coefficients, depending on
 which species the constraint or variable belongs to. Constraints and/or
 variables for reactions that are shared between species will be weighed as the
-sum of the fractional biomass coefficients for the species involved. The final
+sum of the fractional biomass coefficients for the species involved. In our final step,
-step is to then use COBRApy to optimize for the community biomass objective
+COBRApy will be used to optimize for the community biomass objective
-function, which is defined as the weighed summation of biomass objective
+function, which is defined as the weighed summation of the biomass objective
-functions of all species. The resultant vector of fluxes is predicted to be
+functions of all species. The resulting vector of fluxes is expected to be
 representative of real-world experimental data.  </p>
 <h3 id="web-application-user-interaction">Web Application: User Interaction</h3>
-<p>With MetaFlux, the user has the ability to choose to display the extracellular
+<p>With MetaFlux, users have the ability to choose between viewing displays of the extracellular
 metabolic network of one species, the extracellular metabolic network of
 multiple species, the cytoplasmic-periplasmic metabolic network of a single
-species, or a certain metabolic pathway within one species. The visualization
+species, or a specific metabolic pathway within one species. The visualization
-will include small circular nodes to represent metabolites, hexagonal nodes to
+will include small circular nodes that represent metabolites, hexagonal nodes that
-display reactions, big circular nodes to represent species and arrows to define
+display reactions, big circular nodes that represent species, and arrows to define
-a particular pathway between the nodes. The user has the ability to add in
+a particular pathway between the nodes. Users have the ability to add in
-pathways (metabolites and reactions) and remove pathways either from the web
+pathways (metabolites and reactions) and remove pathways either from data provided by our web
-tool’s available data or based on their own experimentally-collected data. FBA
+tool or based on outside sources of experimental data. FBA
-calculations and optimizations will occur on the backend and display the results
+calculations and optimizations will occur on the backend and results will be displayed
-on the network as visually distinguishable changes in the thickness of the
+on the network as visually discernable changes in the thickness of the
-arrows. The user also has the capability of zooming into the network and
+arrows. Users can also zoom into the network and
-repositioning the networks and its subparts by mouse-dragging. Finally, the user
+reposition the network and its subparts via mouse-dragging. Finally, users
 will have the option of storing the experiment and the settings for each
 optimization for future use.</p>
@@ Line 143: / Line 144: @@
 <li><a href="http://journal.frontiersin.org/article/10.3389/fmicb.2014.00125/full">http://journal.frontiersin.org/article/10.3389/fmicb.2014.00125/full</a></li>
 <li><a href="http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002363">http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002363</a></li>
+<li>Henry, C.S., et al. High-throughput Generation and Optimization of Genome-scale Metabolic Models. Nature Biotechnology 28, 977-982 (2010)</li>
+<li>Li C., Courtot, M., Novere N.L., and Laibe C. BioModels.net Web Services, a free integrated toolkit for computational modeling software. Brief Bioinformatics, 11 (3), 270-277 (2010)</li>
+<li>Buchel F., et al. Path2Models: large-scale generation of computational models for biochemical pathway maps. BMC Systems Biolgoy 7 (116), doi:10.1186/1752-0509-7-116 (2013)</li>
+<li>Novere, N.L., et al. BioModels Database: a free, centralized database of curated, published quantitative kinetic models of biochemical and cellular systems. Nucleic Acids Res. 34 (Database issue), D689-D691 doi:10.1093/nar/gkj092 (2006)</li>
+<li>Orth J.D., et al. A comprehensive genome-scale reconstruction of Escherichia coli metabolism – 2011. Mol Syst Biol. 7 (535), doi: 10.1038/msb.2011.65 (2011)</li>
+<li>Sohn S. B., Kim T. Y., Park J. M. and Lee S. Y. In silico genome-scale metabolism analysis of Pseudomonas putida KT2440 for polyhydroxyalkanoate synthesis, degradation of aromatics and anaerobic survival. Biotechnology Journal. 5 (7), doi: 10.1002/biot.2010000124 (2010)</li>
+<li>Imam S., et al. iRsp1095: A genome-scale reconstruction of the Rhodobacter sphaeroides metabolic network. BMC Syst Biol. 5 (116), doi: 10.1186/1752-0509-5-116 (2011)</li>
+<li>Sengar R. S., and Papoutsakis E. T. Genome-Scale Model for Clostridium acetobutylicum: Part I. Metabolic Network Resolution and Analysis. Biotechnol Bioeng. 101 (5), 1036-1052 (2008)</li>
+<li>Nogales J., et al. Detailing the optimality of photosynthesis in cyanobacteria through systems biology analysis. Proc Natl Acad Sci USA. 109 (7), 2678-2683 (2012)</li>
+<li>Gonnerman M. C., et al. Genomically and biochemically accurate metabolic reconstruction of Methanosarcina barkeri Fusaro, iMG746. Biotechnology Journal. 8 (9), 1070-1079 (2013)</li>
 </ol>