Team:WLC-Milwaukee/Modeling
Modeling
To probe further into the feasibility of our project, we decided we needed to address the fact that bacteria evolve in response to bacteriophages. The basis of our project is that Gram-negative bacteria without a working tolC gene (whether the gene is knocked out or incomplete, or its product misfolds) will be unable to form a complete antibiotic-efflux protein complex. As a result, bacteria without a functioning tolC gene will show increased sensitivity to antibiotics normally resisted through efflux; we demonstrated this experimentally using a Kirby-Bauer assay with the antibiotics novobiocin and erythromycin.
Since evolution is driven by random mutations being selected for, we thought this was an achievable simulation. To simulate it, we needed a way to insert random non-silent mutations into the tolC gene, a way to predict whether each mutation damages the function of the TolC protein, and a way to predict whether it affects TolC-mediated phage binding. For the sake of simplicity we considered only point mutations in the coding region of the DNA. We divided the mutations into 4 main types:
- A nonsense mutation, which turns an amino-acid codon into a stop codon; this prevents a full TolC protein from being translated, preventing both antibiotic efflux and phage binding.
- A missense mutation which prevents the TolC monomers from forming a trimer. In this category we included: strongly polar or charged amino acids introduced into the beta-barrel regions; valine, threonine, isoleucine, serine, aspartate, asparagine, or proline introduced into the alpha helices, which are critical for TolC trimerization (we treated these amino acids as not accommodated in an alpha helix); and 1/4 of all other mutations in the alpha helices (we assume the amino acids on one face of each helix are critical for interaction with other monomers; there are ~3.6 amino acids per turn).
- A base-pair substitution in the extracellular loops of the TolC protein. We assumed these mutations do not affect assembly or function, and therefore do not affect antibiotic resistance, but that any of them would result in phage resistance.
- Any mutation not fitting into the above three categories we assumed would not affect assembly/function or the ability of bacteriophages to recognize the protein. Therefore these show no effect on phage resistance or antibiotic sensitivity.
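The four categories above can be sketched as a small decision function. This is an illustrative sketch rather than our original code: the names `Region`, `Effect`, and `classify` are hypothetical, amino acids are given as one-letter codes with `*` for a stop, and the set of "strongly polar or charged" residues (D, E, K, R, H, N, Q) is an assumption, since the exact set is not listed above. The 1/4 rule for alpha helices is applied through a uniform draw `u` in [0, 1) supplied by the caller.

```cpp
#include <cassert>
#include <string>

enum class Region { BetaBarrel, AlphaHelix, ExtracellularLoop, Other };
enum class Effect { Truncated, TrimerDisrupted, PhageResistant, Neutral };

// Assumption: which residues count as "strongly polar or charged".
bool isChargedOrStronglyPolar(char aa) {
    return std::string("DEKRHNQ").find(aa) != std::string::npos;
}

// V, T, I, S, D, N, P: the residues treated above as not accommodated in a helix.
bool isHelixDisrupting(char aa) {
    return std::string("VTISDNP").find(aa) != std::string::npos;
}

// Classify one non-silent point mutation.
//   region: secondary structure of the mutated position
//   newAa:  resulting amino acid (one-letter code, '*' for a stop codon)
//   u:      uniform random draw in [0, 1), used only for the 1/4 helix rule
Effect classify(Region region, char newAa, double u) {
    assert(u >= 0.0 && u < 1.0);
    if (newAa == '*') return Effect::Truncated;  // category 1: nonsense
    if (region == Region::BetaBarrel && isChargedOrStronglyPolar(newAa))
        return Effect::TrimerDisrupted;          // category 2: barrel
    if (region == Region::AlphaHelix &&
        (isHelixDisrupting(newAa) || u < 0.25))
        return Effect::TrimerDisrupted;          // category 2: helix
    if (region == Region::ExtracellularLoop)
        return Effect::PhageResistant;           // category 3: loops
    return Effect::Neutral;                      // category 4: everything else
}
```

Passing the random draw in as an argument keeps the function deterministic, so each rule can be checked in isolation; for example, `classify(Region::AlphaHelix, 'P', 0.9)` falls under the trimer-disrupting category regardless of the draw.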
Example of mutation program input (E. coli tolC sequence)
With these parameters, we constructed a program to simulate random point mutations. C++ was chosen for its universality as well as our familiarity with it. At the base of the program is a codon class, which holds 3 characters filled with 3 nucleotides from the DNA sequence. Whenever it is filled or altered, the codon analyzes itself to determine which amino acid it represents, its polarity, and its charge. Codons can mutate themselves (pick one of their three constituent bases and change it to one of the other three nucleotides) and can return their contents, polarity, charge, and amino acid. The codons are organized in a singly linked list controlled by a polypeptide class. The polypeptide class also keeps higher-level information about each codon in a parallel singly linked list; this records which secondary structure each amino acid belongs to. When told to mutate, the polypeptide class uses a random number generator (rand() from stdlib.h, seeded with the current time) to pick which amino acid will be mutated; the selected codon is then told to mutate.
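A minimal sketch of such a codon class is shown below. It is not our original source: the class layout and member names are hypothetical, it uses the `<random>` header instead of rand() so that the generator can be seeded reproducibly, and the polarity grouping and the treatment of histidine as uncharged are assumptions. The 64-letter string is the standard genetic code with bases ordered T, C, A, G.

```cpp
#include <array>
#include <cassert>
#include <random>
#include <string>

// Base order used to index the genetic code table below.
const std::string kBases = "TCAG";
// Standard genetic code, one letter per codon in TCAG order; '*' is a stop.
const std::string kAminoAcids =
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

class Codon {
public:
    // Expects three uppercase DNA bases, e.g. "ATG".
    explicit Codon(const std::string& bases) {
        assert(bases.size() == 3);
        for (std::size_t i = 0; i < 3; ++i) {
            assert(kBases.find(bases[i]) != std::string::npos);
            bases_[i] = bases[i];
        }
    }

    std::string bases() const {
        return std::string(bases_.begin(), bases_.end());
    }

    // One-letter amino acid code, or '*' for a stop codon.
    char aminoAcid() const {
        std::size_t index = 0;
        for (char b : bases_) index = 4 * index + kBases.find(b);
        return kAminoAcids[index];
    }

    // +1 for K and R, -1 for D and E, 0 otherwise (assumption: H is neutral).
    int charge() const {
        const char aa = aminoAcid();
        if (aa == 'K' || aa == 'R') return 1;
        if (aa == 'D' || aa == 'E') return -1;
        return 0;
    }

    // Assumption: this polar set is one common grouping, not a fixed standard.
    bool isPolar() const {
        return std::string("STNQYCHDEKR").find(aminoAcid()) != std::string::npos;
    }

    // Pick one of the three positions and replace its base with one of the
    // other three nucleotides, so exactly one base always changes.
    template <class Rng>
    void mutate(Rng& rng) {
        std::uniform_int_distribution<int> position(0, 2);
        std::uniform_int_distribution<int> shift(1, 3);
        char& base = bases_[position(rng)];
        base = kBases[(kBases.find(base) + shift(rng)) % 4];
    }

private:
    std::array<char, 3> bases_;
};
```

With a generator such as `std::mt19937 rng(seed)`, calling `codon.mutate(rng)` changes exactly one base. A single base change can still be silent, so a caller that wants only non-silent mutations could compare `aminoAcid()` before and after and redraw from the original codon when it is unchanged.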
Input/output and the analysis of the mutations (for us, predicting loss of function and effects on phage binding) are handled in a main class, so the underlying classes can easily be adapted for other projects. The DNA sequence is read by parsing a plain-text sequence from a file. The amino acid properties are read from a second plain-text file and parsed as pairs of an integer and a string, in which the string is the property and the integer designates how many consecutive amino acids it applies to. The user enters the desired number of mutations through the CLI. Information about each mutation is written to a CSV (comma-separated values) file, and at the end the program prints how often amino acids changed their charge (or lack of it), changed their polarity, or turned into premature stops, as well as the percentage of mutations predicted to destroy TolC function and the percentage that would interfere with phage binding. The assignment of tolC codons to specific secondary structures was based on previous research.
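The property file described above is effectively a run-length encoding of the secondary structure, which could be expanded to one label per amino acid roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the integer is assumed to come before the string in each entry, and labels are assumed to contain no whitespace.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads "count label" pairs and returns one label per amino acid, so the
// result can be walked in parallel with the list of codons.
std::vector<std::string> expandStructureRuns(std::istream& in) {
    std::vector<std::string> perResidue;
    int count = 0;
    std::string label;
    while (in >> count >> label) {
        assert(count >= 0);
        perResidue.insert(perResidue.end(),
                          static_cast<std::size_t>(count), label);
    }
    return perResidue;
}

// Convenience wrapper for text that is already in memory.
std::vector<std::string> expandStructureRunsFromText(const std::string& text) {
    std::istringstream in(text);
    return expandStructureRuns(in);
}
```

For instance, the made-up input `2 helix` followed by `3 loop` expands to five labels: two `helix` entries and then three `loop` entries. The same function accepts a `std::ifstream` opened on the property file.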
Example of the file used to associate secondary structures with codons