Difference between revisions of "Team:Waterloo/Modeling/Cas9 Dynamics"
Revision as of 00:54, 19 September 2015
Modelling Genomic Effects of CRISPR/Cas9
CRISPR/Cas9 has been extensively studied for its applications in eukaryotic genome editing and gene expression control. Last year, the Waterloo iGEM team created an ODE model of dCas9 binding and control of gene expression. This year, however, the modelling team chose to investigate the effects of CRISPR/Cas9 at a genomic rather than molecular level. Specifically, we wanted to model the accumulation of mutations in a target genome and the eventual deactivation of target genes after cutting by CRISPR/Cas9 and repair by Non-Homologous End Joining (NHEJ).
Model Formation
When bound to a single guide RNA (sgRNA), the S. pyogenes Cas9 nuclease diffuses through the cell in three dimensions, searching for the sequence 'NGG' in the target genome. When it finds an 'NGG', known as a PAM site, Cas9 binds and undergoes a conformational change that allows it to unwind the DNA helix and compare the sequence of its sgRNA with the DNA. If the sgRNA matches well, Cas9 cleaves the DNA, producing a double-stranded break (DSB) 3-4 bp upstream of the PAM site.
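The PAM search described above can be illustrated with a minimal Python sketch. It is not taken from the team's code: the function name `find_pam_sites` and the `cut_offset` parameter are hypothetical, only the given strand is scanned, and the default offset of 3 bp is an assumption chosen from the 3-4 bp range stated above.

```python
import re

def find_pam_sites(sequence, cut_offset=3):
    """Return (pam_start, cut_index) pairs for every 'NGG' on the given strand.

    cut_offset is the assumed number of bases between the cut and the PAM;
    cut_index is the index of the first base after the break.
    """
    sequence = sequence.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. in 'AGGG') are all found.
    for match in re.finditer(r"(?=[ACGT]GG)", sequence):
        pam_start = match.start()
        cut_index = pam_start - cut_offset
        # Skip PAMs too close to the start of the sequence to have a cut site.
        if cut_index > 0:
            sites.append((pam_start, cut_index))
    return sites
```

A real search would also scan the reverse complement and then score each candidate against the sgRNA; this sketch only covers the first step of locating PAM sites.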
In the absence of a template, DSBs are repaired by Non-Homologous End Joining (NHEJ), which is an error-prone process that sometimes creates indels at the site of repair. This effect has recently been exploited to target double-stranded viruses such as HBV. Though there have been extensive efforts to characterize the factors that contribute to effective targeting and deactivation by CRISPR/Cas9 and NHEJ, they have not, to the best of our knowledge, been synthesized into a single model.
The aim of the model is thus to capture the cutting events initiated by Cas9 and predict the outcomes of these events. We model each genome as containing multiple domains of interest, such as promoters or ORFs, and track whether these domains have been deactivated by CRISPR/Cas9 activity. There may be more than one sgRNA target in each domain and many domains can be targeted at once.
If Cas9 successfully cuts at a target site, the double-stranded break may be resolved in three ways. The most common resolution is for NHEJ to successfully repair the DSB without creating any indels. However, NHEJ repair is error-prone and will often create indels at the target site. Finally, since multiple sgRNA targets are considered, it is possible that large deletions will occur between two targets that are simultaneously cut.
At each timestep, the model considers the state (cut or uncut) and sequence of all targets and computes the probability of the following events at each target: CRISPR/Cas9 cutting, NHEJ repair or large deletion. The remainder of the model formation section discusses how we determined the probability of each event.
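As a rough illustration of the per-timestep logic, the sketch below advances a single target and lists the target pairs between which a large deletion could occur. The function names and the probability parameters (`p_cut`, `p_repair`, `p_indel`) are hypothetical placeholders, not the values or structure used in the actual model.

```python
import random

def step_target(is_cut, p_cut, p_repair, p_indel, rng):
    """Advance one target by one time step.

    Returns (new_is_cut, event), where event is None, 'cut',
    'repair_clean' or 'repair_indel'.
    """
    if not is_cut:
        # An intact target can only be cut.
        if rng.random() < p_cut:
            return True, "cut"
        return False, None
    # A cut target may be repaired by NHEJ, with or without an indel.
    if rng.random() < p_repair:
        if rng.random() < p_indel:
            return False, "repair_indel"
        return False, "repair_clean"
    return True, None

def large_deletion_candidates(cut_states):
    """Return index pairs of targets that are cut at the same time."""
    cut_indices = [i for i, is_cut in enumerate(cut_states) if is_cut]
    return [(a, b) for n, a in enumerate(cut_indices) for b in cut_indices[n + 1:]]
```

For example, `step_target(False, 0.1, 0.5, 0.2, random.Random(0))` would draw a single random number to decide whether an intact target is cut during the step.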
Probability of Double-Stranded Cuts made by CRISPR/Cas9
Taking into account target effects
Cas9 diffuses in three dimensions until it finds PAM sites.
Error-Prone Repair by Non-Homologous End Joining
Indel Probabilities
Large Deletions
Other Model Parameters and Assumptions
Genomes do not interact: we expect there to be multiple viral genomes in our plant defense example, and it is possible that simultaneous cuts on different genomes could result in two genomes being joined together. We do not model this interaction; instead, we decided that multiple stochastic simulations could be averaged to get an overall picture.
Talk about where all the probabilities come from
Software Implementation
Genome Classes
The code uses three classes to model the genome. Genomes have domains, which in turn have targets. Targets handle probabilities, domains track functionality and the genome modifies everything.
    class Target:
        """Is associated with a domain."""

    class Domain:
        """Has targets; is associated with a genome."""

    class Genome:
        """Has domains."""
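One way these relationships could be fleshed out is sketched below. Only the three class names come from the outline above; the attributes and methods (`cut_probability`, `add_target`, `add_domain`, `deactivate`, `functional_domains`) are assumptions made for illustration and may not match the team's code.

```python
class Target:
    """An sgRNA target site; holds its own event probability."""

    def __init__(self, name, cut_probability):
        self.name = name
        self.cut_probability = cut_probability  # hypothetical per-step value
        self.is_cut = False
        self.domain = None  # set when the target is added to a Domain


class Domain:
    """A region of interest (e.g. a promoter or ORF) that tracks functionality."""

    def __init__(self, name):
        self.name = name
        self.targets = []
        self.functional = True
        self.genome = None  # set when the domain is added to a Genome

    def add_target(self, target):
        target.domain = self
        self.targets.append(target)


class Genome:
    """Owns the domains and applies changes to them."""

    def __init__(self):
        self.domains = []

    def add_domain(self, domain):
        domain.genome = self
        self.domains.append(domain)

    def deactivate(self, target):
        # A disruptive event at a target knocks out its whole domain.
        target.domain.functional = False

    def functional_domains(self):
        return [d.name for d in self.domains if d.functional]
```

Keeping the back-references (`target.domain`, `domain.genome`) lets an event detected at a target be passed up to the genome, which is the only object that modifies state.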
Genome Simulation
The simulation calls these classes to check if events have occurred and the details of each event. At the end it compiles the data logs into CSVs, plots and visualizations.
    for dt in time_steps:
        call genome_classes to check if there was a cut, repair or large deletion
        if event:
            add to log
    generate CSVs, plots and visualizations
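The loop above can be sketched in runnable form as follows. This is a simplified stand-in: a single hypothetical probability `p_event` replaces the calls into the genome classes, every logged event is labelled `"cut"`, and the CSV is written to an in-memory string rather than to a file.

```python
import csv
import io
import random

def simulate(n_steps, p_event, seed=0):
    """Run one simulation and return a log of the events that occurred."""
    rng = random.Random(seed)
    log = []
    for step in range(n_steps):
        # Stand-in for asking the genome classes whether an event happened.
        if rng.random() < p_event:
            log.append({"step": step, "event": "cut"})
    return log

def log_to_csv(log):
    """Serialize an event log as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["step", "event"])
    writer.writeheader()
    writer.writerows(log)
    return buffer.getvalue()
```

Seeding the random number generator makes a single run reproducible, which helps when comparing plots generated from the same log.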
To see all the code for the simulation, check out our GitHub page.
Results
Model Validation
Include notes on how the model matches reality/our expectations of reality in this section.
Simulate w/ targets that mismatch to different extents.
Effect of sgRNA Strength
Matt visualizations for different sgRNAs.
Graph of 3 different sgRNA designs of different strengths, show % functional
Importance of Large Deletions
Include notes on how the model matches reality/our expectations of reality in this section.
Effect of Cas9 Concentration
Include notes on how the model matches reality/our expectations of reality in this section.
Predicting CRISPR Plant Defense
This model was applied to the CRISPR Plant Defense aspect of our project, investigating whether the P6 protein of Cauliflower Mosaic Virus (CaMV) could be deactivated by frameshift mutations. The P6 protein was chosen as a focus of the investigation because it suppresses natural plant RNAi defenses and trans-activates translation of other CaMV proteins. Details on P6 and the CaMV genome can be found on the CaMV Biology page.
The model was run with HOW MANY targets in the P6 gene of the simulated CaMV genome described above. We tracked the percent of simulated genomes with functional P6 across 1000 runs of the model, giving a general prediction of how long it will take before the P6 of a particular CaMV genome is rendered non-functional by our Plant Defense system.
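The "percent functional over many runs" summary can be sketched as below. This is a deliberately reduced illustration: the single per-step probability `p_deactivate` is a hypothetical stand-in for the full chain of cutting, repair and frameshift events, and the function name is not from the team's code.

```python
import random

def percent_functional(n_runs, n_steps, p_deactivate, seed=0):
    """Return, for each time step, the percent of runs with a functional gene."""
    rng = random.Random(seed)
    functional_counts = [0] * n_steps
    for _ in range(n_runs):
        functional = True
        for step in range(n_steps):
            # Once the gene is knocked out it stays non-functional.
            if functional and rng.random() < p_deactivate:
                functional = False
            if functional:
                functional_counts[step] += 1
    return [100.0 * count / n_runs for count in functional_counts]
```

Calling `percent_functional(1000, 50, 0.05)` would give a 50-point curve that could be plotted against time; the values 50 and 0.05 are arbitrary examples.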
PLOT % functional for P6/time over many simulations.