Team:DTU-Denmark/Procject/Background
Introduction
Unlike proteins, nonribosomal peptides are not synthesized through translation, but rather by sequential condensation of amino acids by large multimodular enzymes called nonribosomal peptide synthetases (NRPSs) [1,2]. In fact, it is this complex pattern of amino acid condensation that contributes to the activity of these compounds. NRPSs do not rely on an external template for synthesis of their products. They can be divided into modules, each responsible for incorporating one additional amino acid onto the growing peptide chain, much like an assembly line (Figure MISSING#) [3]. At the end of the assembly line, the peptide is released either as a linear peptide or through cyclization.
Figure #MISSING NRPSs are highly modular. Tyrocidine Synthetase (tycA-C) consists of ten modules and includes two epimerization domains and a thioesterase (TE) domain. Tyrocidine precursors are assembled one by one in an assembly-line manner. The TE domain cyclises tyrocidine upon releasing it from the active site.
Each module, with the exception of the initiation module, consists of at least three domains. These domains are responsible for activating the monomer (adenylation (A) domain), holding the activated monomer (peptidyl carrier protein (PCP) domain), and condensing amino acids (condensation (C) domain). The C domain is lacking in initiation modules, and the terminal module contains an additional required domain, the thioesterase (TE) domain, which is responsible for termination and release of the product [2].
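The domain layout described above can be written down as a small data structure. The following is only an illustrative sketch: the class name `Module`, the function `build_assembly_line`, and the three-residue example are hypothetical, and optional domains such as epimerization domains are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Module:
    substrate: str      # amino acid incorporated by this module
    domains: List[str]  # domain order within the module


def build_assembly_line(substrates: List[str]) -> List[Module]:
    """Assign the minimal domain layout described in the text to each module."""
    modules = []
    last = len(substrates) - 1
    for i, amino_acid in enumerate(substrates):
        if i == 0:
            domains = ["A", "PCP"]       # initiation module lacks a C domain
        else:
            domains = ["C", "A", "PCP"]  # elongation module
        if i == last:
            domains = domains + ["TE"]   # terminal module carries the TE domain
        modules.append(Module(amino_acid, domains))
    return modules


# Hypothetical three-module example (not a real NRPS)
line = build_assembly_line(["Phe", "Pro", "Leu"])
for module in line:
    print(module.substrate, "-".join(module.domains))
```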
Adenylation domains
The adenylation domains are responsible for activating the amino acid monomers and attaching them to the PCP domain. They act as gatekeepers, ensuring that only the desired monomers are incorporated. They may be highly specific, binding only a single amino acid, or, as in the case of tyrocidine (see below), allow multiple similar amino acids to be incorporated into the peptide [3]. In the late 1990s, the so-called specificity-conferring code of A domains was revealed [4]. This code consists of the 10 amino acids responsible for substrate binding, which were identified by sequence alignment to PheA, the first A domain with a solved crystal structure [5] (READ MORE LINK TO FURTHER DOWN MISSING).
Peptidyl carrier protein domains
The monomers are covalently bound to the PCP domains through thioester bonds. Before the PCP domains can accept the monomers, they must be activated by posttranslational modification. A 4'-phosphopantetheinyl transferase (PPTase) transfers a 4'-phosphopantetheine (4'-PP) moiety, carrying the sulfhydryl group required for thioester bond formation, from coenzyme A to a conserved serine residue in the PCP domain [2].
Condensation domains
Elongation of the peptidyl chain is performed by the C domains, which catalyze the condensation of the peptidyl chain bound to the upstream PCP domain and the amino acid bound to the downstream PCP domain. These domains have a strong stereoselectivity and may have some specificity towards the side chain of the amino acid incorporated by the A domain of the same module, whereas little specificity has been observed towards the peptidyl chain [6].
Thioesterase domains
The TE-domain catalyses the final step of NRP synthesis. It is the TE-domain that determines whether the peptide is released as a linear product or cyclised. Despite the important role of TE-domains, they are the least understood domains in the pathway. While it is possible to predict A-domain specificity, it is not possible to predict TE affinity [7]. The tyrocidine TE-domain (Figure #MISSING) Some studies have shown that some TE domains are highly flexible; fo
MIS
Engineering of NRPS
With a pool of more than 500 different amino acid monomers, a peptide of only nine monomers has an almost infinite number of possible compositions. Yet, to our knowledge, improvement of existing NRP drugs through genomic engineering of NRPSs in vivo still remains unaccomplished. This is perhaps surprising, as it has long been possible to predict NRPS clusters/operons by genome mining and, in addition, to predict NRP products by analysis of adenylation domains [7-9]. A classical engineering approach is therefore simply to combine modules with known specificity into a synthetic assembly line. Two years ago, the iGEM team from Heidelberg tried to improve synthesis of synthetic peptides by combining different modules (MISSING LINK TO THEIR WIKI). In addition, they implemented an NRPS module (indC) which produces a blue compound, and they combined this module with other NRPS modules to produce tagged peptides. Such a tagging system would greatly improve the synthesis of synthetic NRPS-derived peptides, but to our knowledge, the group at Heidelberg has not been able to repeat the experiment since then. The reason why synthetic module-based design of NRPSs is not yet possible may be explained by recent insight into NRPS structure provided by structural analysis [5,10]. Crystallization of NRPSs is complicated by their large size and mobile structure, but domain-specific crystallisation and other structural analyses have highlighted that specific interactions between individual modules are crucial for catalytic activity and for the subsequent transfer of the growing peptide to the next module [10].
Domain | 235 | 236 | 239 | 278 | 299 | 301 | 322 | 330 | 331 | 517 | Similarity |
---|---|---|---|---|---|---|---|---|---|---|---|
Aad | E | P | R | N | I | V | E | F | V | K | 94% |
Ala | D | L | L | F | G | I | A | V | L | K | 55% |
Asn | D | L | T | K | L | G | E | V | G | K | 90% |
Asp | D | L | T | K | V | G | H | I | G | K | 100% |
Cys(1) | D | H | E | S | D | V | G | I | T | K | 96% |
Cys(2) | D | L | Y | N | L | S | L | I | W | K | 88% |
Dab | D | L | E | H | N | T | T | V | S | K | 100% |
Dhb/Sal | P | L | P | A | Q | G | V | V | N | K | 83% |
Gln | D | A | Q | D | L | G | V | V | D | K | 100% |
Glu(1) | D | A | W | H | F | G | G | V | D | K | 95% |
Glu(2) | D | A | K | D | L | G | V | V | D | K | 95% |
Ile(1) | D | G | F | F | L | G | V | V | Y | K | 92% |
Ile(2) | D | A | F | F | Y | G | I | T | F | K | 100% |
Leu(1) | D | A | W | F | L | G | N | V | V | K | 99% |
Leu(2) | D | A | W | L | Y | G | A | V | M | K | 100% |
Leu(3) | D | G | A | Y | T | G | E | V | V | K | 100% |
Leu(4) | D | A | F | M | L | G | M | V | F | K | 97% |
Orn(1) | D | M | E | N | L | G | L | I | N | K | 100% |
Orn(2) | D | V | G | E | I | G | S | I | D | K | 98% |
Phe | D | A | W | T | I | A | A | V | C | K | 88% |
Phg/hPhg | D | I | F | L | L | G | L | L | C | K | 80% |
Pip/Pip-@ | D | F | Q | L | L | G | V | A | V | K | 75% |
Pro | D | V | Q | L | I | A | H | V | V | K | 87% |
Ser | D | V | W | H | L | S | L | I | D | K | 90% |
Thr/Dht | D | F | W | N | I | G | M | V | H | K | 91% |
Tyr(1) | D | G | T | I | T | A | E | V | A | K | 100% |
Tyr(2) | D | A | L | V | T | G | A | V | V | K | 80% |
Tyr(3) | D | A | S | T | V | A | A | V | C | K | 78% |
Val(1) | D | A | F | W | I | G | G | T | F | K | 96% |
Val(2) | D | F | E | S | T | A | A | V | Y | K | 94% |
Val(3) | D | A | W | M | F | A | A | V | L | K | 95% |
Variability | 3% | 16% | 16% | 39% | 52% | 13% | 26% | 23% | 26% | 0% | |

Table 1 Stachelhaus code. The numbered columns are the positions of the code (as defined by multiple alignment), and each row gives the residues associated with one A-domain specificity [4]. The data consist of 160 A-domains and are adapted from Stachelhaus et al. 1999.
Prediction of adenylation domain specificity
antiSMASH predicts NRPS clusters partially by Hidden Markov Models. It also predicts a consensus product by integrating three different methods for prediction of adenylation domain specificity: NRPSpredictor2, Stachelhaus, and Minowa et al. [7,11]. Stachelhaus et al. aligned the binding pockets of 160 adenylation domains using a structural alignment approach. By trimming the sequences, 10 core amino acids encoding specificity in the binding pocket were identified, giving rise to the Stachelhaus code (Table # MISSING). In addition, it was shown that substrate affinity could be altered or relaxed by modifying the binding pocket in silico [4].
Prediction is sometimes complicated by the fact that adenylation domains vary in how strictly they select their amino acid substrate. In addition, the Stachelhaus code shows some redundancy across the 160 sequences [4]. Tyrocidine is an example of this. Tyrocidine is a commercially available mixture of nonribosomal antibiotics synthesized by Brevibacillus parabrevis. It consists of four decapeptides varying at three amino acid positions (MISSING Table #) and is synthesized by the NRPS Tyrocidine Synthetase A-C, whose three subunits contain 1, 3, and 6 modules, respectively (Figure # MISSING HIGHER UP MICHAEL).
Tyrocidine has a unique mode of action wherein it disrupts the function of the cell membrane. Unfortunately, it is highly toxic towards human blood and reproductive cells and can only be used in topical applications. This makes tyrocidine an interesting target for drug improvement. Under MISSING section you can read more about improvement of tyrocidine.
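To illustrate how a Stachelhaus-style lookup works in principle, the sketch below compares a 10-residue signature against a handful of codes copied from Table 1 and reports the closest match by fraction of identical positions. This is a simplified nearest-match illustration, not the algorithm used by antiSMASH or NRPSpredictor2; the names `SIGNATURES`, `identity`, and `predict` are hypothetical.

```python
# A few specificity codes taken from Table 1 (positions in table order)
SIGNATURES = {
    "Asp": "DLTKVGHIGK",
    "Gln": "DAQDLGVVDK",
    "Orn(1)": "DMENLGLINK",
    "Phe": "DAWTIAAVCK",
    "Pro": "DVQLIAHVVK",
}


def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length codes agree."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict(code: str):
    """Return (specificity, identity) for the best-matching known code."""
    scored = [(identity(code, sig), name) for name, sig in SIGNATURES.items()]
    best_score, best_name = max(scored)
    return best_name, best_score


# A query differing from the Phe code at the last position only
print(predict("DAWTIAAVCA"))
```

A real predictor would use all entries of the table and handle ties and low-identity matches explicitly.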
Tyrocidine | Position 3 | Position 4 | Position 7 |
---|---|---|---|
A | L-Phe | D-Phe | L-Tyr |
B | L-Trp | D-Phe | L-Tyr |
C | L-Trp | D-Trp | L-Tyr |
D | L-Trp | D-Trp | L-Trp |
Potential NRP targets for drug improvement
In addition to tyrocidine, ciclosporin (or cyclosporin) is an important nonribosomal peptide drug, used as an immunosuppressant in transplantations [12]. It consists of eleven amino acids which, as in tyrocidine, are cyclized upon release from the NRPS, and it contains D-amino acids and modified amino acids (Figure #). It was first isolated from the filamentous fungus Tolypocladium inflatum in 1969. In 1972, its function as an immunosuppressant was discovered by Sandoz (now Novartis). Despite its wide use in clinical applications, it is associated with side effects (adverse drug reactions) [12]. In ciclosporin G, which is also isolated from Tolypocladium inflatum, the α-aminobutyric acid residue in position 2 is replaced by norvaline [13]. Ciclosporin G has reduced side effects in some clinical applications, which highlights the possibility of drug improvement by NRPS engineering. A total of 25 derivatives of ciclosporin are known [-1]. Compared to the potential diversity of an 11-mer cyclic peptide (500^11 sequences, assuming the pool of about 500 monomers mentioned earlier), only a very small fraction of the potential compounds is known. Even substitution of a single amino acid would yield more than 5,000 different compounds that could be screened for improved function.
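The numbers in this paragraph can be checked with a short calculation. The sketch assumes a pool of exactly 500 monomers (the text says "more than 500") and counts linear sequences, ignoring the fact that rotations of a cyclic peptide are equivalent; the function names are hypothetical.

```python
def total_sequences(length: int, pool: int) -> int:
    """Number of sequences when every position can hold any monomer."""
    return pool ** length


def single_substitutions(length: int, pool: int) -> int:
    """Number of variants that differ from a reference at exactly one position."""
    return length * (pool - 1)


POOL = 500    # assumed pool size
LENGTH = 11   # ciclosporin is an 11-mer

print(f"{total_sequences(LENGTH, POOL):.2e}")
print(single_substitutions(LENGTH, POOL))
```

With these assumptions, single substitutions give 11 × 499 = 5,489 variants, consistent with the "more than 5,000" quoted above.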
Hypothesis: Targeted engineering of adenylation domains using OGRE
As highlighted above, there are multiple potential NRPS candidates that could be used for drug improvement by screening NRP products synthesized after modifying the Stachelhaus code of an adenylation domain. The limitation has been the limited genetic tools available for engineering of NRPSs. Considering that the Tyrocidine Synthetase is 1.25 × 10^6 Da, which is approximately the size of the large subunit of the prokaryotic ribosome, this is perhaps not surprising. Even though the changes required to potentially alter the specificity of an A-domain are small (~1-10 amino acids according to the Stachelhaus code), transformation requires introduction of selection cassettes. As each NRPS module is ~1,500 amino acids long and multiple modules are often encoded in one open reading frame, even a few modifications require assembly and introduction of large expression cassettes. For example, the NRPS responsible for ciclosporin synthesis in Tolypocladium inflatum is encoded by a single gene, simA, consisting of one 45.8 kb exon [-1]. Amplification of 45.8 kb is not feasible with standard PCR, and the gene would be considerably expensive to synthesize even with cheap, error-prone DNA synthesis methods (calculated based on prices from IDT for gBlock synthesis).
We hypothesized that directed evolution of NRPSs targeting the Stachelhaus code could lead to improvement of NRP drugs. We proposed that the recent advances in oligo-mediated recombineering (OGRE) using short single-stranded DNA (ssDNA) can be applied to generate this diversity. While OGRE has low efficiency, this is actually an advantage for generating libraries, as multiplex and automated targeting with a library of oligos will create a library of different compounds.
Oligo-mediated recombineering
Recombination-mediated genetic engineering, or recombineering (we call it OGRE), utilises homologous recombination to facilitate genetic modifications at any desired target by flanking the mutated sequence with homologous regions. One system for recombineering in E. coli is the λ phage derived λ Red, consisting of the genes encoding three proteins, Gam, Exo and Beta. Gam prevents degradation of linear dsDNA by inhibition of nucleases, Exo degrades dsDNA in a 5'-3' direction yielding ssDNA, and Beta facilitates recombination by binding to the ssDNA [14].
Multiplex Automated Genome Engineering (MAGE)
Wang et al. developed a method for rapid and efficient targeted evolution of cells through cyclical recombineering with ssDNA (oligos) in E. coli. Using this automated method, they improved lycopene production in E. coli more than five-fold in three days [15]. Through MAGE it is possible to simultaneously target many different loci or to target the same locus with a pool of multiple and/or degenerate oligos. By using multiple oligos targeting the same locus, it is possible to generate a library of mutants varying only at the target locus in a short amount of time. The genetic variation in the population of cells will be a function of the complexity of the degenerate pool and the combinatorial arrangements of the modifications at different loci, and can be used to modify e.g. the active site of an enzyme [15,16].
The MAGE protocol utilises the λ Red recombination system in combination with a (temporary) inactivation of the mismatch repair system and consists of seven steps that can be done with standard laboratory equipment [16]. As MAGE utilises oligos, only the Beta protein of the λ Red system is required. When E. coli is targeted with single-stranded oligos, Beta stabilises them inside the cell and facilitates homologous recombination into the genome (Figure #MISSING). Interestingly, this targeting is many-fold more effective if the oligo targets the lagging strand rather than the leading strand [14,15,17,18]. Based on this observation in multiple organisms, it is proposed that Beta binds to the lagging strand of the replication fork and stabilises oligo integration between discontinuous Okazaki fragments [14,18].
Figure #MISSING Proposed mechanism for function of Beta in oligo recombineering during DNA replication. Beta stabilises the oligo, which is incorporated between discontinuous Okazaki fragments on the lagging strand [17]. Adapted from Carr et al. 2012.
MAGE becomes multiplex and automated when the oligo recombineering steps are repeated multiple times. In short, the cells are first grown to mid-log phase, followed by induction of Beta. The cells are then chilled to 4°C and washed in multiple steps to improve their competence. The oligos are added to the cells, which are then electroporated. The cycle is repeated every 2-3 hours (in E. coli), allowing the cells to recover in between rounds of electroporation (Figure #MISSING) [16].
Figure #MISSING Protocol for MAGE induction. Repeated rounds of electroporation with an ssDNA pool, each followed by a recovery period, induction of Beta, and preparation of the cells for the next round of MAGE. Credit: Michael Schantz Klausen.
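The "degenerate pool complexity" mentioned above can be made concrete by expanding the IUPAC ambiguity codes in an oligo. The sketch below counts and enumerates the distinct sequences encoded by a degenerate oligo; the function names are hypothetical, and the NNK codon is only an illustrative choice, not one taken from the cited protocols.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def pool_complexity(oligo: str) -> int:
    """Number of distinct sequences encoded by a degenerate oligo."""
    count = 1
    for base in oligo:
        count *= len(IUPAC[base])
    return count


def expand(oligo: str):
    """List every concrete sequence encoded by a degenerate oligo."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in oligo))]


print(pool_complexity("NNK"))  # one degenerate codon
print(expand("AR"))
```

When several loci are targeted independently, the number of possible genotype combinations grows as the product of the per-locus complexities, which is why a small oligo pool can already yield a large library.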
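Because the cycle is repeated, the fraction of cells carrying a given edit accumulates over time. A minimal sketch of this accumulation is given below. It assumes that every cycle converts the same fraction of the still-unmodified cells and that modified and unmodified cells grow equally well; both are simplifying assumptions rather than results from the cited work, and the 10% per-cycle efficiency is a hypothetical value.

```python
def fraction_modified(efficiency: float, cycles: int) -> float:
    """Fraction of cells carrying the edit after a number of MAGE cycles.

    Assumes a constant per-cycle efficiency acting on unmodified cells only.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return 1.0 - (1.0 - efficiency) ** cycles


# Hypothetical per-cycle efficiency of 10 %
for n in (1, 5, 10, 20):
    print(n, round(fraction_modified(0.10, n), 3))
```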
Chip-based oligonucleotide synthesis
Traditional column-based oligo synthesis is costly for large-scale MAGE experiments. Synthesis of 1,000 90-mer column-based oligos costs about 36,000 USD [19]. An alternative is to synthesize the oligos on microchip DNA arrays, with the advantage that the price scales with the number of chips instead of the number of oligos. Thus it is possible to have up to 12,472 130-mer oligos synthesized for 2,000 USD and up to 92,918 oligos for 5,000 USD (http://customarrayinc.com/oligos_main.htm).
This method, however, comes with a few disadvantages. The oligos are delivered in amounts in the picomolar range and as a single mix of all the oligos. Because of this, an extra processing step is required before they can be used in MAGE experiments.
The oligos are synthesized with two 20-nucleotide flanking sequences (barcodes). These barcodes must include a thymidine immediately upstream and a DpnII restriction site immediately downstream of the oligo, while the rest of each barcode can be designed for amplification with a specific primer. By using different barcodes, it is possible to design one oligo chip for multiple MAGE experiments. The thymidine allows amplification with a uracil-containing primer, in which U is substituted for T. The barcodes can then be excised from the oligo using USER enzyme, DpnII, and a guide primer [19].
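The price difference is easiest to see per oligo. The following worked calculation uses only the figures quoted above; note that the oligo lengths differ between the two methods (90-mers versus 130-mers), and the chip price assumes that the full capacity of the chip is used.

```python
def cost_per_oligo(total_usd: float, n_oligos: int) -> float:
    """Average price of one oligo in USD."""
    return total_usd / n_oligos


column = cost_per_oligo(36_000, 1_000)      # column-based 90-mers
chip_small = cost_per_oligo(2_000, 12_472)  # chip with up to 12,472 oligos
chip_large = cost_per_oligo(5_000, 92_918)  # chip with up to 92,918 oligos

print(f"column-based: {column:.2f} USD per oligo")
print(f"small chip:   {chip_small:.2f} USD per oligo")
print(f"large chip:   {chip_large:.3f} USD per oligo")
```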
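The barcode constraints can be expressed as a small validation step when preparing sequences for a chip order. The sketch below is one possible reading of the description, assuming that the thymidine is the last base of the upstream barcode and that the DpnII site (GATC) forms the first four bases of the downstream barcode. The barcode sequences and the function name `add_barcodes` are hypothetical.

```python
DPNII_SITE = "GATC"  # recognition sequence of DpnII
BARCODE_LENGTH = 20


def add_barcodes(oligo: str, upstream: str, downstream: str) -> str:
    """Flank a MAGE oligo with two barcodes after checking the constraints."""
    if len(upstream) != BARCODE_LENGTH or len(downstream) != BARCODE_LENGTH:
        raise ValueError("barcodes must be 20 nucleotides long")
    if not upstream.endswith("T"):
        raise ValueError("upstream barcode must end with a thymidine")
    if not downstream.startswith(DPNII_SITE):
        raise ValueError("downstream barcode must start with a DpnII site")
    return upstream + oligo + downstream


# Hypothetical barcodes and a placeholder 90-mer
upstream = "ACGTACGTACGTACGTACGT"
downstream = DPNII_SITE + "ACGTACGTACGTACGT"
chip_oligo = add_barcodes("A" * 90, upstream, downstream)
print(len(chip_oligo))
```

Under these assumptions a 90-mer with both barcodes is 130 nucleotides long, which matches the 130-mer chip oligos mentioned earlier.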
Oligo design
While the recombination frequency of MAGE can be increased by doing multiple cycles, care should be taken when designing oligos to ensure high efficiency for each individual cycle. There are several parameters that can be optimized to increase the recombination frequency.
The oligos should target the lagging strand of the replication fork, as this is 10-100 times more efficient than targeting the leading strand. The folding energy of the oligo should also be considered, as the oligo may form hairpins if the folding energy is too low, preventing incorporation.
The frequency is also dependent on the length of the oligo, as shorter oligos are less efficient due to their lower hybridization energy to the chromosome, while longer oligos have a higher tendency to form hairpins. 70-90-mer oligos seem to be the most efficient in E. coli [16].
Designing many optimized oligos for a MAGE experiment is a time-consuming task. A 90-mer oligo with a single mismatch and homology arms of at least 15 nt can be placed in 60 different ways, giving 60 possible oligos with different secondary structures and consequently different recombineering efficiencies. Much of the time spent on designing oligos can be saved by using the online tool MODEST [20].
Oligo-mediated recombineering in Bacillus subtilis
While the λ Red recombineering system described above has been widely exploited in E. coli in recent years, the technology has not been adopted for genome editing to the same extent in other microorganisms. Application of the λ Red recombineering system has been described in Bacillus subtilis (ARTICLE 2012), but with longer oligos of approx. 2,000 nucleotides generated by PCR [21]. Besides Beta, Sun et al. also expressed homologous recombinases from other phages. One of these is the gene product of region 35 (GP35) from the native B. subtilis phage SPP1, a recombinase homologous to Beta. In the same phage, the gene product of region 34 (GP34) is an exonuclease similar to λ Exo, and GP36 is an ssDNA-binding protein [22,23]. This native Bacillus recombineering system yielded higher efficiency than λ Red in B. subtilis, but lower efficiency in E. coli [21]. Based on heterologous expression of many recombinases of different origin in B. subtilis, it was concluded that recombineering efficiency is highest when the recombinase is derived from a phage whose natural host is closely related to the heterologous host. We noticed that the codon usage of λ Beta is not optimal for expression in B. subtilis and that the expression level varied among the recombinases tested in the study (likely due to codon usage). In addition, the length of the oligos tested is drastically different from the optimal length in E. coli with the λ Red recombineering system.
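The count of 60 follows from sliding a 90-nt window over the target so that the mismatch always keeps at least 15 nt of homology on either side (90 - 2 × 15 = 60 placements). The sketch below enumerates these candidates for a single-base change. It is not how MODEST works internally; it ignores strand choice and folding energy, and the function name and toy genome are hypothetical.

```python
def candidate_oligos(genome: str, target: int, new_base: str,
                     length: int = 90, min_arm: int = 15):
    """Enumerate oligos introducing `new_base` at 0-based position `target`.

    `offset` is the position of the mismatch inside the oligo; it is chosen
    so that at least `min_arm` matching bases remain on each side.
    """
    oligos = []
    for offset in range(min_arm, length - min_arm):
        start = target - offset
        if start < 0 or start + length > len(genome):
            continue  # window would run off the sequence
        window = genome[start:start + length]
        oligos.append(window[:offset] + new_base + window[offset + 1:])
    return oligos


# Toy sequence standing in for a genome region
genome = "ACGT" * 100
oligos = candidate_oligos(genome, target=200, new_base="T")
print(len(oligos))
```

Each of these candidates would then be scored, for example by predicted folding energy, before one is selected.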
References
- Walsh, C. T. (2008). The Chemical Versatility of Natural-Product Assembly Lines. Acc. Chem. Res., 41(1), 4–10. doi:10.1021/ar7000414
- Finking, R., & Marahiel, M. A. (2004). Biosynthesis of Nonribosomal Peptides. Annu. Rev. Microbiol., 58(1), 453–488. doi:10.1146/annurev.micro.58.030603.123615
- Strieker, M., Tanović, A., & Marahiel, M. A. (2010). Nonribosomal peptide synthetases: structures and dynamics. Current Opinion in Structural Biology, 20(2), 234–240. doi:10.1016/j.sbi.2010.01.009
- Stachelhaus, T., Mootz, H. D., & Marahiel, M. A. (1999). The specificity-conferring code of adenylation domains in nonribosomal peptide synthetases. Chemistry & Biology, 6(8), 493–505. doi:10.1016/s1074-5521(99)80082-9
- Conti, E. (1997). Structural basis for the activation of phenylalanine in the non-ribosomal biosynthesis of gramicidin S. The EMBO Journal, 16(14), 4174–4183. doi:10.1093/emboj/16.14.4174
- Lautru, S. (2004). Substrate recognition by nonribosomal peptide synthetase multi-enzymes. Microbiology, 150(6), 1629–1636. doi:10.1099/mic.0.26837-0
- Blin, K., Medema, M. H., Kazempour, D., Fischbach, M. A., Breitling, R., Takano, E., & Weber, T. (2013). antiSMASH 2.0--a versatile platform for genome mining of secondary metabolite producers. Nucleic Acids Research, 41(W1), W204–W212. doi:10.1093/nar/gkt449
- Medema, M. H., Blin, K., Cimermancic, P., de Jager, V., Zakrzewski, P., Fischbach, M. A., … Breitling, R. (2011). antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Research, 39(suppl), W339–W346. doi:10.1093/nar/gkr466
- Weber, T., Blin, K., Duddela, S., Krug, D., Kim, H. U., Bruccoleri, R., … Medema, M. H. (2015). antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters. Nucleic Acids Research, 43(W1), W237–W243. doi:10.1093/nar/gkv437
- Sundlov, J. A., Shi, C., Wilson, D. J., Aldrich, C. C., & Gulick, A. M. (2012). Structural and Functional Investigation of the Intermolecular Interaction between NRPS Adenylation and Carrier Protein Domains. Chemistry & Biology, 19(2), 188–198. doi:10.1016/j.chembiol.2011.11.013
- Minowa, Y., Araki, M., & Kanehisa, M. (2007). Comprehensive Analysis of Distinctive Polyketide and Nonribosomal Peptide Structural Motifs Encoded in Microbial Genomes. Journal of Molecular Biology, 368(5), 1500–1517. doi:10.1016/j.jmb.2007.02.099
- Lee, J.-H. (2010). Use of Antioxidants to Prevent Cyclosporine A Toxicity. Toxicological Research, 26(3), 163–170. doi:10.5487/tr.2010.26.3.163
- Calne, R. (1985). Cyclosporin G: Immunosuppressive Effect in Dogs with Renal Allografts. The Lancet, 325(8441), 1342. doi:10.1016/s0140-6736(85)92844-2
- Mosberg, J. A., Lajoie, M. J., & Church, G. M. (2010). Lambda Red Recombineering in Escherichia coli Occurs Through a Fully Single-Stranded Intermediate. Genetics, 186(3), 791–799. doi:10.1534/genetics.110.120782
- Wang, H. H., Isaacs, F. J., Carr, P. A., Sun, Z. Z., Xu, G., Forest, C. R., & Church, G. M. (2009). Programming cells by multiplex genome engineering and accelerated evolution. Nature, 460(7257), 894–898. doi:10.1038/nature08187
- Wang, H. H., & Church, G. M. (2011). Multiplexed Genome Engineering and Genotyping Methods. Synthetic Biology, Part B - Computer Aided Design and DNA Assembly, 409–426. doi:10.1016/b978-0-12-385120-8.00018-8
- Carr, P. A., Wang, H. H., Sterling, B., Isaacs, F. J., Lajoie, M. J., Xu, G., … Jacobson, J. M. (2012). Enhanced multiplex genome engineering through co-operative oligonucleotide co-selection. Nucleic Acids Research, 40(17). doi:10.1093/nar/gks455
- Gallagher, R. R., Li, Z., Lewis, A. O., & Isaacs, F. J. (2014). Rapid editing and evolution of bacterial genomes using libraries of synthetic DNA. Nature Protocols, 9(10), 2301–2316. doi:10.1038/nprot.2014.082
- Bonde, M. T., Kosuri, S., Genee, H. J., Sarup-Lytzen, K., Church, G. M., Sommer, M. O. A., & Wang, H. H. (2015). Direct Mutagenesis of Thousands of Genomic Targets Using Microarray-Derived Oligonucleotides. ACS Synthetic Biology, 4(1), 17–22. doi:10.1021/sb5001565
- Bonde, M. T., Klausen, M. S., Anderson, M. V., Wallin, A. I. N., Wang, H. H., & Sommer, M. O. A. (2014). MODEST: a web-based design tool for oligonucleotide-mediated genome engineering and recombineering. Nucleic Acids Research, 42(W1), W408–W415. doi:10.1093/nar/gku428
- Sun, Z., Deng, A., Hu, T., Wu, J., Sun, Q., Bai, H., … Wen, T. (2015). A high-efficiency recombineering system with PCR-based ssDNA in Bacillus subtilis mediated by the native phage recombinase GP35. Applied Microbiology and Biotechnology, 99(12), 5151–5162. doi:10.1007/s00253-015-6485-5
- Vellani, T. S., & Myers, R. S. (2003). Bacteriophage SPP1 Chu Is an Alkaline Exonuclease in the SynExo Family of Viral Two-Component Recombinases. Journal of Bacteriology, 185(8), 2465–2474. doi:10.1128/jb.185.8.2465-2474.2003
- Seco, E. M., Zinder, J. C., Manhart, C. M., Lo Piano, A., McHenry, C. S., & Ayora, S. (2012). Bacteriophage SPP1 DNA replication strategies promote viral and disable host replication in vitro. Nucleic Acids Research, 41(3), 1711–1721. doi:10.1093/nar/gks1290
Department of Systems Biology
Søltofts Plads 221
2800 Kgs. Lyngby
Denmark
P: +45 45 25 25 25
M: dtu-igem-2015@googlegroups.com