Team:Fudan/Design

Design

Inspiration

Circular RNAs (circRNAs), formed by non-sequential back-splicing of pre-mRNA transcripts, are a widespread form of non-coding RNA in animal cells. The functions of naturally occurring circRNAs revealed in recent years suggest that circRNAs could become powerful tools in research and health care. However, to date there is no toolbox for generating circRNAs, which restricts both circRNA research and its applications. Our project focuses on devices that cyclize a specific part of an RNA, aiming to start a circRNA revolution.

The inspiration for our project comes from recent progress in research on circRNA biogenesis. The most plausible mechanism for circularization is back-splicing (Starke, Stefan, et al. "Exon Circularization Requires Canonical Splice Signals." Cell Reports 56.1 (2014): 103–111.). In the back-splicing mechanism, the branch point (BP) upstream of the circularizing exon uses its 2’-hydroxyl group to attack the downstream 5’ splice site, yielding a branched Y-structure intermediate. In the second step, the newly generated 3’-hydroxyl end of the circularizing exon attacks the upstream 3’ splice site, releasing a circular exon.

Back-splicing depends on the splicing signals and on the two exon ends coming close together in space. Starting from this mechanism, we took several different approaches and designed three types of devices to cyclize RNA: the “Ouroboros” (a cyclizing device based on inverted repeat sequences in the exon-flanking regions), the “Cyclizer” (proteins that accelerate RNA cyclization) and the acRNA (an ssRNA that accelerates RNA cyclization). All of these devices pull the two ends of the circularizing exon together and initiate the splicing process.

One of the most plausible applications for the circRNA generated by our devices is the regulation of oncomiRs. MicroRNAs (miRNAs) are short non-coding RNAs, expressed in different tissues and cell types, that suppress the expression of target genes. Certain miRNAs, called oncomiRs, play a causal role in the onset and maintenance of cancer when overexpressed. (Cheng, Christopher J., et al. "MicroRNA silencing for cancer therapy targeted to the tumour microenvironment." Nature 518 (2014): 107-110.) Because circRNAs are resistant to exonucleases, circRNA sponges are promising tools for regulating oncomiRs.

Apart from serving as an oncomiR sponge, our device can also be applied to circRNA research. We designed experiments that use our device to measure the half-life of a given circRNA, as well as experiments that use it to facilitate the investigation of miRNA functions. To help test our device, we built a luciferase reporter that reports the concentration of mir-21.

Ouroboros

“Ouroboros” is plan A of our cyclizing device. It uses an inverted repeat sequence inserted into the introns flanking the circularizing sequence.

We generated a long inverted repeat sequence of 400 bp and inserted it into the introns flanking the circularizing exon. We chose an inserted sequence with higher binding affinity to increase the cyclization efficiency. The flanking intron is beta-globin intron 1, which can improve the expression level of our device RNA and provides a long linker for the back-splicing mechanism. (Ashwal-Fluss, Reut, et al. "circRNA Biogenesis Competes with Pre-mRNA Splicing." Molecular Cell 56.1 (2014): 55–66.)

The prototype application of “Ouroboros” is to generate circRNA sponges that regulate oncomiRs, which relies on the long half-life of circRNA. To validate this assumption, we designed a circRNA degradation experiment to measure the circRNA half-life. (For the detailed protocol of our experiment, see our notebook.) In this experiment, we use our device to cyclize the RNA, which also simulates the use of “Ouroboros” as a research device.
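As an illustration of how a half-life could be extracted from such a degradation experiment, here is a minimal sketch. It assumes simple first-order decay once production of the RNA has stopped, so that ln(abundance) falls linearly with time and the half-life is ln(2)/k. The function name `fit_half_life` and the time-course values are hypothetical and synthetic; they are not our measurements.

```python
import math

def fit_half_life(times_h, rel_abundance):
    """Fit ln(abundance) = ln(A0) - k*t by ordinary least squares.

    Returns (k, half_life) with k in 1/h and half_life in h.
    Assumes first-order decay (an assumption, not a measured fact).
    """
    logs = [math.log(a) for a in rel_abundance]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx                 # decay constant is minus the slope
    return k, math.log(2) / k

# Synthetic example: values generated from an assumed 24 h half-life.
times = [0, 6, 12, 24, 48]
levels = [0.5 ** (t / 24) for t in times]
k, t_half = fit_half_life(times, levels)
print(f"k = {k:.4f} per hour, half-life = {t_half:.1f} h")
```

With real qPCR data the points will scatter around the line, so the fitted half-life should be reported together with replicate-based error estimates.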

The circularizing exon has 6 mir-21 binding sites, which is reported to be the most efficient number for miRNA sponges. (Ebert, Margaret S, J. R. Neilson, and P. A. Sharp. "MicroRNA sponges: competitive inhibitors of small RNAs in mammalian cells." Nature Methods 4.9 (2007): 721-6.) The circularizing exon is constructed by overlap PCR, and the binding sites are designed to have the highest binding affinity.
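The layout of such a sponge can be sketched as follows. This is an illustration only: it assumes each binding site is the perfect reverse complement of mature hsa-miR-21-5p (the highest-affinity pairing), and the 4-nt spacer `AAUU` is a hypothetical placeholder, not the spacer used in our construct.

```python
MIR21_5P = "UAGCUUAUCAGACUGAUGUUGA"  # mature hsa-miR-21-5p, written 5'->3'
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def build_sponge(mirna, n_sites=6, spacer="AAUU"):
    """Concatenate n_sites perfectly complementary sites separated by a spacer.

    The spacer sequence is a hypothetical placeholder.
    """
    site = reverse_complement(mirna)
    return spacer.join([site] * n_sites)

sponge = build_sponge(MIR21_5P)
print(reverse_complement(MIR21_5P))
print(len(sponge), "nt")
```

Changing `mirna` to another mature miRNA sequence gives the corresponding sponge layout, which is how the same circularizing exon design can be retargeted.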

To build a toolbox, we synthesize the inverted repeat sequence and the circularizing sequence separately and then combine the two parts. In addition to the mir-21 sponge, we also designed a mir-17 sponge circularizing exon. Our toolbox will ultimately be able to regulate different kinds of miRNAs and facilitate research and health care related to miRNA.

To support our testing experiments, we designed a luciferase reporter device. We inserted two mir-21 binding sites into the 3’-UTR of the luciferase gene, so that it reports the mir-21 concentration through the intensity of its luminescence.

Cyclizer

In our plan B, we adopted a more efficient and powerful way to regulate RNA cyclization. Simon J. Conn et al. (Conn, Simon J., et al. "The RNA Binding Protein Quaking Regulates Formation of circRNAs." Cell 160.6 (2015): 1125–1134.) show that QKI plays a vital role in determining cyclization. This protein binds sequence-specifically to the flanking regions of an RNA (exon) to be cyclized (we simply call it pre-circle RNA), then pulls both ends of the pre-circle RNA close to each other and initiates the back-splicing process. Interestingly, no evidence to date shows that QKI plays any role in recruiting downstream proteins, and all clues indicate that it only pulls the exon ends close to each other. In addition, cyclization can be induced by inserting inverted repeat sequences into the flanking regions at both ends of the pre-circle RNA, where they form a stem-loop structure. We therefore boldly propose that once both ends of a specific RNA (close enough to the SA/SD region of the exon to be cyclized) are stably pulled close to each other, the spliceosome will start the splicing process that forms circle RNA. This hypothesis is also strongly supported by our data from plan A. It follows that we could regulate the cyclization process and cyclize any target RNA by engineering an RNA-binding protein.

Our aim is to design a device that can cyclize any RNA of interest (these RNAs must have specific SA/SD sequences at both ends). This tool should have two features: 1) it can pull both ends close to each other; 2) it can bind a specific ssRNA sequence and can be easily reprogrammed.

For the first point, we checked the crystal structure of QKI with its RNA substrate, which shows that QKI pulls both ends of the RNA together through homodimerization. The two ends are held about 50 Å apart in an antiparallel orientation, and this spatial arrangement is quite stable thanks to rigid dimerization interactions. Since we cannot be fully certain that bringing the ends close together is all that cyclization requires, the best we can do is imitate QKI as closely as possible. The protein we design should therefore not hinder downstream splicing factors from binding to the RNA. To achieve a stable, antiparallel and distance-fixed spatial positioning of both RNA ends, we drew up the “3C” strategy: “Cyclize Circle RNA by Circle protein”. In detail, we use two RNA-binding proteins to bind the two ends of the RNA respectively. The two proteins are fused end-to-end with proper linkers, which forms a circular conformation in which the two proteins are naturally positioned in an antiparallel orientation.
In turn, the two ends of the RNA become antiparallel. However, this strategy partly neglects the rigid separation of the ends: even rigid linkers cannot guarantee a separation as rigid and stable as that achieved by homodimerization of protein backbones. The best way to guarantee this separation would be to design a rigid scaffold linking the two protein components by computer-aided de novo protein design. But weighing the cost (especially the success rate) against the reward, we judged the latter approach unnecessary.

For the second point, we found the RNA-binding protein PUF (pumilio), which binds its target RNA motif sequence-specifically with a Kd in the pM to nM range and can be easily reprogrammed to bind different targets. Its RNA-binding domain (which we will simply call PUF in the following) is composed of eight tandemly repeated subdomains arranged in a crescent structure (Wang, Xiaoqiang, et al. "Modular recognition of RNA by a human pumilio-homology domain." Cell 110.4 (2002): 501-512.). Each subdomain independently binds one specific nucleotide, so we can easily change the binding motif by changing the corresponding subdomain. In fact, the sequence specificity depends on only two core amino acids in each subdomain. Aleksandra Filipovska et al. (A universal code for RNA recognition by PUF proteins) have already created a recognition code table [table1]. Here we give a mutation helper to help anyone interested generate their own PUF. Ideally, one could create a PUF protein that specifically recognizes a chosen 8-nt RNA motif by mutating the pair of core sites in every subdomain (16 sites in total). However, some reports show that not all reprogrammed mutants work as expected (e.g. failure to express, low binding affinity) (Cheong, C. G., and T. M. Hall. "Engineering RNA sequence specificity of Pumilio repeats." Proceedings of the National Academy of Sciences of the United States of America 103.37 (2006): 13635-13639.). In conclusion, this modular and programmable protein meets our needs well, and we believe such problems can be solved by trying different recognition motifs.
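The reprogramming logic described above can be sketched in a few lines. This is a minimal illustration, not our mutation helper itself. It assumes the commonly cited two-residue code (A: Cys/Gln, with Ser/Gln also reported; U: Asn/Gln; G: Ser/Glu; C: Ser/Arg), the wild-type target motif UGUAUAUA, and antiparallel binding in which repeat 8 reads the first (5’) nucleotide and repeat 1 reads the last. These assumptions should be checked against [table1] before use; the names `PUF_CODE` and `plan_mutations` are hypothetical.

```python
# Assumed recognition code: base -> the two core residues of a repeat.
PUF_CODE = {
    "A": ("C", "Q"),  # Ser/Gln has also been reported for adenine
    "U": ("N", "Q"),
    "G": ("S", "E"),
    "C": ("S", "R"),
}
WILD_TYPE_MOTIF = "UGUAUAUA"  # assumed wild-type target, written 5'->3'

def plan_mutations(target, reference=WILD_TYPE_MOTIF):
    """List the repeats whose core residues must change to bind `target`.

    Returns tuples (repeat_number, old_base, new_base, core_residues),
    sorted by repeat number. Nucleotide i (1-based, 5'->3') is assumed
    to be read by repeat 9 - i.
    """
    if len(target) != 8 or set(target) - set(PUF_CODE):
        raise ValueError("target must be 8 nt of A, U, G or C")
    plan = []
    for pos, (new, old) in enumerate(zip(target, reference), start=1):
        if new != old:
            plan.append((9 - pos, old, new, PUF_CODE[new]))
    return sorted(plan)

for repeat, old, new, residues in plan_mutations("UGUACAUA"):
    print(f"repeat {repeat}: {old} -> {new}, core residues {residues}")
```

Comparing targets base by base, rather than residue by residue, keeps the sketch independent of which of the two adenine-binding variants a given wild-type repeat happens to carry.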

Toolbox

In addition to the “3C” strategy, we ambitiously plan to create a full-scale circle RNA toolbox, which includes a low-efficiency but very convenient cyclization tool, a stable and high-efficiency tool, and an easily controllable and efficient tool. We therefore extend our plan B into three parts.

First, we want to regulate cyclization with acRNA (auxiliary circularization RNA). As shown above, constructing a brand-new specific PUF requires mutating 16 amino acid sites, which costs either a lot of time (through site-directed mutagenesis of the PUF gene) or a lot of money (through sequence synthesis). A much more convenient way to test cyclization quickly would be valuable, so we designed the acRNA for this purpose. This RNA is a short (~180 nt) linear RNA composed of two binding regions and a short linker. The principle by which it starts cyclization is the same as in the “3C” strategy. The two binding regions on the acRNA bind the two ends of the pre-circle RNA through complementary base-pairing, and the acRNA then pulls the two flanking regions of the exon to be cyclized close to each other, forming a ‘U’ structure.
However, the amount of RNA in a cell is far lower than that of protein: even with the most efficient promoter, the number of acRNA molecules would not greatly exceed that of the pre-circle RNA. Most importantly, the total copy number of any specific RNA is quite small (Schwanhäusser, Björn, et al. "Global quantification of mammalian gene expression control." Nature 473.7347 (2011): 337-42.), which makes it hard for the acRNA and the pre-circle RNA to encounter each other in the cell nucleus. The acRNA therefore cannot pull the ends together as efficiently as the designed proteins, which will greatly lower the cyclization efficiency. But if this strategy works, its great convenience will make up for these shortcomings.
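To make the acRNA layout concrete, the sketch below assembles one from two target regions in the flanking introns. It is only an illustration under stated assumptions: the arm order (5’ arm pairing with the downstream flank, 3’ arm with the upstream flank, so that the exon loops out between them), the linker, and the toy target sequences are all hypothetical, and real arms would be much longer to reach the ~180 nt total.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def design_acrna(upstream_target, downstream_target, linker):
    """Build an acRNA as 5'-[arm for downstream]-[linker]-[arm for upstream]-3'.

    Each arm is the reverse complement of its target region, so it can
    base-pair with that region in the usual antiparallel orientation.
    The arm order is an assumption made for this sketch.
    """
    arm_5p = reverse_complement(downstream_target)
    arm_3p = reverse_complement(upstream_target)
    return arm_5p + linker + arm_3p

# Toy sequences, far shorter than real binding regions.
acrna = design_acrna("GGGAAA", "CCCUUU", linker="AA")
print(acrna, len(acrna), "nt")
```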

For the second tool, we named the protein we designed “CyCli2er”. Following the 3C strategy, we first fuse two PUF domains (human pumilio1, construct 828-1179) with a carefully selected linker. Because our linker design is empirical, we propose two different kinds of linker, rigid and flexible, each with several specific sequences. We combine widely used artificial flexible/rigid linkers with natural linkers from a linker database (See: http://www.ibi.vu.nl/programs/linkerdbwww/) and adjust the total length to 40-50 Å. A canonical flexible linker contains two hydrophilic regions with an SV40 NLS inserted between them. The hydrophilic sequences we used include GS linkers and natural linkers selected from the linker database by length and secondary structure. To achieve a more rigid separation and to aid the respective folding of the two tandem PUF domains, we further designed several rigid linkers. Like the flexible linkers, these contain three parts: two helical regions and an SV40 NLS. The helical regions are predicted to form an α-helix, with lengths adding up to 40-50 Å (including the SV40 NLS). In addition, we placed some small hydrophilic residues such as serine or glycine at each ‘seam’ between the parts to provide some flexibility, which is important for overcoming steric hindrance when a PUF domain forms a right angle with the linker.
At this point we have only a tandem fusion protein, not yet a circular protein with a double linker. Protein cyclization is achieved with an intein, a self-splicing protein. We fuse InteinC and InteinN of Npu DnaE (See: http://parts.igem.org/Part:BBa_K1362000) to the N-terminus and C-terminus of the fusion protein described above, respectively. These fusions are also joined through carefully designed linkers, which guarantee correct folding of the intein and form the second linker once the fusion protein is cyclized. The principles for designing this linker are the same as for the first one, as shown in the picture. Compared with a linear protein, the double linker in such a circular protein restricts the orientation of the flanking regions of the pre-circle RNA bound to PUF, which in turn restricts the thermal motion of the regions flanking the SA/SD sequences and keeps them in stable proximity.
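The 40-50 Å linker target described above can be translated into an approximate residue count with simple arithmetic. The sketch below assumes a rise of about 1.5 Å per residue for an ideal α-helix and about 3.5 Å per residue for a nearly fully extended chain. The second figure is an upper bound, since a flexible linker is on average more compact than its contour length, so real flexible linkers may need more residues. The constant and function names are hypothetical.

```python
import math

HELIX_RISE_A = 1.5     # approximate rise per residue of an ideal alpha-helix
EXTENDED_RISE_A = 3.5  # assumed per-residue span of a nearly extended chain

def residues_for_span(target_angstrom, rise_per_residue):
    """Smallest residue count whose idealized length reaches the target."""
    return math.ceil(target_angstrom / rise_per_residue)

for label, rise in (("rigid helical", HELIX_RISE_A),
                    ("flexible, fully extended", EXTENDED_RISE_A)):
    low = residues_for_span(40, rise)
    high = residues_for_span(50, rise)
    print(f"{label}: about {low}-{high} residues for 40-50 angstrom")
```

These counts cover the whole linker, so the residues of the SV40 NLS and of the flexible ‘seams’ have to be budgeted within them.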

Although the protein designed by the “3C” strategy looks promising, an evident shortcoming is that once it is expressed in a cell, it drives cyclization all the time. But circle RNA is not like a protein or a plasmid: we do not need to accumulate it in the cell in order to purify it. It is much more like miRNA or other molecular switches. What we want is to use it to regulate metabolic activities, as in our plan A. It is therefore vital to be able to turn the switch off as well as on (generating circle RNA), and to do so whenever we want. To accomplish this, we designed our third tool, the LyCli2er. LyCli2er is made by fusing the two PUF domains to a pair of photoactivatable proteins that heterodimerize when exposed to blue light. These engineered photoswitches, reported by Kawano et al., are named Magnets and are composed of two slightly different proteins, pMag and nMagHigh1. The two proteins dimerize with each other when exposed to blue light (about 470 nm) and dissociate when the light is turned off. We therefore fused pMag to the N-terminus of one PUF and nMagHigh1 to the C-terminus of the other. If the cells grow in blue light, the two PUFs dimerize through the Magnets, which in turn pulls the two ends of the pre-circle RNA close together and launches cyclization.
To stop production of the circRNA, or to prevent it from appearing at all, one simply grows the cells in the dark. Although controllable cyclization is exciting, whether it works and how efficient it will be are unknown. We cannot predict the effect of pulling both ends close while ignoring their orientation and stability. However, we strongly believe that this will only lower the efficiency: unstable pulling reduces the chance of starting the downstream splicing pathway, but circle RNA could still be produced.