Team:Heidelberg/software/jaws
Abstract
Aptazymes, i.e. fusions of a catalytic ribozyme or DNAzyme with one or several aptamers, make construction of nucleic acids-based circuits responding to external stimuli possible, as their catalytic activity can be activated in either the presence or absence of the cognate ligand. The design of aptazymes has been hampered by the need for communication modules, which translate the binding of a ligand to the aptamer into a change of function in the catalytic nucleic acid. Standard procedure herefore is selection on the one hand and rational design on the other, resulting in tedious work which is not certain to succeed. To this day, only few attempts have been made to automatize the design process, which all target a specifically designed chassis ribozyme. Here we describe a software "JAWS" (Joining Aptamers Without Selection) using a criterion for computational selection of communication modules based upon the partition function, as well as energetic and entropic data. This is then used in a random search-like algorithm to extract optimal communication modules without the need for selection.
Introduction
Designing communication modules for controlling functional nucleic acids usually requires either the use of known modules
Algorithm
To generalize the design process of aptazymes we consider an aptamer inserted into a stem of the ribozyme of interest, adding a region of randomized nucleotides functioning as communication module. We constrain the active state of such an aptazyme to be such, that the aptamer is not bound by the rest of the structure, and all randomized nucleotides form a symmetric stem consistent with the consensus secondary structure of the native ribozyme. We constrain an inactive state to be such, that a part of the aptamer with a length of $N$ nucleotides forms a stem with the randomized nucleotides, and a portion of randomized nucleotides with length $N$. Hereafter, we shall call $N$ the shift of the communication module. By defining our communication module by a consensus structure, a length and a shift, we specify a constraint, checking for the existence of the active state consensus structure as well as the inactive structure, with a tolerance of $M$ wrong base pairs. We do so by inspection of the matrix of all base pair probabilities $P_{ij}$, yielded from the partition function $Z$ calculated for our sequence. We then discard all sequences not satisfying this constraint. Of the remaining sequences, only those that satisfy the constraint of the ensemble free energy difference between active and inactive state being less then the aptamer's interaction energy $\Delta{G_{Apt}}$ are retained. Using this approach, we arrive at functional communication modules for nucleic acids completely unrelated to the hammerhead ribozyme. In detail, the algorithm proceeds as follows:
Secondary structure prediction is performed using a partition function based approach using ViennaRNA
- Given the position of a region $\mathcal{R}\subset{Sequence}$ comprising $2\cdot{N}+A$ bases, which should surround the aptamer $\mathcal{A}$ with length $A$, construct an active conformation. Henceforth, we shall refer to $N$ as the stem length and to the active conformation as conformation $\mathcal{I}$. This conformation is such, that the aptamer does not participate in base-pairing interactions with the rest of the functional nucleic acid. Furthermore, all $2\cdot{N}$ nucleotides of the region $\mathcal{R}\diagdown{\mathcal{A}}$ should participate in base pairing interactions, forming a stem of length $N$.
- Given $R$, $N$ and a shift $S$, construct an inactive conformation, consisting of the first $N$ bases in $R$ pairing with the bases $N+A-S$ to $2\cdot{N}+A-S$. This conformation has a stem displaced by $S$ nucleotides from the active conformation, disturbing ribozyme activity. The inactive conformation shall be refered to as conformation $\mathcal{II}$.
- Generate a sequence $\mathcal{S}$, with all nucleotides in $\mathcal{R}\diagdown\mathcal{A}$ randomised.
- Compute the partition function $Z$ of $\mathcal{S}$ and the base pair probability matrix $P_{ij}$ at $\theta{ = 37°C}$. Check, if base pairs conforming to conformation $\mathcal{I}$ exist. This is true iff $P_{ij}\gt{P_{threshold}} \forall{(i,j)\in{\mathcal{I}}}$. If it is true, accept the sequence and move on to the next step. Else, return to step 3.
- Using the base pair probability matrix $P_{ij}$, check if base pairs conforming to $\mathcal{II}$ exits. This is true iff $P_{ij}\gt{P_{threshold}} \forall{(i,j)\in{\mathcal{II}}}$. Requiring simultaneous conformity to $\mathcal{I}$ and $\mathcal{II}$ ensures that the structure is bistable, with local minima of free energy coinciding with the active and inactive structures, respectively. If this is true, accept this sequence, otherwise discard this sequence and return to step 3.
- Compute the ensemble free energies $\Delta{G_{\mathcal{I}}}, \Delta{G_{\mathcal{II}}}$ for the conformations $\mathcal{I}$ and $\mathcal{II}$ respectively. Check if $\Delta{G_{\mathcal{I}}}\lt{E_{threshold}} \wedge \Delta{G_{\mathcal{II}}}\lt{E_{threshold}}$. This ensures that the secondary structures are stable and do not unravel easily at 37°C.
- Compute the difference in ensemble free energies $\Delta{G} = \Delta{G_{\mathcal{I}}} - \Delta{G_{\mathcal{II}}}$ and constrain it to be smaller in its absolute value than a threshold $\Delta{E_{threshold}}$. This ensures that the bistability of the structure is given, and the structure switches conformation upon ligand binding.
- If all of the above apply, accept the sequence as a candidate sequence for the aptazyme. Else return to step 3.
In terms of the optimisation of $N$ and $S$ it is sufficient to run one simulation with relaxed constraints and evaluate the two dimensional histogram of points $(S,N)$ to find the region in $S,N$ space with the highest amount of candidates generated. This should correspond to the optimal stem length and shift for the aptamer ribozyme combination to be considered.
Results
Our implementation of the algorithm, termed JAWS (Joining Aptamers Without SELEX), was done in Python using the ViennaRNA library and its Python bindings. JAWS, as well as its documentation, is freely available as Open Source on our GitHub repository. We used JAWS to create functional DNA and RNA constructs exhibiting switching behavior. Thus, we constructed variants of the SpinachII fluorescent aptamer, which were shown to be switchable in real time by the presence or absence of ATP to a fluorescent and non-fluorescent state, respectively, using an aptamer from
Discussion
Following the results from the switchable HRP assays, we implemented an additional constraint regulating the stem strength of our candidates. Namely, at each step enforce that $S[P_{ij}]=\Sigma_{ij}P_{ij}\cdot{log({P_{ij}}\cdot{L})}$ be inside an interval of entropies$[S_{0},S_{1}]$. This ensures at least in silico, that the DNAzymes generated possess a certain stem strength, and thus a specified dynamic range and kinetic behaviour. Furthermore, the process of generating ligand gated, bistable structures is easily expandable to switching structures which are not simple stems. This is already implemented in our software, as the constraint regarding the existence of basepairs conforming to a stem can actually process any given secondary structure. Even so, we are currently limited to simulating DNAs and RNAs without any consideration of pseudoknots. This would be crucial for even more efficient aptazyme design, as many ribozymes feature extensive tertiary interactions to promote their activity. Again, this could be bypassed by adding additional terms to the partition function $Z$, corresponding to inter-loop interactions, which would allow for additional constraints stabilizing or destabilizing tertiary interactions in the presence or absence of ligands. That said, we expanded upon existing computational concepts, generalizing them to all functional nucleic acids and developed an extensible, modular, and universal constraint based system, allowing for the design of tailor-made ligand gated nucleic acids, fit to any specific problem.