(11 intermediate revisions by 3 users not shown) | |||
Line 82: | Line 82: | ||
# CascAID V1.0 # | # CascAID V1.0 # | ||
# # | # # | ||
− | # | + | # Thu Nov 2 04:23:54 2017 # |
# # | # # | ||
# IGEM Munich 2017 # | # IGEM Munich 2017 # | ||
Line 97: | Line 97: | ||
CascAID is a potentially universal tool for nucleic acid detection. | CascAID is a potentially universal tool for nucleic acid detection. | ||
Fast adaptation of our platform to new targets requires <i>in silico</i> verification of the crRNA design. | Fast adaptation of our platform to new targets requires <i>in silico</i> verification of the crRNA design. | ||
− | Crucial factors for the development of these crRNA designs are the binding of the crRNA to Cas13a | + | Crucial factors for the development of these crRNA designs are the binding of the crRNA to Cas13a, which is |
− | mainly determined by its secondary structure and the uniqueness of the targeting sequence in the transcriptome | + | mainly determined by its secondary structure, and the uniqueness of the targeting sequence in the transcriptome (to rule out false positive results). To ensure the integrity of the Cas13a-crRNA complex, we developed |
− | to rule out false positive results. To ensure the integrity of the Cas13a-crRNA complex, we developed | + | a python script that uses the established program packages for secondary structures, NUPACK and Mfold. |
− | a python script that uses the established program packages for secondary structures NUPACK and Mfold. | + | |
In order to verify the specificity of the targeting sequence, we used the BLASTN-short program to | In order to verify the specificity of the targeting sequence, we used the BLASTN-short program to | ||
check for similar structures in a transcriptome databank. Additionally, we created a database of crRNA designs | check for similar structures in a transcriptome databank. Additionally, we created a database of crRNA designs | ||
Line 106: | Line 105: | ||
as extensive as possible given the limited time, checking for collaboration with other teams working with Cas13a, | as extensive as possible given the limited time, checking for collaboration with other teams working with Cas13a, | ||
mainly TU Delft. | mainly TU Delft. | ||
− | The second branch of software | + | The second branch of software we developed is needed for hardware control in our project. |
It allows users' devices such as computers and smartphones to control | It allows users' devices such as computers and smartphones to control | ||
− | our hardware | + | our hardware, Heatbringer and Lightbringer. |
The repository to our software can be found <a class="myLink" href="https://github.com/igemsoftware2017/igem_munich_2017">here</a>. | The repository to our software can be found <a class="myLink" href="https://github.com/igemsoftware2017/igem_munich_2017">here</a>. | ||
</p> | </p> | ||
Line 124: | Line 123: | ||
There are two main problems regarding the design of crRNA for a diagnostic test. | There are two main problems regarding the design of crRNA for a diagnostic test. | ||
First, the secondary structure of the crRNA needed for Cas13a activity needs to be verified. | First, the secondary structure of the crRNA needed for Cas13a activity needs to be verified. | ||
− | + | Secondly, the sequence targeted by the crRNA has to be specific, i.e. there must be no identical sequence in the | |
− | reference transcriptome of a healthy patient. Otherwise off-target effects will lead to | + | reference transcriptome of a healthy patient. Otherwise, off-target effects will lead to | ||
false positive results since Cas13a is activated even though the pathogen is not present. | false positive results since Cas13a is activated even though the pathogen is not present. | ||
To address these issues, we developed software relying on bioinformatic principles such as | To address these issues, we developed software relying on bioinformatic principles such as | ||
− | secondary structure prediction and Basic Local Alignment | + | secondary structure prediction and Basic Local Alignment Search Tool (BLAST). |
</p> | </p> | ||
</td> | </td> | ||
Line 147: | Line 146: | ||
crystallography data of crRNA in complex with Cas13a, or from structure prediction data of experimentally | crystallography data of crRNA in complex with Cas13a, or from structure prediction data of experimentally | ||
tested crRNAs. Using secondary structure verification we were able to rule out misfolding crRNA | tested crRNAs. Using secondary structure verification we were able to rule out misfolding crRNA | ||
− | designs prior to experiment. We developed a script for the end user | + | designs prior to experiment. We developed a script for the end user automating this procedure. |
</p> | </p> | ||
</td> | </td> | ||
Line 156: | Line 155: | ||
<br> | <br> | ||
<p> | <p> | ||
− | NUPACK is an RNA Secondary Structure Prediction | + | NUPACK is an RNA Secondary Structure Prediction package developed | ||
by several contributors under the guidance of Prof. Niles A. Pierce at the California Institute of Technology (Caltech). | by several contributors under the guidance of Prof. Niles A. Pierce at the California Institute of Technology (Caltech). | ||
The source-code is available free-of-charge for academic usage. | The source-code is available free-of-charge for academic usage. | ||
NUPACK allows the analysis of the partition function, the minimum free energy and the equilibrium base-pairing | NUPACK allows the analysis of the partition function, the minimum free energy and the equilibrium base-pairing | ||
probabilities of an RNA sequence. | probabilities of an RNA sequence. | ||
− | For offline usage we implemented NUPACK locally. We proceeded to implement Mfold as a webserver request. | + | For offline usage, we implemented NUPACK locally. We proceeded to implement Mfold as a webserver request. |
This decision was made because we observed that, in certain cases, only one of the program packages | This decision was made because we observed that, in certain cases, only one of the program packages | ||
was able to predict the secondary structure of crRNA as described in previous papers, predominantly the paper of Liu et al. published in <i>Cell</i> in 2017 | was able to predict the secondary structure of crRNA as described in previous papers, predominantly the paper of Liu et al. published in <i>Cell</i> in 2017 | ||
Line 170: | Line 169: | ||
Furthermore, we observed that NUPACK sometimes predicts the right secondary structure, but it does not represent | Furthermore, we observed that NUPACK sometimes predicts the right secondary structure, but it does not represent | ||
the most stable structure. With NUPACK's subopt, it is possible to predict more than just | the most stable structure. With NUPACK's subopt, it is possible to predict more than just | ||
− | the most stable structure. This enables looking at less stable structures | + | the most stable structure. This enables looking at less stable structures which might be more favourable when bound to the protein and comparing these to the |
− | + | ||
structure databank. The output of a suboptimal prediction | structure databank. The output of a suboptimal prediction | ||
− | is given | + | is given below as the second example. Explanations are included as comments after '#': |
Line 184: | Line 182: | ||
24 49 # form basepairs | 24 49 # form basepairs | ||
25 48 # this would mean base 22 | 25 48 # this would mean base 22 | ||
− | 26 47 # pairs with base | + | 26 47 # pairs with base 51 |
27 46 | 27 46 | ||
28 45 | 28 45 | ||
Line 228: | Line 226: | ||
<p> | <p> | ||
Mfold is a webserver for RNA secondary structure prediction developed by Michael Zuker based on his paper | Mfold is a webserver for RNA secondary structure prediction developed by Michael Zuker based on his paper | ||
− | "Mfold web server for nucleic acid folding and hybridization prediction" that published in <i>Nucleic Acids Research</i> | + | "Mfold web server for nucleic acid folding and hybridization prediction" that was published in <i>Nucleic Acids Research</i> |
in 2003. Since Mfold is not available as a locally buildable binary for every operating system, we developed a | in 2003. Since Mfold is not available as a locally buildable binary for every operating system, we developed a | ||
script that automatically requests a standardised RNA Fold job from the server, therefore making it available | script that automatically requests a standardised RNA Fold job from the server, therefore making it available | ||
throughout all operating systems. Using the result obtained from this request, the secondary structure is | throughout all operating systems. Using the result obtained from this request, the secondary structure is | ||
checked via a string comparison in so-called "Vienna" notation. This notation gives base pairing as a string | checked via a string comparison in so-called "Vienna" notation. This notation gives base pairing as a string | ||
− | of dots and brackets where a dot represents a non-bonded base and brackets | + | of dots and brackets where a dot represents a non-bonded base and brackets represent paired bases, with | ||
− | an opening bracket "(" at the 5'-end | + | an opening bracket "(" at the 5'-end and a closing bracket ")" at the 3'-end of each paired sequence. An example of the output | ||
of the program is given below: | of the program is given below: | ||
<pre style="text-align: left;"> | <pre style="text-align: left;"> | ||
Line 289: | Line 287: | ||
</p> | </p> | ||
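<p>
The check in "Vienna" notation described above can be sketched in Python as follows. This is only a minimal illustration, not the script from our repository: the function names, the example motif and the example prediction are hypothetical, and the comparison is assumed to be a plain substring test. The helper <code>dot_bracket_to_pairs</code> additionally converts a dot-bracket string into a list of paired positions, in the same "i j" form as the NUPACK pair list shown earlier.
</p>

```python
def dot_bracket_to_pairs(structure):
    """Convert a dot-bracket string into a sorted list of 1-based (i, j) base pairs."""
    open_positions = []  # stack of positions of unmatched "(" symbols
    pairs = []
    for position, symbol in enumerate(structure, start=1):
        if symbol == "(":
            open_positions.append(position)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {position}")
            # The most recent unmatched "(" pairs with this ")".
            pairs.append((open_positions.pop(), position))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {position}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


def has_expected_hairpin(predicted, expected_motif):
    """Assumed check: the expected motif occurs as a substring of the prediction."""
    return expected_motif in predicted


# Hypothetical motif and prediction, chosen only for illustration.
EXPECTED_MOTIF = "(((((....)))))"
PREDICTED = "...(((((....)))))......"

if __name__ == "__main__":
    print(has_expected_hairpin(PREDICTED, EXPECTED_MOTIF))
    print(dot_bracket_to_pairs(PREDICTED))
```

<p>
A substring test of this kind is simple, but it also accepts a longer stem that merely contains the motif. Comparing the pair lists instead would be a stricter alternative.
</p>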
<p> | <p> | ||
− | This is also a good example to show that | + | This is also a good example to show that one program may recognize the | ||
crRNA secondary structure while the other does not. In this case, NUPACK has predicted the structure | crRNA secondary structure while the other does not. In this case, NUPACK has predicted the structure | ||
while Mfold is not able to predict the structure. Even though this is an experimental construct | while Mfold is not able to predict the structure. Even though this is an experimental construct | ||
Line 308: | Line 306: | ||
In order to rule out off-target effects for the designed crRNA in diagnostic applications, | In order to rule out off-target effects for the designed crRNA in diagnostic applications, | ||
we developed a script that is able to BLAST the sequence either against whole databases | we developed a script that is able to BLAST the sequence either against whole databases | ||
− | online or a | + | online or a custom database we compiled. This database contains the human transcriptome and those of bacteria common in the human nasal tract as well as model organisms used in our project: | ||
− | + | ||
<ol style="list-style-type:disc; list-style-position:left; text-align: left;"> | <ol style="list-style-type:disc; list-style-position:left; text-align: left;"> | ||
<li>Homo Sapiens</li> | <li>Homo Sapiens</li> | ||
Line 321: | Line 318: | ||
</p> | </p> | ||
<p> | <p> | ||
− | Transcriptomes that | + | Transcriptomes that are common in the nasal tract but were not available are, |
− | + | among others: | |
</p> | </p> | ||
<ol style="list-style-type:disc; list-style-position:left; text-align: left;"> | <ol style="list-style-type:disc; list-style-position:left; text-align: left;"> | ||
Line 331: | Line 328: | ||
<br> | <br> | ||
<p> | <p> | ||
− | All data was retrieved | + | All data was retrieved from the Transcriptome Release #90 of the ENSEMBL project. The output is generated | ||
− | from the output of a blastn-short run and consists of all sequences that show sequence identity of 18 bp or | + | from the output of a blastn-short run and, in the example below, consists of all sequences that show sequence identity of 18 bp or | ||
− | higher. | + | higher. For an actual run, the identity would need to be 26 bp or higher in order to actually show off-target effects since Cas13a is |
− | + | selective up to 2 point mutations regarding the binding of crRNA and subsequent RNase activity. | |
− | selective up to 2 point mutations regarding the binding of crRNA and subsequent | + | The expectation value here describes the number of hits one can expect to find |
− | The expectation value here describes the | + | in a random database the same size as the database used for the blastn-short run. |
− | in a random database the size | + | |
</p> | </p> | ||
</p> | </p> | ||
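<p>
The identity threshold described above can be illustrated with a short Python sketch. It assumes that the blastn-short hits are available in BLAST's default tabular format (<code>-outfmt 6</code>: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). Whether our script uses this format is not stated here, so this is an assumption, and the names <code>Hit</code> and <code>filter_hits</code> as well as the example rows are hypothetical. The number of identical bases is estimated from the percent identity and the alignment length.
</p>

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject: str          # identifier of the matching transcript
    identical_bases: int  # estimated number of identical positions
    evalue: float         # expectation value reported by BLAST


def filter_hits(tabular_text, min_identical=26):
    """Keep hits with at least `min_identical` identical bases.

    Assumes tab-separated BLAST output in the default -outfmt 6 column order.
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        percent_identity = float(fields[2])
        alignment_length = int(fields[3])
        identical = round(percent_identity * alignment_length / 100)
        if identical >= min_identical:
            hits.append(Hit(fields[1], identical, float(fields[10])))
    return hits


# Made-up rows for illustration only; they are not results from our database.
EXAMPLE = "\n".join([
    "crRNA\ttx1\t100.000\t28\t0\t0\t1\t28\t100\t127\t1e-08\t56.5",
    "crRNA\ttx2\t100.000\t18\t0\t0\t1\t18\t5\t22\t0.5\t36.2",
    "crRNA\ttx3\t92.857\t28\t2\t0\t1\t28\t40\t67\t1e-04\t40.1",
])

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE):
        print(hit)
```

<p>
Lowering <code>min_identical</code> to 18 corresponds to the more permissive example output, while the default of 26 corresponds to the threshold given above for an actual run.
</p>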
Line 405: | Line 401: | ||
<p> | <p> | ||
The database program gives you an interface to interact with the MySQL database created for | The database program gives you an interface to interact with the MySQL database created for | ||
− | crRNAs that have been shown | + | crRNAs that have been shown to work experimentally. | ||
</p> | </p> | ||
<pre style="text-align: left;"> | <pre style="text-align: left;"> | ||
Line 468: | Line 464: | ||
<p> | <p> | ||
<ol style="text-align: left"> | <ol style="text-align: left"> | ||
− | <li id="ref_1"> | + | <li id="ref_1">R. M. Dirks, J. S. Bois, J. M. Schaeffer, E. Winfree, and N. A. Pierce. | ||
"Thermodynamic analysis of interacting nucleic acid strands."(2007) <i>SIAM Rev</i>, 49:65-88.</li> | "Thermodynamic analysis of interacting nucleic acid strands."(2007) <i>SIAM Rev</i>, 49:65-88.</li> | ||
<li id="ref_2">R. M. Dirks and N. A. Pierce. "An algorithm for computing nucleic acid base-pairing probabilities including pseudoknots." | <li id="ref_2">R. M. Dirks and N. A. Pierce. "An algorithm for computing nucleic acid base-pairing probabilities including pseudoknots." | ||
Line 474: | Line 470: | ||
<li id="ref_3">R. M. Dirks and N. A. Pierce. "A partition function algorithm for nucleic acid secondary structure including pseudoknots." | <li id="ref_3">R. M. Dirks and N. A. Pierce. "A partition function algorithm for nucleic acid secondary structure including pseudoknots." | ||
(2003) <i>J Comput Chem</i>, 24:1664-1677.</li> | (2003) <i>J Comput Chem</i>, 24:1664-1677.</li> | ||
− | <li id="ref_4">M. Zuker, D. H. Mathews | + | <li id="ref_4">M. Zuker, D. H. Mathews and D. H. Turner. "Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical Guide" |
(1999) <i>RNA Biochemistry and Biotechnology</i> 11-43 J. Barciszewski and B. F. C. Clark, eds., | (1999) <i>RNA Biochemistry and Biotechnology</i> 11-43 J. Barciszewski and B. F. C. Clark, eds., | ||
NATO ASI Series, Kluwer Academic Publishers, Dordrecht, NL </li> | NATO ASI Series, Kluwer Academic Publishers, Dordrecht, NL </li> | ||
− | <li id="ref_5">J.-M. Rouillard, M. Zuker | + | <li id="ref_5">J.-M. Rouillard, M. Zuker and E. Gulari. "OligoArray 2.0: Thermodynamicaly improved |
oligonucleotide design for microarrays." (2003) <i>Nucleic Acids Res.</i> 31:12, 3057-3062. </li> | oligonucleotide design for microarrays." (2003) <i>Nucleic Acids Res.</i> 31:12, 3057-3062. </li> | ||
− | <li>S.F. Altschul, W. Gish, W. Miller, E.W. Myers | + | <li>S.F. Altschul, W. Gish, W. Miller, E.W. Myers and D.J. Lipman "Basic local alignment search tool." (1990) |
− | <i>J. Mol. Biol. </i> 215:403-410</li> | + | <i>J. Mol. Biol. </i> 215:403-410.</li> |
Latest revision as of 01:47, 2 November 2017
| |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|