Team:Bielefeld-CeBiTec/Project/unnatural base pair/unnatural base pairs

Unnatural Base Pairs

Short Summary

The idea of expanding DNA by incorporating an unnatural base pair (UBP) dates back to 1962. Since then, much effort has gone into engineering UBPs that function as an orthogonal system to create a semi-synthetic DNA (xenogenic DNA or XNA). Besides hydrogen bonding, researchers have also investigated UBPs that pair through other chemical properties. Using a UBP creates several challenges, such as the adaptation of the whole transcriptional and translational machinery. In a semi-synthetic organism, additional tasks arise, e.g. the biosynthesis of the "base" as well as the synthesis of the corresponding nucleosides and nucleotides. Both the de novo synthesis and the salvage pathway of nucleotides are complex metabolic routes that involve many different enzymes. To obtain a fully autonomous semi-synthetic organism, the easiest path is the incorporation of UBPs that resemble canonical nucleotides and pair through hydrogen bonding. This points to isoguanosine (isoG) and isocytosine (isoC), for which biosynthesis pathways are conceivable.

Background to the Unnatural Base Pairs (UBPs)

Figure 1: Unnatural bases.

All amino acids are encoded by codons, each of which consists of three bases. This information is stored in the genome of an organism, and since the origin of life every natural genome has consisted of the two-base-pair genetic alphabet dA-dT (adenine-thymine) and dG-dC (guanine-cytosine). There are strong efforts to replace a canonical base pair or to expand the genetic code with a third, unnatural base pair (UBP) (Martinot and Benner, 2004; Jiang and Seela, 2010; Kwok, 2012; Zhang et al., 2017; Yamashige et al., 2012; Seela et al., 2005; Switzer et al., 1989; Yang et al., 2011).
So far, modifications of the sugar and phosphate moieties of nucleotides have been explored, with important applications. First experiments with unnatural bases modified the nucleotide alphabet by replacing thymine with 5-chlorouracil in E. coli over a period of 25 weeks (Dunn and Smith, 1957; Marlière et al., 2011). For a UBP, however, two modified nucleobases are needed. A. Rich discussed the extension of DNA by two additional bases as early as 1962 (Rich, 1962). An additional UBP can be interesting for its physicochemical properties if the nucleobases can be site-specifically derivatized with linkers for chemical groups. Furthermore, the availability of a UBP in vivo would be a milestone in the field of synthetic biology. It would mean the creation of a semi-synthetic organism with altered storage capabilities for genetic information, leading to new and useful functions and applications (Malyshev and Romesberg, 2015).
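
The altered storage capability can be made concrete with a little arithmetic. The following minimal sketch compares the number of possible triplet codons and the information content per position for the natural four-letter alphabet and a six-letter alphabet. The letters "X" and "Y" are placeholder symbols for the two bases of a UBP, chosen here only for illustration.

```python
from itertools import product
from math import log2

NATURAL = "ATGC"
# "X" and "Y" are hypothetical placeholders for the two bases of a UBP.
EXPANDED = NATURAL + "XY"


def codon_count(alphabet: str, length: int = 3) -> int:
    """Count the distinct codons of the given length by enumerating them."""
    return sum(1 for _ in product(alphabet, repeat=length))


def bits_per_base(alphabet: str) -> float:
    """Maximum information per position, assuming all letters are equally likely."""
    return log2(len(alphabet))


for name, alphabet in (("natural", NATURAL), ("expanded", EXPANDED)):
    print(name, codon_count(alphabet), round(bits_per_base(alphabet), 3))
# natural 64 2.0
# expanded 216 2.585
```

Under these assumptions, a third base pair raises the number of triplet codons from 64 to 216. Whether a cell can actually decode the additional codons depends on the replication, transcription and translation machinery discussed in the following sections.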

UBPs with hydrogen bonding

As stated above, utilizing a UBP creates several challenges. The first approaches focused on orthogonal pairing and on achieving in vitro replication. For this purpose, UBPs with complementary hydrogen bonding were explored. The labs of Rappaport and Benner independently investigated the UBP disoG-disoC, which is a constitutional isomer of dG-dC. The main problems with this UBP are deamination and tautomerization, which lead to mispairing with natural bases, predominantly dT/U. These problems led to further derivatives of disoG-disoC, such as the latest UBP dZ (6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone) / dP (2-amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one) from the Benner lab, which showed high-fidelity amplification by PCR (Yang et al., 2010). A Taq DNA polymerase was modified to accept the new ATCGPZ-DNA, resulting in a retention rate of 98.9% (Laos et al., 2014; Chen et al., 2011). The six-nucleotide genetic alphabet gives rise to DNA in B-form as well as A-form, with major grooves 1 Å wider than for the natural GC pair (Georgiadis et al., 2015). Transcription, reverse transcription and even translation were also successfully performed in vitro (Bain et al., 1992; Leal et al., 2015). Another UBP based on complementary hydrogen bonding is ds-dy, whose bases are analogs of purine and pyridine and which was developed by Hirao in 2000. In vitro transcription and translation were achieved using this UBP, but the derivative dz, which has lower mispairing rates, was insufficiently recognized by DNA and RNA polymerases as a triphosphate (Hirao et al., 2002; Hirao et al., 2004).
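
Orthogonal pairing in a six-letter ATCGPZ alphabet can be sketched as a simple lookup table. The example below is an idealized model that assumes perfect pairing (A with T, G with C, Z with P) and ignores the mispairing caused by deamination and tautomerization. The single-letter codes and function names are illustrative choices, not part of any published tool.

```python
# Idealized pairing table: the natural pairs plus the Z:P pair.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G", "Z": "P", "P": "Z"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a six-letter (ATCGPZ) sequence."""
    try:
        return "".join(PAIRS[base] for base in reversed(seq.upper()))
    except KeyError as err:
        raise ValueError(f"unknown base: {err.args[0]}") from None


def is_orthogonal(pairs: dict) -> bool:
    """Check that every base has exactly one partner, symmetrically, and never itself."""
    return all(pairs[pairs[base]] == base and pairs[base] != base for base in pairs)


print(reverse_complement("ATGZCP"))  # ZGPCAT
print(is_orthogonal(PAIRS))          # True
```

In this model the unnatural pair behaves exactly like a natural one. The experimental retention rates quoted above describe how far real polymerases deviate from that ideal.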

Other UBPs

Besides hydrogen bonding, further research has been directed towards UBPs that pair through metal-dependent interactions, hydrophobic forces and ring-stacking forces (Malyshev and Romesberg, 2015). d5SICS-dMMO2 and d5SICS-dNaM are two promising candidates that rely on hydrophobic interactions and that allowed transcription (Seo et al., 2009). The first demonstration in E. coli was based on two plasmids, one encoding the nucleoside triphosphate transporter for dNaM and d5SICS and the other carrying a gene sequence that uses the extended genetic code (Malyshev et al., 2014). Uptake of the synthetic bases as well as stable plasmid replication over 24 generations was demonstrated (Malyshev et al., 2014). In 2017, the Romesberg group presented a new version of their semi-synthetic organism. The most important advances were an optimized transporter with improved uptake of unnatural triphosphates and better retention of XNA with dNaM-dTPT3. Furthermore, they used a CRISPR-Cas system to eliminate plasmids that had lost the XNA (Zhang et al., 2017).

Our approach

The challenging part of using XNA is the need for synthetic or evolved proteins that allow for replication, transcription, and packaging of the XNA (Schmidt, 2010). For our approach to expanding the genetic code, we decided on the UBP disoG-disoCm (5-methyl-isocytosine). The 5-methyl derivative is more stable towards hydrolysis than isoC (Tor and Dervan, 1993). The disoCm-disoGTP system also behaves better in in vitro transcription with T7 RNA polymerase. The presence of the 5-methyl group possibly results in better contact between the template and the polymerase (Tor and Dervan, 1993).
Another aspect is the similarity of the unnatural bases isoG and isoCm to the natural bases guanine and cytosine, while they still form an orthogonal system. Due to this structural similarity, there is a better chance of compatibility with interacting enzymes. In 1992, the Benner lab showed that the in vitro translation of mRNA containing disoC worked with a non-standard tRNA containing the complementary purine disoG in the anticodon (Bain et al., 1992). Their cell-free experiments showed a high specificity for the incorporation of a non-canonical amino acid by the ribosome using this unnatural base. With these isomers of the natural bases, optimized replication, transcription or translation is more likely to be achieved with less adaptation of the corresponding enzymes than with hydrophobic UBPs. On top of that, the hydrophobic UBPs are very expensive because of their complex synthesis. With a view to creating an autonomous synthetic organism, it seems impossible to create a biosynthetic pathway for unnatural bases that differ greatly from natural bases. isoG, in contrast, is already known to be a metabolite of the plant Croton tiglium. Elucidating this metabolic pathway could make it usable in any synthetic organism and would therefore be a step towards a fully autonomous synthetic organism.


References

Hirao, I., Kimoto, M., and Yamashige, R. (2012). Natural versus Artificial Creation of Base Pairs in DNA: Origin of Nucleobases from the Perspectives of Unnatural Base Pair Studies. Acc. Chem. Res. 45: 2055–2065.
Jiang, D. and Seela, F. (2010). Oligonucleotide Duplexes and Multistrand Assemblies with 8-Aza-2′-deoxyisoguanosine: A Fluorescent isoGd Shape Mimic Expanding the Genetic Alphabet and Forming Ionophores. J. Am. Chem. Soc. 132: 4016–4024. doi:10.1021/ja910020n.
Kwok, R. (2012). Chemical biology: DNA’s new alphabet. Nature 491: 516–518. doi:10.1038/491516a.
Yamashige, R., Kimoto, M., Takezawa, Y., Sato, A., Mitsui, T., Yokoyama, S., et al. (2012). Highly specific unnatural base pair systems as a third base pair for PCR amplification. Nucleic Acids Res. 40: 2793–2806.
Seela, F., Peng, X., and Li, H. (2005). Base-pairing, tautomerism, and mismatch discrimination of 7-halogenated 7-deaza-2′-deoxyisoguanosine: Oligonucleotide duplexes with parallel and antiparallel chain orientation. J. Am. Chem. Soc. 127: 7739–7751. doi:10.1021/ja0425785.
Switzer, C., Moroney, S.E., and Benner, S.A. (1989). Enzymatic incorporation of a new base pair into DNA and RNA. J. Am. Chem. Soc. 111: 8322–8323. doi:10.1021/ja00203a067.
Yang, Z., Chen, F., Alvarado, J.B., and Benner, S.A. (2011). Amplification, mutation, and sequencing of a six-letter synthetic genetic system. J. Am. Chem. Soc. 133: 15105–15112. doi:10.1021/ja204910n.
Dunn, D.B. and Smith, J.D. (1957). Effects of 5-halogenated uracils on the growth of Escherichia coli and their incorporation into deoxyribonucleic acids. Biochem. J. 67: 494–506. doi:10.1042/bj0670494.
Marlière, P., Patrouix, J., Döring, V., Herdewijn, P., Tricot, S., Cruveiller, S., et al. (2011). Chemical evolution of a bacterium’s genome. Angew. Chem. Int. Ed. 50: 7109–7114.
Rich, A. (1962). On the problems of evolution and biochemical information transfer. Horizons Biochem.: 103–126.
Malyshev, D.A. and Romesberg, F.E. (2015). The expanded genetic alphabet. Angew. Chem. Int. Ed. Engl. 54: 11930–11944.
Ma, R., Yang, Z., Huang, L., Zhu, X., Kai, L., Cai, J., Wang, X., and Xu, Z. (2010). Construction of an efficient Escherichia coli cell-free system for in vitro expression of several kinds of proteins. Eng. Life Sci. 10: 333–338.
Laos, R., Thomson, J.M., and Benner, S.A. (2014). DNA polymerases engineered by directed evolution to incorporate non-standard nucleotides. Front. Microbiol. 5: 1–14.
Chen, F., Yang, Z., Yan, M., Alvarado, J.B., Wang, G., and Benner, S.A. (2011). Recognition of an expanded genetic alphabet by type-II restriction endonucleases and their application to analyze polymerase fidelity. Nucleic Acids Res. 39: 3949–3961.
Georgiadis, M.M., Singh, I., Kellett, W.F., Hoshika, S., Benner, S.A., and Richards, N.G.J. (2015). Structural Basis for a Six Nucleotide Genetic Alphabet. J. Am. Chem. Soc. 137: 6947–6955.
Bain, J.D., Switzer, C., Chamberlin, R., and Benner, S.A. (1992). Ribosome-mediated incorporation of a non-standard amino acid into a peptide through expansion of the genetic code. Nature 356: 537–539.
Leal, N.A., Kim, H., Hoshika, S., Kim, M., Carrigan, M.A., and Benner, S.A. (2015). Transcription, Reverse Transcription, and Analysis of RNA Containing Artificial Genetic Components. ACS Synth. Biol. 4: 407–413.
Seo, Y.J., Matsuda, S., and Romesberg, F.E. (2009). Transcription of an Expanded Genetic Alphabet. J. Am. Chem. Soc. 131: 5046–5047.
Malyshev, D.A., Dhami, K., Lavergne, T., Chen, T., Dai, N., Foster, J.M., Corrêa, I.R., and Romesberg, F.E. (2014). A semi-synthetic organism with an expanded genetic alphabet. Nature 509: 385–388.
Zhang, Y., Lamb, B.M., Feldman, A.W., Zhou, A.X., Lavergne, T., Li, L., and Romesberg, F.E. (2017). A semisynthetic organism engineered for the stable expansion of the genetic alphabet. Proc. Natl. Acad. Sci. 114: 1317–1322.
Schmidt, M. (2010). Xenobiology: A new form of life as the ultimate biosafety tool. BioEssays 32: 322–331.
Tor, Y. and Dervan, P. (1993). Site-specific enzymic incorporation of an unnatural base, N6-(6-aminohexyl) isoguanosine, into RNA. J. Am. Chem. Soc. 115: 4461–4467.

