
Uptake and Biosynthesis of iso-CmTP and iso-GTP

Short Summary

A crucial aspect in the development of semisynthetic organisms is the supply of unnatural nucleoside triphosphates. Two main strategies exist to provide Escherichia coli with unnatural nucleoside triphosphates. The first strategy is based on the uptake of nucleoside triphosphates from the media, which requires the expression of a heterologous transporter that facilitates uptake across the inner membrane. The second strategy is based on the in vivo biosynthesis of the desired nucleoside triphosphates, which can be achieved by integrating heterologous or synthetic pathways. In our project, we pursue both paths: we analyze the uptake of unnatural nucleoside triphosphates by different variants of a nucleotide transporter and investigate the biosynthesis of isoguanosine using a pathway from the plant Croton tiglium.

Strategies to Supply the Cell with iso-CmTP and iso-GTP

The first essential step to ensure that unnatural base pairs are retained over a long period of time is to provide the cell with the unnatural bases required for the replication of DNA containing the unnatural base pair. There are two different strategies to achieve this. The first is to supplement the media with the (deoxy-)nucleoside triphosphates of the desired unnatural bases. To allow uptake of the nucleoside triphosphates, the cell needs a transporter that can facilitate a sufficient flux into the cell. While this strategy seems to be the easiest, it has some disadvantages. First, E. coli does not possess nucleoside triphosphate transporters suitable for the transport of unnatural nucleoside triphosphates. Given that nucleoside triphosphates such as ATP are absent from the extracellular environment, such transporters have not evolved in E. coli. Another reason for their absence from the cell membrane is that such transporters could put the intracellular ATP pool at risk. Therefore, a heterologous transporter has to be introduced. Second, chemically synthesized nucleotides are very expensive, so feeding them to the cells would make bioprocesses uneconomical at scale. On the other hand, this strategy allows the testing of highly unusual unnatural bases, provided that uptake can be ensured.

The second strategy is based on the in vivo biosynthesis of the desired unnatural bases. This strategy would allow processes to be scaled up without becoming uneconomical and would eliminate the need for a heterologous transport system. However, the biosynthesis of fully synthetic bases is, if possible at all, extremely challenging. Furthermore, to allow the incorporation of those bases into the DNA, the nucleosides have to be phosphorylated, a step carried out by kinases that are in part highly specific. This strategy therefore represents the ideal system, but as of now it is extremely challenging to implement.

Figure 1: Two strategies for making iso-CmTP and iso-GTP available to the cell.
The first strategy (left) is based on a heterologous transporter that facilitates the transport of iso-CmTP and iso-GTP. The media is supplemented with the unnatural (deoxy)nucleoside triphosphates, which are then transported into the cell and incorporated into the DNA. The second strategy (right) is more complex and is based on the de novo synthesis of the unnatural (deoxy)nucleoside triphosphates. To this end, existing pathways from natural sources or newly designed pathways have to be introduced into the cell. The nucleoside triphosphates are then incorporated into the DNA.

The Nucleotide Transporter from Phaeodactylum tricornutum PtNTT2

Given the structural similarity to the natural bases, iso-dCm and iso-dG might be taken up through the existing nucleoside transporters in sufficient amounts. But phosphorylation of the nucleosides remains a challenge. Therefore, the introduction of a heterologous nucleoside triphosphate transporter into the desired strain represents a promising alternative for the direct uptake of iso-dCmTP and iso-dGTP from the media.

The alga Phaeodactylum tricornutum, a diatom of the genus Phaeodactylum, features six putative nucleotide transporters (NTTs). Two isoforms of these NTTs have been characterized by Ast and coworkers, who showed that both isoforms facilitate transport across the plastid membrane. While isoform 1 (NTT1) acts as a proton-dependent adenine nucleotide importer, NTT2 facilitates the counter exchange of (deoxy-)nucleoside triphosphates (Ast et al., 2009).

Figure 2: Uptake of [α-32P]-labeled nucleotides by the two isoforms PtNTT1 and PtNTT2 when expressed in E. coli (Ast et al., 2009).
The uptake of [α-32P]-labeled nucleotides was measured in E. coli. Isoform 1 (A) was shown to transport adenosine mono-, di- and triphosphates, while isoform 2 (B) is specific for nucleoside triphosphates.

Ast and colleagues determined the subcellular localization by tagging the transporters with GFP. Figure 3 shows the subcellular localization of the two isoforms PtNTT1 and PtNTT2. The fluorescence signal is detected only around the plastid and not in the cell membrane. Isoform 2 of the nucleotide transporter was shown to be an unspecific (deoxy-)nucleoside triphosphate transporter, facilitating the uptake of CTP, GTP, dCTP, ATP, UTP, dGTP, dATP and TTP when expressed in E. coli, as shown in Figure 2 (Ast et al., 2009). Uptake was measured using [α-32P]-labeled nucleotides. While isoform 1 (Figure 2A) can transport adenosine mono-, di- and triphosphates, isoform 2 (Figure 2B) is specific for nucleoside triphosphates. Because PtNTT2 accepts a broad range of nucleotides, it is an interesting candidate for the transport of unnatural nucleotides, as long as they are provided as nucleoside triphosphates.

Figure 3: Subcellular localization of PtNTT1 and PtNTT2 in Phaeodactylum tricornutum (Ast et al., 2009). PtNTT1 and PtNTT2 were fused to GFP to study the subcellular localization.

Zhang et al. (2017) integrated PtNTT2 into the chromosome of E. coli BL21(DE3) under control of the lacUV5 promoter. To demonstrate its suitability for the uptake of nucleotides, uptake of [α-32P]-dATP was measured. The native sequence of PtNTT2 features an N-terminal signal sequence that directs the protein to the plastid membrane. In E. coli, this signal sequence is likely to be retained, leading to a growth defect in cells expressing the native PtNTT2 transporter. Therefore, a truncated version of PtNTT2, PtNTT2(66-575), was used. The chromosomally integrated, truncated, and codon-optimized PtNTT2(66-575) under control of PlacUV5 was shown to be an optimal compromise between efficient uptake and the growth limitation resulting from expression of the heterologous protein (Zhang et al., 2017).

Figure 4: Uptake of [α-32P]-labeled dATP by the different versions of PtNTT2 (Zhang et al., 2017).
The expression of PtNTT2 was investigated in different strains, under control of different promoters, and both plasmid-bound and integrated into the chromosome. In their final design, Zhang and colleagues integrated PtNTT2 chromosomally in E. coli BL21(DE3) under control of the lacUV5 promoter.

For our experiments, we decided to use PtNTT2 to facilitate the transport of unnatural nucleoside triphosphates into the cell. Therefore, our plan is to construct several versions of PtNTT2 under control of the lacUV5 promoter, including the native sequence PtNTT2(1-575), two truncated versions PtNTT2(31-575) and PtNTT2(66-575), as well as two versions with heterologous signal peptides, PtNTT2-Tat and PtNTT2-pelB. While PtNTT2-Tat features a signal peptide targeting the twin-arginine translocation pathway, PtNTT2-pelB features the signal peptide of the pectate lyase B of Erwinia carotovora, which targets the general secretion pathway. By characterizing all of these versions we hope to identify the best suited version for our purpose, the efficient uptake of unnatural nucleoside triphosphates.
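The truncated variants are named after the residue range they retain. As a minimal sketch of how such 1-based, inclusive ranges map onto a protein sequence, the following Python snippet may help. The helper `truncate` and the placeholder sequence are illustrative assumptions and do not represent the real PtNTT2 sequence or our cloning workflow.

```python
# Hypothetical helper: cut a protein sequence to a 1-based, inclusive residue
# range, matching the PtNTT2(start-end) naming used for the constructs.
def truncate(protein: str, start: int, end: int) -> str:
    if not (1 <= start <= end <= len(protein)):
        raise ValueError(f"range {start}-{end} lies outside 1-{len(protein)}")
    return protein[start - 1:end]


# Placeholder with the right length (575 residues); NOT the real PtNTT2 sequence.
full_length = "M" + "A" * 574

# Residue ranges taken from the construct names in the text.
constructs = {
    "PtNTT2(1-575)": (1, 575),
    "PtNTT2(31-575)": (31, 575),
    "PtNTT2(66-575)": (66, 575),
}

variant_lengths = {
    name: len(truncate(full_length, start, end))
    for name, (start, end) in constructs.items()
}

for name, length in variant_lengths.items():
    print(f"{name}: {length} residues")
```

The signal-peptide fusions (PtNTT2-Tat and PtNTT2-pelB) are not covered here, since their exact design is not specified in this section.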

Sec and Tat Secretion Pathways

PtNTT2 comes from a eukaryote and is integrated into the plastid membrane in its native algal host. Expression of PtNTT2 in E. coli might therefore not lead to efficient integration into the inner membrane. For this reason, we wanted to test different signal peptides to find the one best suited for efficient integration of PtNTT2 into the inner membrane. In E. coli, proteins can be secreted via two main pathways, namely the general secretion (Sec) pathway and the twin-arginine translocation (Tat) pathway. Proteins destined for secretion are targeted by signal peptides, which are located at the N-terminus of a pre-protein and share a common tripartite structure: a positively charged N-terminal domain (N-domain), followed by a nonpolar hydrophobic domain (H-domain) and a more polar C-domain (de Keyzer et al., 2003; von Heijne, 1998). Signal peptides vary in length, but are usually between 18 and 30 amino acids long. During the translocation process, the signal peptide is cleaved from the pre-protein by a signal peptidase, which recognizes a cleavage site within the C-domain of the signal peptide (de Keyzer et al., 2003). Signal peptides targeting the Sec pathway do not feature a common consensus sequence or motif (Caspers et al., 2010). However, the binding of the secretion-specific chaperone SecB or the signal recognition particle (SRP) to the signal peptide, which represents the first step of the Sec pathway, is influenced by the hydrophobicity of the H-domain (de Keyzer et al., 2003). Signal peptides targeting the Tat pathway are usually around 30 amino acids long (Berks et al., 2014). In contrast to signal peptides targeting the Sec pathway, those targeting the Tat pathway do feature a consensus sequence. The crucial element for Tat recognition is a twin-arginine motif (RR) located at the N/H-domain junction. In bacteria, the RR is usually followed, starting at the RR+2 position, by phenylalanine, leucine and lysine, with the phenylalanine being especially important (Cline, 2015).
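The Tat consensus described above can be expressed as a simple pattern search. The following Python sketch assumes a strict R-R-x-F-L-K pattern, which is a simplification, since real Tat signal peptides deviate from it. The function name and both example sequences are made up for illustration.

```python
import re

# Simplified pattern for the bacterial twin-arginine consensus from the text:
# R-R, any residue at RR+1, then F-L-K starting at RR+2.
# Real signal peptides are more variable, so this is illustrative only.
TAT_MOTIF = re.compile(r"RR.FLK")


def find_tat_motif(n_terminus: str):
    """Return the 0-based index of the first RR in a strict match, else None."""
    match = TAT_MOTIF.search(n_terminus.upper())
    return match.start() if match else None


# Made-up sequences for demonstration; not taken from any real protein.
example_tat = "MSTNQSRRDFLKAAGLAALAAGA"
example_sec = "MKKLLAVAALLAGSAHA"

print(find_tat_motif(example_tat))
print(find_tat_motif(example_sec))
```

Since Sec signal peptides lack a consensus motif, a pattern search of this kind can only flag candidate Tat signal peptides and says nothing about Sec targeting.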

Sec Pathway

In bacteria, most proteins are secreted in an unfolded state via the conserved and essential Sec pathway (Schneewind and Missiakas, 2014; Caspers and Freudl, 2008). The key component of the Sec pathway is the translocase, which is located in the plasma membrane and composed of the heterotrimeric SecYEG protein and the SecA ATPase. SecYEG consists of three components, SecY, SecE and SecG. SecY and SecE form the core of the transmembrane channel, while SecG is an additional integral subunit that does not seem to be essential for the translocation process (Schneewind and Missiakas, 2014; de Keyzer et al., 2003; Caspers and Freudl, 2008). SecA serves as the molecular motor of the translocation process by repeatedly binding and hydrolyzing ATP, thereby initiating the translocation. The proton motive force (PMF) helps to translocate the protein once the process has been initiated by SecA (Caspers and Freudl, 2008). SecYEG can associate with another heterotrimeric complex involved in the translocation process, consisting of SecD, SecF, and YajC. The exact role of this additional complex is still unknown, but it seems to optimize secretion (de Keyzer et al., 2003). It has been proposed that SecDFYajC might be involved in the PMF-dependent steps of the translocation process (Denks et al., 2014). Proteins are targeted in two different ways, mediated either by the chaperone SecB or by the signal recognition particle (SRP). SecB and SRP both bind the nascent pre-protein but seem to bind pre-proteins of different lengths. Proteins that are to be secreted are usually targeted via the SecB mechanism (de Keyzer et al., 2003). Regardless of the targeting mechanism, the proteins eventually interact with the translocase. The translocase catalyzes the gradual translocation of the pre-protein, during which the pre-protein is further processed by the signal peptidase (SPase). The SPase is embedded in the cytoplasmic membrane and removes the signal peptide from the pre-protein (Denks et al., 2014; de Keyzer et al., 2003).

Figure 5: Schematic overview of the Sec pathway and the components involved (de Keyzer et al., 2003). Pre-proteins are targeted by either SRP or SecB and translocated by the translocase. The translocase consists of SecYEG, which forms the transmembrane channel, and SecA, the ATP-driven molecular motor of the translocation. The signal peptide is eventually removed by the SPase (here Lep).

Tat Pathway

In contrast to the Sec pathway, which secretes proteins in an unfolded state, the twin-arginine translocation (Tat) pathway secretes fully folded proteins (Cline, 2015). The Tat pathway is an active transport system in which the transmembrane proton motive force drives the translocation (Palmer and Berks, 2012). Three proteins from only two structural families form the machinery required for Tat-mediated secretion (Berks et al., 2014). Members of the first family, TatA, are described as having a single short N-terminal transmembrane helix and an unstructured C-terminal tail (Berks et al., 2014; Cline, 2015). TatB belongs to the TatA family but was shown to be dispensable for basic functionality. The second structural family involved in the Tat pathway is TatC, which consists of six transmembrane helices with the N- and C-termini exposed to the cytoplasm. The structure of TatC was determined for Aquifex aeolicus and is shown together with the structure of TatA from E. coli in Figure 6 (Berks et al., 2014; Cline, 2015). A fourth component, TatE, has a function that overlaps with that of TatA, which makes TatE dispensable (Kikuchi et al., 2009).

Figure 6: Structures of TatA from E. coli and TatC from Aquifex aeolicus
While TatA forms a single transmembrane helix, TatC consists of six transmembrane helices (Berks et al., 2014).

TatB and TatC form the receptor complex, which is composed of up to eight TatBC units (Cline, 2015). The signal peptide is bound by this receptor complex, which triggers the PMF-dependent assembly of TatA into the translocation pore (Palmer and Berks, 2012; Cline, 2015). The whole complex is referred to as the translocase. After the assembly, the substrate is translocated and the signal peptide is cleaved by a signal peptidase, as shown in Figure 7. Many details of the Tat pathway are still unknown or uncertain, including how TatA disassembles (Cline, 2015).

Figure 7: Steps of the Tat translocation system
After the signal peptide is bound by TatBC, TatA assembles and forms the translocation pore. Using the transmembrane proton motive force, the substrate is translocated and the signal peptide is cleaved by a signal peptidase (Palmer and Berks, 2012).

Croton tiglium

Origin of Croton tiglium

Croton tiglium (L.), also commonly known as the "croton oil plant", is a plant of the family Euphorbiaceae that was first described in 1753 by Carl Linnaeus. It originates from south-eastern Asian countries such as India or Thailand, where it can grow up to 7 m tall. In nature, it occurs in habitats ranging from shrublands to forests at altitudes of 300 to 1500 m. The plant we used was kindly provided by the botanical garden of the Philipps University in Marburg. Seeds of this plant were originally given to the botanical garden in Marburg in 1986. They had been provided by the botanical garden in Giessen.

Uses of Croton tiglium

Although all parts of Croton tiglium are poisonous, it has been widely used as a herb in traditional Chinese medicine for 2000 years. There, it was used as an oil to treat skin diseases and intoxications and as a laxative. However, many of these uses have been abandoned because of its high toxicity. Furthermore, the plant was used against cancer due to its antitumor effects, which were demonstrated in 1994 (Kim et al., 1994).

Secondary Metabolism of C. tiglium

In contrast to animals, plants usually have a secondary metabolism that does not involve any chemicals needed for primary functions such as growth or development. However, the secondary metabolism is still connected to the primary metabolism, as it uses some of the substances produced there. Owing to their great diversity, the products of the secondary metabolism offer a wide range of possible uses for humanity. Plants, however, mostly use them as protection against herbivores and pathogens. Some metabolites are toxic, whereas others attract parasites and predators of the herbivores as a form of protection. Among many other secondary metabolites, Croton tiglium produces iso-guanosine. Iso-guanosine was first isolated from Croton tiglium in 1932 (Cherbuliez and Bernhard, 1932). Today, it is also commercially available as crotonoside. Iso-guanosine has been the subject of many studies and has been found to affect a range of biological processes (see Lowry and Brown, 1952; Huang et al., 1972; Vasu and David, 1985 for reference), including antitumor effects (Kim et al., 1994). For us, one of the most interesting aspects of iso-guanosine is its use as an unnatural base within organisms.

De novo Synthesis of Purine Bases

Figure 1: De novo Synthesis of Pyrimidine Bases

Purine bases are assembled de novo directly on the ribose (Berg et al., 2012). The synthesis starts with the replacement of the pyrophosphate of PRPP with an amino group, yielding phosphoribosylamine (PRA). This reaction is catalyzed by amidophosphoribosyltransferase (ATase) and uses ammonia derived from a glutamine side chain as the donor of the amino group. The conversion of PRPP to PRA is the committed step of purine biosynthesis. The synthesis of the purine ring involves nine additional steps, the first six of which are relatively similar. In each of these reactions, an oxygen atom bound to a carbon atom is activated by phosphorylation and then displaced by ammonia or an amino group acting as a nucleophile. These reactions lead to the formation of inosinate (IMP), which is a key intermediate in purine synthesis. Inosinate is converted into either AMP or GMP. AMP is synthesized by substitution of the C-6 carbonyl oxygen with an amino group, in a reaction sequence that starts with adenylosuccinate synthase (ASS). In this reaction, GTP instead of ATP is used as the donor of the phosphoryl group. The conversion of IMP to GMP starts with the oxidation of IMP to xanthylate (XMP) by IMP dehydrogenase. In a second step, XMP is converted into GMP by GMP synthase through the addition of an amino group, a reaction that requires ATP as the donor of an AMP group. GMP and AMP are then further phosphorylated to GTP and ATP by specific kinases.
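The order of the intermediates described above can be summarized as a small directed graph. The following Python sketch is only a bookkeeping aid under simplifying assumptions: the nine ring-assembly steps and the kinase reactions are each collapsed into a single edge, and the names `PURINE_PATHWAY` and `route` are chosen here for illustration.

```python
from collections import deque

# Metabolites as nodes, conversions as edges (simplified from the text).
PURINE_PATHWAY = {
    "PRPP": ["PRA"],        # committed step, catalyzed by ATase
    "PRA": ["IMP"],         # nine ring-assembly steps collapsed into one edge
    "IMP": ["AMP", "XMP"],  # branch point towards adenine or guanine nucleotides
    "XMP": ["GMP"],
    "AMP": ["ATP"],         # phosphorylation by kinases, collapsed
    "GMP": ["GTP"],
}


def route(start: str, target: str):
    """Breadth-first search for the shortest route between two metabolites."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for successor in PURINE_PATHWAY.get(path[-1], []):
            if successor not in seen:
                seen.add(successor)
                queue.append(path + [successor])
    return None  # target cannot be reached from start


print(" -> ".join(route("PRPP", "GTP")))
```

A representation like this makes the branch point at IMP explicit, which is the part of the pathway most relevant when looking for reactions that could lead to iso-guanosine.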

Conversion of Ribonucleoside Diphosphates to Deoxyribonucleotides

Figure 1: De novo Synthesis of Purine Bases

Deoxyribonucleotides are synthesized from ribonucleotides by replacing the 2′-hydroxyl group of the ribose with a hydrogen. The reaction is catalyzed by the enzyme ribonucleotide reductase, which is highly conserved in all living organisms (Berg et al., 2012). In E. coli, two main types of ribonucleotide reductases exist. Ribonucleoside-triphosphate reductases convert ribonucleoside triphosphates into deoxyribonucleoside triphosphates, while ribonucleoside-diphosphate reductases convert ribonucleoside diphosphates into deoxyribonucleoside diphosphates (Kanehisa and Goto, 2000).

Salvage Pathways

Both purine and pyrimidine bases can be recycled and converted into the corresponding nucleotides through salvage pathways. Adenine can be recycled through conversion into AMP, a reaction that is catalyzed by adenine phosphoribosyltransferase and requires PRPP. AMP can subsequently be converted into ATP or dATP as described above. Hypoxanthine-guanine phosphoribosyltransferase (HGPRT) catalyzes the recycling of guanine, a reaction that also requires PRPP, which serves as the donor of the phosphoribosyl group. HGPRT also catalyzes the conversion of hypoxanthine to IMP, which in turn is a precursor of GMP and AMP. The recycling of thymine involves two steps: in the first step, thymine is converted to thymidine by thymidine phosphorylase. In the second step, thymidine is converted to TMP by thymidine kinase. Cytosine can be recycled by conversion to uracil, a reaction that is catalyzed by cytosine deaminase. Following the conversion to UTP, CTP is produced by CTP synthase. The recycling of bases saves intracellular energy, since de novo synthesis requires large amounts of ATP. Therefore, cells usually favor the recycling of bases through salvage pathways.
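The first salvage step for each base can be collected in a simple lookup table. The Python sketch below covers only the reactions named in the text. The table name `SALVAGE_FIRST_STEP` and the helper `needs_prpp` are illustrative, and deriving the PRPP requirement from the enzyme name is a shortcut that holds for the enzymes listed here, not a general rule.

```python
# First salvage step for each base: (enzyme, product), as described in the text.
SALVAGE_FIRST_STEP = {
    "adenine": ("adenine phosphoribosyltransferase", "AMP"),
    "guanine": ("hypoxanthine-guanine phosphoribosyltransferase", "GMP"),
    "hypoxanthine": ("hypoxanthine-guanine phosphoribosyltransferase", "IMP"),
    "thymine": ("thymidine phosphorylase", "thymidine"),
    "cytosine": ("cytosine deaminase", "uracil"),
}


def needs_prpp(base: str) -> bool:
    """Phosphoribosyltransferases use PRPP as the phosphoribosyl donor."""
    enzyme, _product = SALVAGE_FIRST_STEP[base]
    return enzyme.endswith("phosphoribosyltransferase")


for base, (enzyme, product) in SALVAGE_FIRST_STEP.items():
    suffix = " (requires PRPP)" if needs_prpp(base) else ""
    print(f"{base} -> {product} via {enzyme}{suffix}")
```

The table makes it easy to see that the purine bases are salvaged in a single PRPP-dependent step, whereas thymine and cytosine first pass through an intermediate.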

Adaptations of the Purine Pathway Used in C. tiglium for the Production of iso-Guanosine

Regular guanosine is produced via the purine pathway. It is therefore likely that the production of iso-guanosine takes place among the reactions of this pathway.


Ast, M., Gruber, A., Schmitz-Esser, S., Neuhaus, H.E., Kroth, P.G., Horn, M., and Haferkamp, I. (2009). Diatom plastids depend on nucleotide import from the cytosol. Proc. Natl. Acad. Sci. U. S. A. 106: 3621–3626.
Berks, B.C., Lea, S.M., and Stansfeld, P.J. (2014). Structural biology of Tat protein transport. Curr. Opin. Struct. Biol. 27: 32–7.
Caspers, M., Brockmeier, U., Degering, C., Eggert, T., and Freudl, R. (2010). Improvement of Sec-dependent secretion of a heterologous model protein in Bacillus subtilis by saturation mutagenesis of the N-domain of the AmyE signal peptide. Appl. Microbiol. Biotechnol. 86: 1877–1885.
Caspers, M. and Freudl, R. (2008). Corynebacterium glutamicum possesses two secA homologous genes that are essential for viability. Arch. Microbiol. 189: 605–610.
Cline, K. (2015). Mechanistic Aspects of Folded Protein Transport by the Twin Arginine Translocase (Tat). J. Biol. Chem. 290: 16530–16538.
Denks, K., Vogt, A., Sachelaru, I., Petriman, N., Kudva, R., and Koch, H. (2014). The Sec translocon mediated protein transport in prokaryotes and eukaryotes. Mol. Membr. Biol. 31: 58–84.
von Heijne, G. (1998). Life and death of a signal peptide. Nature 396: 111, 113.
de Keyzer, J., van der Does, C., and Driessen, A.J.M. (2003). The bacterial translocase: a dynamic protein channel complex. Cell. Mol. Life Sci. 60: 2034–2052.
Kikuchi, Y., Itaya, H., Date, M., Matsui, K., and Wu, L.F. (2009). TatABC overexpression improves Corynebacterium glutamicum Tat-dependent protein secretion. Appl. Environ. Microbiol. 75: 603–607.
Palmer, T. and Berks, B.C. (2012). The twin-arginine translocation (Tat) protein export pathway. Nat. Rev. Microbiol. 10: 483–496.
Schneewind, O. and Missiakas, D. (2014). Sec-secretion and sortase-mediated anchoring of proteins in Gram-positive bacteria. Biochim. Biophys. Acta - Mol. Cell Res. 1843: 1687–1697.
Zhang, Y., Lamb, B.M., Feldman, A.W., Zhou, A.X., Lavergne, T., Li, L., and Romesberg, F.E. (2017). A semisynthetic organism engineered for the stable expansion of the genetic alphabet. Proc. Natl. Acad. Sci. 114: 1317–1322.
Berg, J.M., Tymoczko, J.L., and Stryer, L. (2012). Biochemistry 7th Edition. (Springer-Verlag: Berlin Heidelberg).
Kanehisa, M. and Goto, S. (2000). Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28: 27–30.
Kim, J.H., Lee, S.J., Han, Y.B., Moon, J.J., and Kim, J.B. (1994). Isolation of isoguanosine from Croton tiglium and its antitumor activity. Arch. Pharm. Res. 17: 115–118.
Cherbuliez, E. and Bernhard, K. (1932). Croton seed (1) crotonoside. Helv. Chim. Acta 15: 464, 978–982.
Huang, M., Shimizu, H., and Daly, J.W. (1972). Accumulation of cyclic adenosine monophosphate in incubated slices of brain tissue. J. Med. Chem. 15: 462–468.
Lowry, B.A. and Brown, G.B. (1952). The utilization of purine nucleosides for nucleic acid synthesis in the rat. J. Biol. Chem. 197: 591.
Vasu, N. and David, A.Y. (1985). A new synthesis of isoguanosine. J. Org. Chem. 50: 406–408.