Team:Bielefeld-CeBiTec/Project/unnatural base pair/uptake and biosynthesis

Uptake and Biosynthesis of iso-CmTP and iso-GTP

Strategies to Supply the Cell with iso-CmTP and iso-GTP

The first step toward retaining unnatural base pairs over a long period of time is to provide the cell with the unnatural bases needed for their replication. There are two strategies for doing so. The first is to supplement the medium with the nucleoside triphosphates of the desired unnatural bases. For uptake, the cell needs a transporter that can facilitate a sufficient flux of nucleoside triphosphates into the cell. While this strategy seems to be the easier one, it has some disadvantages. First, Escherichia coli does not possess nucleoside triphosphate transporters suitable for the transport of unnatural nucleoside triphosphates. Given that nucleoside triphosphates such as ATP are essentially absent from the extracellular environment, such transporters have not evolved. Another reason why they are essentially absent from the cell membrane is that their presence could put the intracellular ATP pool at risk. Therefore, a heterologous transporter has to be introduced. Second, chemically synthesized nucleotides are very expensive, so feeding them to the cells would make bioprocesses uneconomical at scale. On the other hand, this strategy allows the testing of highly unusual unnatural bases, provided that uptake can be ensured.

The second strategy is based on the in vivo biosynthesis of the desired unnatural bases. It would allow processes to be scaled up without becoming uneconomical and would eliminate the need for a heterologous transport system. However, the biosynthesis of fully synthetic bases is, if possible at all, extremely challenging. In addition, for the bases to be incorporated into DNA, the nucleosides must be phosphorylated, a step carried out by kinases that are in part highly specific. This strategy therefore represents an ideal system that is, as of now, extremely challenging to implement.

Figure 1: Two strategies for making iso-CmTP and iso-GTP available to the cell.
The first strategy is based on a heterologous transporter that facilitates the transport of iso-CmTP and iso-GTP. The medium is supplemented with the unnatural nucleoside triphosphates, which are then transported into the cell and incorporated into the DNA. The second strategy is more complex and is based on the de novo synthesis of the unnatural nucleoside triphosphates. For this, existing pathways from natural sources or newly designed pathways have to be introduced into the cell. The nucleoside triphosphates are then incorporated into the DNA.

The Nucleotide Transporter from Phaeodactylum tricornutum PtNTT2

Given their structural similarity to the natural bases, iso-dCm and iso-dG might be taken up through the existing nucleoside transporters in sufficient amounts. But phosphorylation of the nucleosides remains a challenge. Therefore, the introduction of a heterologous nucleoside triphosphate transporter into the desired strain represents a promising alternative.

Phaeodactylum tricornutum, a diatom of the genus Phaeodactylum, features six putative nucleotide transporters (NTTs). Two isoforms of these NTTs were characterized by Ast et al. (2009), who showed that both isoforms facilitate transport across the plastid membrane. While isoform 1 (NTT1) acts as a proton-dependent adenine nucleotide importer, NTT2 facilitates the counter exchange of (deoxy-)nucleoside triphosphates (Ast et al., 2009).

Figure 2: Uptake of [α-32P]-labeled nucleotides by the two isoforms PtNTT1 and PtNTT2 when expressed in E. coli (Ast et al., 2009).
The uptake of [α-32P]-labeled nucleotides was measured in E. coli. Isoform 1 (A) was shown to transport adenosine mono-, di-, and triphosphates, while isoform 2 (B) is specific for nucleoside triphosphates.

Ast and colleagues demonstrated the subcellular localization by tagging the transporters with GFP. Figure 3 shows the subcellular localization of the two isoforms PtNTT1 and PtNTT2. The fluorescence signal is detected only around the plastid and not at the cell membrane. Isoform 2 of the nucleotide transporter was shown to be an unspecific (deoxy-)nucleoside triphosphate transporter, facilitating the uptake of CTP, GTP, dCTP, ATP, UTP, dGTP, dATP, and TTP when expressed in E. coli, as shown in Figure 2 (Ast et al., 2009). Uptake was measured using [α-32P]-labeled nucleotides. While isoform 1 (A) can transport adenosine mono-, di-, and triphosphates, isoform 2 (B) is specific for nucleoside triphosphates. The fact that PtNTT2 accepts a broad range of nucleotides makes the transporter interesting for the transport of unnatural nucleotides, as long as they are provided as nucleoside triphosphates.

Figure 3: Subcellular localization of PtNTT1 and PtNTT2 in Phaeodactylum tricornutum
PtNTT1 and PtNTT2 were fused to GFP to study the subcellular localization (Ast et al., 2009).

Zhang et al. (2017) integrated PtNTT2 chromosomally in E. coli BL21 (DE3) under control of the lacUV5 promoter. To demonstrate its suitability for the uptake of nucleotides, uptake of [α-32P]-dATP was measured. The native sequence of PtNTT2 features an N-terminal signal sequence that directs the protein to the plastid membrane. In E. coli, this signal sequence is likely to be retained, leading to a growth defect in cells expressing the native PtNTT2 transporter. Therefore, a truncated version of PtNTT2, PtNTT2(66-575), was used. The chromosomally integrated, truncated, and codon-optimized PtNTT2(66-575) under control of PlacUV5 was shown to be an optimal compromise between efficient uptake and the growth limitation resulting from expression of the heterologous protein (Zhang et al., 2017).

Figure 4: Uptake of [α-32P]-labeled ATP by the different versions of PtNTT2 (Zhang et al., 2017).
The expression of PtNTT2 was investigated in different strains, under control of different promoters, and both plasmid-borne and integrated into the chromosome. In their final design, Zhang and colleagues integrated PtNTT2 chromosomally in E. coli BL21 (DE3) under control of the lacUV5 promoter.

For our experiments, we decided to use PtNTT2 to facilitate the transport of unnatural nucleoside triphosphates into the cell. Our plan is to construct several versions of PtNTT2 under control of the lacUV5 promoter: the native sequence PtNTT2(1-575), two truncated versions, PtNTT2(31-575) and PtNTT2(66-575), and two versions with heterologous signal peptides, PtNTT2-Tat and PtNTT2-pelB. While PtNTT2-Tat features a signal peptide targeting the twin-arginine translocation pathway, PtNTT2-pelB features the signal peptide of pectate lyase B from Erwinia carotovora, which targets the general secretion pathway. By characterizing all of these versions, we hope to identify the one best suited to our purpose, the efficient uptake of unnatural nucleoside triphosphates.
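The residue ranges in names such as PtNTT2(66-575) are 1-based and inclusive. The following Python sketch shows how such truncated variants could be derived from a full-length protein sequence. The sequence used here is a made-up placeholder of 575 residues, not the real PtNTT2 sequence, and the helper name `truncate` is hypothetical.

```python
def truncate(protein: str, start: int, end: int) -> str:
    """Return residues start..end of a protein (1-based, inclusive),
    matching the notation used in names like PtNTT2(66-575)."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError(f"range {start}-{end} outside 1-{len(protein)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return protein[start - 1:end]


# Placeholder only: an arbitrary 575-residue string standing in for PtNTT2.
placeholder = ("ACDEFGHIKLMNPQRSTVWY" * 29)[:575]

# The three signal-peptide-free designs named in the text.
variants = {
    "PtNTT2(1-575)": truncate(placeholder, 1, 575),
    "PtNTT2(31-575)": truncate(placeholder, 31, 575),
    "PtNTT2(66-575)": truncate(placeholder, 66, 575),
}

for name, seq in variants.items():
    print(name, len(seq))
```

The sketch covers only the truncations. The Tat and pelB fusions are left out because the text does not specify the signal peptide sequences or the fusion points.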

Sec and Tat Secretion Pathways

In E. coli, proteins can be secreted via two main pathways, the general secretion (Sec) pathway and the twin-arginine translocation (Tat) pathway. Proteins destined for secretion carry a signal peptide at the N-terminal end of the pre-protein. Signal peptides share a common tripartite structure: a positively charged N-terminal domain (N-domain), followed by a nonpolar, hydrophobic domain (H-domain) and a more polar C-domain (de Keyzer et al., 2003; von Heijne, 1998). Signal peptides vary in length but are usually between 18 and 30 amino acids long. During translocation, the signal peptide is cleaved from the pre-protein by a signal peptidase, which recognizes a cleavage site within the C-domain of the signal peptide (de Keyzer et al., 2003). Signal peptides targeting the Sec pathway do not share a consensus sequence or motif (Caspers et al., 2010). However, the binding of the secretion-specific chaperone SecB or the signal recognition particle (SRP) to the signal peptide, which represents the first step of the Sec pathway, is influenced by the hydrophobicity of the H-domain (de Keyzer et al., 2003). Signal peptides targeting the Tat pathway are usually around 30 amino acids long (Berks et al., 2014). In contrast to Sec signal peptides, Tat signal peptides do feature a consensus sequence. The crucial element for Tat recognition is a twin-arginine motif (RR) located at the junction of the N- and H-domains. In bacteria, the RR is usually followed by phenylalanine, leucine, and lysine, starting at the RR+2 position, with the phenylalanine being especially important (Cline, 2015).
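The twin-arginine consensus described above can be expressed as a simple pattern search. The following Python sketch assumes the strict pattern RR, one arbitrary residue, then F-L-K, and searches only the first 35 residues. Both the window size and the example peptides are assumptions made for illustration. Real Tat signal peptides often deviate from the strict consensus, so a missing match does not rule out Tat targeting.

```python
import re
from typing import Optional

# Strict consensus from the text: RR, any residue at RR+1, then F-L-K from RR+2.
TAT_MOTIF = re.compile(r"RR.FLK")


def find_tat_motif(protein: str, window: int = 35) -> Optional[int]:
    """Return the 1-based position of the first arginine of an RR-x-FLK
    motif within the first `window` residues, or None if there is none.
    The default window of 35 residues is an assumption."""
    match = TAT_MOTIF.search(protein[:window].upper())
    return match.start() + 1 if match else None


# Hypothetical peptides, invented for this example.
tat_like = "MSDKTSRRDFLKAAGLAAVGSALAAPAWA"
no_motif = "MKKLLAVAALLSGCSSA"

print(find_tat_motif(tat_like))
print(find_tat_motif(no_motif))
```

A looser pattern could be substituted for `TAT_MOTIF` to tolerate deviations from the consensus, at the cost of more false positives.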

Sec Pathway

In bacteria, most proteins are secreted via the conserved and essential Sec pathway (Schneewind and Missiakas, 2014; Caspers and Freudl, 2008). The key component of the Sec pathway is the translocase, which is located in the plasma membrane and composed of the heterotrimeric SecYEG protein complex and the ATPase SecA. SecYEG consists of three components: SecY, SecE, and SecG. SecY and SecE form the core of the transmembrane channel, while SecG is an additional integral subunit that does not seem to be essential for translocation (Schneewind and Missiakas, 2014; de Keyzer et al., 2003; Caspers and Freudl, 2008). SecA serves as the molecular motor of the translocation process by repeatedly binding and hydrolyzing ATP, thereby initiating the translocation. Once the process has been initiated by SecA, the proton motive force (PMF) helps to translocate the protein (Caspers and Freudl, 2008). SecYEG can associate with another heterotrimeric complex involved in translocation, consisting of SecD, SecF, and YajC. The exact role of this complex is not yet known, but it seems to optimize secretion (de Keyzer et al., 2003). It was proposed that SecDFYajC might be involved in the PMF-dependent steps of the translocation process (Denks et al., 2014). Proteins are targeted to the translocase in two different ways, the first being mediated by the chaperone SecB and the second by the signal recognition particle (SRP). SecB and SRP both bind the nascent pre-protein but seem to bind pre-proteins of different lengths. Proteins that are to be secreted are usually targeted via the SecB mechanism (de Keyzer et al., 2003). Regardless of which mechanism is involved in targeting, the proteins eventually interact with the translocase. The translocase catalyzes the gradual translocation of the pre-protein, during which the pre-protein is processed by the signal peptidase (SPase). The SPase is embedded in the cytoplasmic membrane and removes the signal peptide of the pre-protein (Denks et al., 2014; de Keyzer et al., 2003).

Figure 5: Schematic overview of the Sec pathway and the components involved.
Pre-proteins are either targeted by SRP or SecB and translocated by the translocase. The translocase consists of SecYEG, forming the transmembrane channel, and SecA, the ATP-driven molecular motor of the translocation. The signal peptide is eventually removed by the SPase (here Lep) (de Keyzer et al., 2003).

Tat Pathway

In contrast to the Sec pathway, which secretes proteins that are not fully folded, the twin-arginine translocation (Tat) pathway secretes fully folded proteins (Cline, 2015). The Tat pathway is an active transport system, with the transmembrane proton motive force driving the translocation (Palmer and Berks, 2012). Three proteins from only two structural families form the machinery needed for Tat-mediated secretion (Berks et al., 2014). Members of the first family, TatA, are described as having a single short N-terminal transmembrane helix and an unstructured C-terminal tail (Berks et al., 2014; Cline, 2015). TatB belongs to the TatA family but was shown to be dispensable for basic functionality. The second structural family involved in the Tat pathway is TatC, which consists of six transmembrane helices with the N- and C-termini exposed to the cytoplasm. The structure of TatC was determined for Aquifex aeolicus and is shown together with the structure of TatA from E. coli in Figure 6 (Berks et al., 2014; Cline, 2015). A fourth component, TatE, has a function that overlaps with that of TatA, which makes TatE dispensable (Kikuchi et al., 2009).

Figure 6: Structures of TatA from E. coli and TatC from Aquifex aeolicus
While TatA forms a single transmembrane helix, TatC consists of six transmembrane helices (Berks et al., 2014).

TatB and TatC form the receptor complex, which is composed of up to eight TatBC units (Cline, 2015). The signal peptide is bound by this receptor complex, which triggers the PMF-dependent assembly of TatA into the translocation pore (Palmer and Berks, 2012; Cline, 2015). The whole complex is referred to as the translocase. After assembly, the substrate is translocated and the signal peptide is cleaved by a signal peptidase, as shown in Figure 7. Many details of the Tat pathway are still unknown or uncertain, including how TatA disassembles (Cline, 2015).

Figure 7: Steps of the Tat translocation system
After the signal peptide is bound by TatBC, TatA assembles and forms the translocation pore. Using the transmembrane proton motive force, the substrate is translocated and the signal peptide is cleaved by a signal peptidase (Palmer and Berks, 2012).

References

Ast, M., Gruber, A., Schmitz-Esser, S., Neuhaus, H.E., Kroth, P.G., Horn, M., and Haferkamp, I. (2009). Diatom plastids depend on nucleotide import from the cytosol. Proc. Natl. Acad. Sci. U. S. A. 106: 3621–3626.
Zhang, Y., Lamb, B.M., Feldman, A.W., Zhou, A.X., Lavergne, T., Li, L., and Romesberg, F.E. (2017). A semisynthetic organism engineered for the stable expansion of the genetic alphabet. Proc. Natl. Acad. Sci. U. S. A. 114: 1317–1322.