Team:Bielefeld-CeBiTec/Results/translational system/library and selection

Library and Selection

Short Summary


The incorporation of a non‑canonical amino acid (ncAA) requires a tRNA/aminoacyl‑tRNA synthetase (tRNA/aaRS) pair that accepts the ncAA and charges the tRNA with it. Therefore, our aim was to generate a library of aaRS variants with different amino acid binding sites. The aaRS, based on the wild-type Methanococcus jannaschii tyrosyl‑tRNA synthetase, is orthogonal to the tRNA/aaRS pairs of E. coli and suitable for incorporating novel ncAAs. To generate a library with randomly mutated amino acid binding sites, we constructed a template plasmid with an optical control and cloned the tyrosyl‑tRNA synthetase library by annealing single-stranded, randomized oligonucleotides. The library consists of approximately 150,000 library plasmids, including more than 27,672 variants of the tyrosyl‑tRNA synthetase. The tyrosyl‑tRNA synthetase library is the basis for positive and negative selection cycles, from which an optimally adapted tyrosyl‑tRNA synthetase variant can be obtained.

Generating the library

One approach to obtaining an orthogonal tRNA/aminoacyl-tRNA synthetase (tRNA/aaRS) pair is based on directed evolution. To obtain a large number of tRNA/aaRS variants, a library of aaRS variants with randomized codons at key positions can be used. This library serves as a basis for selection processes or screenings to identify an optimal tRNA/aaRS pair. The library we built (BBa_K2201411) is based on the pSB1C3 plasmid, with the tyrosyl-tRNA synthetase (TyrRS) under control of a PglnS promoter inserted between the iGEM BioBrick prefix and suffix. For vector construction, we used the primers 17jj and 17jk to amplify the pSB1C3 backbone and the primers 17hq and 17ht to amplify TyrRS.

Figure 1: Library plasmid pSB1C3 containing the tyrosyl tRNA synthetase.
Library plasmid based on pSB1C3, containing the tyrosyl tRNA synthetase under control of a PglnS promoter, a pMB1 origin of replication and a chloramphenicol resistance.

For the generation of the tyrosyl-tRNA synthetase library, we used a method based on the dimerisation of two randomized oligomers that overlap at their 3’ ends. Using Klenow fragment, a commercially available fragment of Escherichia coli DNA polymerase I that features a 5’-3’ polymerase activity and a 3’-5’ exonuclease activity (Beese et al., 1993), the partially double-stranded dimer formed by the oligomers is filled in to a full double strand. This small dsDNA fragment containing the randomized positions can then be integrated into the target sequence using Gibson assembly.
If a marker sequence is inserted at the position that is to be randomized, its replacement by the randomized fragment can be screened easily. We used an mRFP (BBa_J04450) under control of a lac promoter, lac operator and rrnB T1 terminator as an optical control. The primers 17vi and 17vj were designed with overlaps homologous to the sequence around the binding pocket region of the synthetase, allowing precise insertion into the TyrRS. The positions of TyrRS chosen for randomization are Asp158, Ile159 and Leu162, which lie at the center of the binding pocket. The mRFP is inserted into the TyrRS in place of this binding site to function as an optical control. If the randomized DNA double strand is incorporated into the synthetase, the colony color changes due to the absence of the mRFP, which makes it easy to pick E. coli cells containing the randomized library plasmids for further positive and negative selection.
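The fill-in of two oligomers that overlap at their 3’ ends can be illustrated with a minimal Python sketch. The sequences, the function names and the minimum overlap length are hypothetical placeholders chosen for illustration; they are not the primers used in this project.

```python
# Sketch of the Klenow fill-in: two oligos (both written 5'->3') anneal at
# their 3' ends and are extended to one double strand.
# All sequences below are hypothetical, not the project's primers.

# Complement table including the degenerate bases N (any) and K (G/T) <-> M (A/C)
COMPLEMENT = str.maketrans("ACGTNKM", "TGCANMK")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a (possibly degenerate) sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def fill_in(forward: str, reverse: str, min_overlap: int = 6) -> str:
    """Return the top strand of the dsDNA obtained after fill-in.

    `forward` and `reverse` are both given 5'->3'. The 3' end of `forward`
    must match the reverse complement of the 3' end of `reverse`.
    """
    reverse_on_top = reverse_complement(reverse)
    longest = min(len(forward), len(reverse_on_top))
    for k in range(longest, min_overlap - 1, -1):
        if forward[-k:] == reverse_on_top[:k]:
            return forward + reverse_on_top[k:]
    raise ValueError("oligos do not overlap at their 3' ends")


# Hypothetical oligos: the forward oligo carries two NNK codons.
forward_oligo = "GCTNNKNNKGCACTGAGG"
reverse_oligo = reverse_complement("CTGAGGTTCATT")
product = fill_in(forward_oligo, reverse_oligo)
print(product)
```

In this toy case the six-base overlap `CTGAGG` is shared, so the filled-in top strand is the forward oligo followed by the non-overlapping remainder of the second oligo.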

Figure 3: Generating a synthetase library by using oligo dimers and mRFP as an optical control.
Two oligomers, one with a randomized codon at a given position, are designed to form a dimer (1), which is completed to dsDNA using Klenow fragment. The region of tyrRS to be randomized is replaced by mRFP as an optical control (2). When the randomized dsDNA is incorporated, the mRFP is replaced, so the incorporation is directly visible.



Generating a tyrosyl-tRNA synthetase (TyrRS) library using the NNK scheme for the randomization of three positions of the binding pocket leads to a large variety of different sequences. When three codons are randomized using the NNK scheme, a total of 32,768 different sequence variants can be obtained. Considering the degeneracy of the genetic code, i.e. several codons coding for the same amino acid, this results in 8,000 different possible amino acid sequences, each influencing the structure of the binding pocket of TyrRS. To assess the number of colonies needed for each of the 32,768 possible variants to be represented at least once in the library, we used the free statistical software R (R Core Team, 2015) and determined that a library size of 393,447 plasmid-carrying colonies satisfies this criterion.
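The variant counts above can be reproduced by enumerating the NNK codons against the standard genetic code, as in the following Python sketch. The two coverage helpers at the end are an assumption added for illustration: they use a textbook equal-probability model (Poisson approximation and coupon-collector expectation) and are not necessarily the calculation that was carried out in R.

```python
import math
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [a + b + c for a, b, c in product("ACGT", "ACGT", "GT")]
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids = encoded - {"*"}  # the amber stop codon TAG is also part of NNK

N_POSITIONS = 3  # Asp158, Ile159, Leu162
dna_variants = len(nnk_codons) ** N_POSITIONS
protein_variants = len(amino_acids) ** N_POSITIONS


def expected_missing(n_variants: int, n_clones: int) -> float:
    """Expected number of variants never sampled (Poisson approximation,
    assuming every variant is equally likely)."""
    return n_variants * math.exp(-n_clones / n_variants)


def expected_clones_for_full_coverage(n_variants: int) -> float:
    """Coupon-collector expectation: mean number of clones needed until
    every variant has been seen at least once."""
    return n_variants * sum(1.0 / i for i in range(1, n_variants + 1))


print(len(nnk_codons), len(amino_acids), dna_variants, protein_variants)
print(round(expected_missing(dna_variants, 393_447), 2))
```

Under these assumptions, `expected_missing` can be used to check how many variants a library of a given size is likely to lack. Real libraries deviate from the equal-probability model because of synthesis and cloning bias.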

We generated the TyrRS library with Gibson Assembly and, after transformation, plated the cells on LB plates with chloramphenicol. Altogether, we obtained more than 130,000 colonies. Thanks to the optical control in the non-randomized template plasmid, we could easily identify the negative colonies. As depicted in Figure 4, approximately 48 of 1,310 colonies did not contain the randomized TyrRS library plasmid. Extrapolating from this count, we obtained approximately 125,000 library plasmids out of 130,000 colonies, corresponding to a cloning efficiency of 96.34 % and offering a wide diversity of TyrRS variants.
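The extrapolation is simple arithmetic and can be written out as a short Python sketch. The variable and function names are chosen here for illustration; the input numbers are the counts reported above.

```python
def cloning_efficiency(negative_colonies: int, counted_colonies: int) -> float:
    """Fraction of counted colonies that carry the randomized plasmid."""
    return 1 - negative_colonies / counted_colonies


# Counts reported in the text: 48 red (template) colonies among 1,310 counted.
efficiency = cloning_efficiency(48, 1310)

# Extrapolate to the whole library of about 130,000 colonies.
total_colonies = 130_000
library_clones = round(total_colonies * efficiency)

print(f"efficiency: {efficiency:.2%}, library clones: {library_clones}")
```

Because the count comes from a sample of 1,310 colonies, the extrapolated figure is an estimate and is best quoted as a rounded number.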

Figure 4: Tyrosyl-tRNA/synthetase library on LB-plate with chloramphenicol.
The library was generated using two primers, one with a randomized position, which are designed to form a dimer. This dimer is completed to dsDNA by the Klenow fragment. As an optical control, an mRFP is inserted at the position to be randomized and is then replaced by the dsDNA. On this basis, we could see that 3.66 % of the cells had taken up the template plasmid, whereas most cells contained the plasmid with the incorporated randomized dsDNA.

Analyzing the tyrosyl tRNA/aminoacyl-synthetase library

Sequencing by Sanger

To assess the diversity of our library, which at that time comprised 70,000 colonies, we used Sanger sequencing as a first test. The library was grown in four replicates; plasmid DNA was isolated from each replicate and subjected to Sanger sequencing (Figure 3). The results confirmed that the library shows a random NNK distribution at the randomized positions.


Figure 3: Chromatograms of the TyrRS library by Sanger sequencing.
Four TyrRS library replicates were sequenced in forward and reverse direction by Sanger sequencing. The positions 158, 159 and 162 of the TyrRS are randomized using the NNK scheme.

The different signal intensities are caused by the characteristics of the labeling fluorophores in Sanger sequencing. In general, four-colour labeling of the dNTPs reduces the signal-to-noise ratio because the fluorescence emission spectra overlap. This results in shorter and less accurate reads (Middendorf et al., 2008). In addition, excitation with a single wavelength compromises the sequencing results, because the fluorophores used have widely varying absorption and fluorescence emission spectra (500–800 nm) but are all excited by a laser at approximately 488 nm (Pfeufer et al., 2015).

In the chromatograms depicted in Figure 3, the maximal fluorescence intensity of thymine is approximately 75 % to 90 % lower than the maximal fluorescence intensity of guanine. In comparison, the maximal fluorescence signal of guanine reaches up to 97 % of the maximal fluorescence signal of cytosine.

Illumina Sequencing

Figure 4: Starting the Illumina MiSeq sequencing.

To obtain more meaningful sequencing data that allow us to determine the total number of sequence variants in our library, we used the Illumina Next Generation Sequencing (NGS) technique. Illumina NGS is based on the binding of ssDNA fragments to adapters fixed on the surface of the flow cell. Using a process called bridge amplification, the fragments are amplified in situ, forming a small cluster of identical DNA fragments for each initial fragment. For sequencing, the primer, polymerase and four types of labeled reversible terminator dNTPs are added to the flow cell. By excitation with a laser after every incorporation, the nucleotide incorporated in each cluster can be detected through its specific fluorescence. By repeating this cycle numerous times, the sequence of bases in each cluster, and thus of each initial fragment, can be determined accurately.



We generated primers containing the necessary adapter sequences as well as a unique index to separate our sequences from other libraries in the same run. After amplification, the PCR product was purified from a 1 % agarose gel. The quality of the amplified library was checked using the Agilent BioAnalyzer with a High Sensitivity DNA chip. This technology uses capillary electrophoresis for sensitive quantification and sizing of DNA fragments. The electropherogram of the Agilent BioAnalyzer High Sensitivity DNA Assay (Figure 5) shows our amplified library fragment as the largest peak with a length of 563 bp, flanked by the two markers (35 bp and 10,380 bp). Based on the reference peaks, the concentration of the library was determined as 2,874.13 pg/µL (7,735.3 pmol/L).
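The relation between the mass concentration and the molar concentration reported above can be checked with a short Python sketch. It assumes the common approximation of 660 g/mol per base pair of dsDNA; the function name is hypothetical, and the instrument software may use slightly different constants.

```python
# Assumption: average molar mass of one dsDNA base pair (common approximation).
AVERAGE_BP_MASS = 660.0  # g/mol per base pair


def pg_per_ul_to_pmol_per_l(conc_pg_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA mass concentration (pg/µL) to pmol/L."""
    grams_per_litre = conc_pg_per_ul * 1e-12 / 1e-6  # pg -> g, µL -> L
    mol_per_litre = grams_per_litre / (AVERAGE_BP_MASS * length_bp)
    return mol_per_litre * 1e12  # mol -> pmol


# Values reported for the library peak: 2,874.13 pg/µL at 563 bp.
molarity = pg_per_ul_to_pmol_per_l(2874.13, 563)
print(f"{molarity:.1f} pmol/L")
```

With these inputs the result lies within a few pmol/L of the reported 7,735.3 pmol/L; the small difference is consistent with rounding of the fragment size.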

The MiSeq sequencing delivered a total of 1,782,403 paired reads. Using a pattern search, the region containing the NNK motif could be identified in 1,650,024 reads. Of the 32,768 possible sequence variants, 30,440 were found at least once. To determine the minimal coverage needed to verify a variant, we also searched with an NNN motif, which delivered an additional 2,529 reads for 1,323 motif variants, 1,181 of which were found only once. Since such variants cannot originate from the NNK design, reads observed only once cannot be distinguished from errors, and we therefore consider all variants found with the NNK motif at a coverage of 1 as possible false instances. Excluding these 2,768 possible false instances from the initial set of 30,440 different variants results in a total library diversity of at least 27,672 different sequence variants.
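The pattern search and the coverage filter described above can be sketched in Python as follows. The flanking and spacer sequences are hypothetical placeholders (the real sequences around positions 158, 159 and 162 are not given here), and the function names are chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical constant sequences around the randomized codons.
LEFT_FLANK = "GCTATT"   # placeholder upstream of codon 158
SPACER = "GCAGTT"       # placeholder for the two fixed codons 160 and 161
RIGHT_FLANK = "CATTAT"  # placeholder downstream of codon 162


def motif_regex(third_base: str) -> re.Pattern:
    """Regex capturing the three randomized codons; `third_base` lists the
    bases allowed at the third codon position ("GT" for NNK)."""
    codon = f"[ACGT]{{2}}[{third_base}]"
    return re.compile(
        f"{LEFT_FLANK}({codon})({codon}){SPACER}({codon}){RIGHT_FLANK}"
    )


NNK_PATTERN = motif_regex("GT")
NNN_PATTERN = motif_regex("ACGT")


def count_variants(reads, pattern) -> Counter:
    """Count how often each 9-base variant (three codons) occurs."""
    counts = Counter()
    for read in reads:
        match = pattern.search(read)
        if match:
            counts["".join(match.groups())] += 1
    return counts


def confident_variants(counts: Counter, min_coverage: int = 2) -> set:
    """Keep only variants seen at least `min_coverage` times."""
    return {variant for variant, n in counts.items() if n >= min_coverage}


def make_read(c158: str, c159: str, c162: str) -> str:
    """Build a toy read around three codons (for demonstration only)."""
    return "AA" + LEFT_FLANK + c158 + c159 + SPACER + c162 + RIGHT_FLANK + "CC"


toy_reads = [
    make_read("GAT", "ATT", "CTG"),
    make_read("GAT", "ATT", "CTG"),
    make_read("AAG", "TTT", "GGT"),  # valid NNK variant, seen once
    make_read("GAC", "ATT", "CTG"),  # third base C: matches NNN but not NNK
]
nnk_counts = count_variants(toy_reads, NNK_PATTERN)
nnn_counts = count_variants(toy_reads, NNN_PATTERN)
print(len(nnk_counts), len(confident_variants(nnk_counts)))
print(set(nnn_counts) - set(nnk_counts))
```

Variants that match only the NNN pattern serve as an estimate of the error background, which motivates discarding NNK variants with a coverage of 1.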

Considering that we continued the generation of the library after sequencing, nearly doubling the number of clones, we assume the tyrosyl-tRNA/synthetase library to be larger than the analyzed 27,672 different sequences and 8,000 peptides.



Figure 5: Electropherogram of the tyrosyl-tRNA synthetase library.
The Agilent BioAnalyzer High Sensitivity DNA Assay was used for the measurement. The library fragments are depicted as the peak in the center (563 bp), flanked by the markers.

Figure 6: Gel image of the Agilent BioAnalyzer High Sensitivity DNA Assay of the tyrosyl-tRNA synthetase library.



We were not allowed to submit the complete library. Therefore, we submitted two versions of the base library plasmid (BBa_K2201400, BBa_K2201411), from which teams can generate their own library.
In addition, the complete library is available to all future iGEM teams upon request.

Selection

Screening the whole library enzymatically for its ability to incorporate the desired ncAAs would be too time-consuming and costly. We therefore decided to create a high-throughput method for selecting the clones that incorporate the target ncAA. The selection system is based on two selection steps that have to be repeated several times. In the first step, the positive selection, all clones that incorporate an amino acid in response to the amber codon survive. The second step, the negative selection, selects for the specificity of the tRNA/aminoacyl-synthetases. For both selection steps, the library is co-transformed with the selection plasmid in pSB3T5 to prevent incompatibility with the library plasmid in pSB1C3. The plasmid maps of the selection plasmids are shown below as BioBricks (Figure 7 and Figure 8).

The positive selection plasmid (BBa_K2201900) contains the Methanococcus jannaschii based tRNA (CUA) with an anticodon for the amber codon under the constitutive promoter PproK. The essential part for the selection is the kanamycin resistance with two amber codons behind the translation start. If the tRNA/aminoacyl-synthetase mutant, encoded on a co-transformed library plasmid, is able to charge the tRNA (CUA) with any amino acid, the cell can express the kanamycin resistance. Thus, these cells survive when plated on LB agar plates containing the ncAA and kanamycin. Our goal was to generate a tRNA/synthetase pair that is able to incorporate 2‑nitro‑L‑phenylalanine, which is used for the photocleavage of the polypeptide backbone.
For the first round of selection, we co-transformed the library plasmid (BBa_K2201411) with the positive selection plasmid (BBa_K2201900) and cultivated the cells on LB plates with kanamycin and 2‑nitrophenylalanine (2‑NPA). To avoid additional pressure on the cells, we did not use tetracycline or chloramphenicol for the cultivation, as expression of the kanamycin resistance depends on the presence of both the library plasmid and the positive selection plasmid.

We combined the positive selection plasmid with an enhancement system (BBa_K2201373), containing a T3 RNA polymerase with a reversed mRFP under T3 promoter control. With this system, mRFP is expressed in colonies that still carry the positive selection plasmid, which turns them red. This made it easy to identify, during the negative selection, the clones carrying the positive instead of the negative selection plasmid. As can be seen in Figure 10, the transformation efficiency of the positive selection plasmid is low compared with that of the library plasmids, resulting in a single false colony carrying the positive selection plasmid.

After the positive selection, we obtained approximately 800 colonies, indicating that many of our generated TyrRS variants are able to bind either 2‑NPA or a canonical amino acid. We washed these colonies off the plates and isolated the plasmids to use them for the negative selection cycle.

Figure 7: Positive selection plasmid BBa_K2201900 .
Positive selection plasmid for the incorporation of ncAAs. The positive selection plasmid contains a tRNA and a kanamycin resistance with two amber codons. When it is co-transformed with the library of tyrosyl‑tRNA synthetases with randomly mutated binding sites, only those clones survive on kanamycin that can charge the tRNA with any amino acid in response to the UAG codon.

Figure 8: Negative selection plasmid BBa_K2201901.
Negative selection plasmid against the incorporation of endogenous amino acids. The negative selection plasmid contains a tRNA with the anticodon for the amber codon and a barnase containing amber codons at permissive sites. In the negative selection, the target amino acid is not supplemented to the medium. If the co-transformed clones from the positive selection charge the tRNA with endogenous amino acids, the cells die. This provides a selection method for highly specific aaRS.

In the negative selection cycle, every TyrRS variant that incorporates any of the endogenous amino acids causes cell death. For this purpose, the negative selection plasmid (BBa_K2201901) contains barnase, a toxin for E. coli. Two amber codons are incorporated at permissive sites of the barnase, and the plasmid contains the same tRNA (CUA) as the positive selection plasmid. In contrast to the positive selection, the cells are plated on agar that does not contain the ncAA. Thus, every aaRS variant that accepts a canonical amino acid charges the tRNA (CUA) with it. These cells express the barnase and die.
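The logic of the two selection steps can be summarized as a small truth-table model in Python. This is a deliberately simplified sketch: the class and function names are hypothetical, each variant is reduced to two yes/no properties, and effects such as charging efficiency or loss-of-function mutations are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Simplified aaRS variant described by two yes/no properties."""
    name: str
    charges_ncaa: bool       # accepts the target ncAA (e.g. 2-NPA)
    charges_canonical: bool  # accepts at least one endogenous amino acid


def survives_positive(variant: Variant) -> bool:
    """ncAA and kanamycin present: any amber suppression gives resistance."""
    return variant.charges_ncaa or variant.charges_canonical


def survives_negative(variant: Variant) -> bool:
    """No ncAA present: suppression with a canonical amino acid expresses
    barnase and kills the cell."""
    return not variant.charges_canonical


def select(variants):
    """Variants that pass one positive and one negative selection step."""
    return [v for v in variants if survives_positive(v) and survives_negative(v)]


library = [
    Variant("specific", charges_ncaa=True, charges_canonical=False),
    Variant("promiscuous", charges_ncaa=True, charges_canonical=True),
    Variant("wild-type-like", charges_ncaa=False, charges_canonical=True),
    Variant("inactive", charges_ncaa=False, charges_canonical=False),
]
print([v.name for v in select(library)])
```

In this model only the variant that accepts the ncAA and rejects all canonical amino acids passes both steps, which is the behaviour the selection is designed to enrich.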

We co-transformed the library plasmids from the positive selection cycle with the negative selection plasmid (BBa_K2201901) and cultivated the cells on LB-plates with tetracycline and chloramphenicol to be certain to retain both plasmids.





Figure 9: Remaining colonies during the positive selection for 2-NPA, containing the positive selection plasmid and the library plasmid.
The remaining cells carry an aaRS that is able to bind a non-canonical or endogenous amino acid.

Figure 10: Remaining colonies during the negative selection, after the positive selection for 2-NPA, containing the negative selection plasmid and the library plasmid.
The remaining cells contain an aaRS that does not bind an endogenous amino acid. Red colonies contain, as indicated by the enhancement system, a positive selection plasmid carried over from the plasmid isolation and can be easily removed for further selection rounds.

Because barnase is expressed when a canonical amino acid is incorporated, only cells carrying an aaRS that is specific enough not to bind an endogenous amino acid survived. Accordingly, we obtained fewer than 100 colonies after the negative selection, a loss of more than 80 % of the aaRS candidates obtained from the positive selection.
As survival during the negative selection cycle can also be caused by loss-of-function mutations, repeating the positive and negative selection for at least three cycles is normally necessary to prevent false positive results. With the enhancement system, only one round of selection is necessary, because the cells containing the positive selection plasmid can be easily identified by color.

Outlook – Additional Possible Selection Systems

We could show that our high-throughput selection method does indeed work and yields promising results. After generating our synthetase library, we performed the described selection process to acquire library mutants with the highest specificity for a given non-canonical amino acid while retaining orthogonality with the corresponding tRNA. Our selection is performed with positive and negative selection plasmids carrying an antibiotic resistance and a toxic gene with amber stop codons, respectively. We used a kanamycin resistance gene for the positive selection plasmid (BBa_K2201900) and a barnase gene for the negative selection plasmid (BBa_K2201901) (Figure 7 and 8). Both plasmids had a low copy origin of replication (pSB3T5) so as not to burden the cells with overexpression of the genes contained on the plasmid.

While designing our selection method, we also constructed multiple additional parts to build further selection plasmids. These plasmids could not be assembled due to time constraints, but we provide the iGEM community with the basic and composite parts we built for them. Hence, any team wanting to perform selection processes for incorporating non-canonical amino acids can assemble selection plasmids based on its needs. In the following, we briefly describe the parts and suggest possible combinations for the positive and negative selection plasmid, respectively.

The most important part needed for both selection plasmids is the tRNAtyr (BBa_K2201408). It is the orthogonal tRNA pair to the synthetase library (BBa_K2201409) without any amber codons. This coding sequence can be adapted to be used with the promoter of choice and the amber codons at a desired location. Furthermore, we provide the kanamycin resistance with two amber codons (BBa_K2201410) under the control of an araBAD promoter (BBa_K808000).

Another approach for positive selection is visual confirmation. This can be achieved with a fluorescent protein. Based on Santoro et al., we provide a T7 RNA polymerase (T7 RNAP) and a GFP under control of the T7 promoter (Santoro et al., 2002). The T7 RNAP (BBa_K2201405) is under control of the inducible araBAD promoter (BBa_K808000). We also provide the coding sequence of T7 RNAP (BBa_K2201403) from the E. coli KRX strain without amber codons, to be adapted as needed. The required GFP under the T7 promoter can already be found in the Parts Registry (BBa_I746909). We reversed the orientation of this part (BBa_K2201404) to reduce the probability of the GFP being expressed due to read-through from the promoter of the endogenous RNA polymerase.

Figure 11: Possible Positive Selection Plasmid based on a Resistance Gene. The positive selection plasmid based on resistance contains the tRNAamber and the kanamycin resistance gene with two amber codons (S133Am and S154Am). The parts are shown here in pSB1C3.

Figure 12: Possible Positive Selection Plasmid based on Fluorescence. The positive selection plasmid based on fluorescence contains the tRNAamber, the T7 RNA polymerase with an amber codon and uvGFP under control of the T7 promoter. The parts are shown here in pSB1C3.

All parts for the positive selection could also be combined and used simultaneously (Liu and Schultz, 2010).

Figure 13: Possible Combined Positive Selection Plasmid based on a Resistance Gene and Fluorescence. This positive selection plasmid contains the tRNAamber, the T7 RNA polymerase with an amber codon, uvGFP under control of the T7 promoter and the kanamycin resistance gene with two amber codons (S133Am and S154Am). The parts are shown here in pSB1C3.

For the negative selection plasmid, we provide only barnase as a toxic gene, because ccdB is no longer allowed in the iGEM competition (Umehara et al., 2012). The coding sequence of barnase contains two amber codons (BBa_K2201406) (Liu and Schultz, 1999). We also combined the coding sequence with the inducible araBAD promoter (BBa_K808000) and reversed the whole sequence. We did this, again, to reduce the probability of read-through from the promoter of the endogenous RNA polymerase.

Figure 14: Possible Negative Selection Plasmid. The negative selection plasmid contains the tRNAamber and barnase with two amber codons (Q4Am and D46Am). The barnase is under control of an araBAD promoter. The parts are shown here in pSB1C3.

Interestingly, the question has been raised whether library selection is influenced by the selection plasmids and selection protocols used (Zhao and Arnold, 1997; Umehara et al., 2012; Guo et al., 2014). Umehara et al. could show that different selection plasmids influence the mutation profile, yielding different synthetases than previously reported for a given library. Recently, the Romesberg lab demonstrated that a combination of positive selection and deep sequencing was enough to yield functional and specific synthetases (Zhang et al., 2017). During her stay in Berlin, Olga learned that the antibiotic concentration in the successive positive selection steps also influences the selection, because the incorporation efficiencies of the synthetase mutants differ. Generally, the concentrations of the main antibiotics (backbones) are kept constant throughout the selections, but the concentration of the antibiotic for the resistance gene containing the amber codons is raised with every positive selection step. This is done because the incorporation efficiency of the synthetases is assumed to increase with every selection step. Further research in this diverse area might provide compelling results for future projects.

We hope our part collection aids coming iGEM teams in constructing their own selection processes.

References

Beese, L.S., Derbyshire, V., and Steitz, T.A. (1993). Structure of DNA Polymerase I Klenow Fragment Bound to Duplex DNA.
Guo, L.-T., Wang, Y.-S., Nakamura, A., Eiler, D., Kavran, J.M., Wong, M., Kiessling, L.L., Steitz, T.A., O’Donoghue, P., and Söll, D. (2014). Polyspecific pyrrolysyl-tRNA synthetases from directed evolution. Proc. Natl. Acad. Sci. U. S. A. 111: 16724–16729.
Liu, C.C. and Schultz, P.G. (2010). Adding New Chemistries to the Genetic Code. Annu. Rev. Biochem. 79: 413–444.
Liu, D.R. and Schultz, P.G. (1999). Progress toward the evolution of an organism with an expanded genetic code. Proc. Natl. Acad. Sci. 96: 4780–4785.
Middendorf, L.R., Humpfrey, P.G., Narayanan, N., and Roemer, S.C. (2008). Chapter 8: Sequencing Technology. Wiley.
Neumann, H., Peak-Chew, S.Y., and Chin, J.W. (2008). Genetically encoding N(epsilon)-acetyllysine in recombinant proteins. Nat. Chem. Biol. 4: 232–234.
Pfeufer, V. and Schulze, M. (2015). Laser fluorescence powers sequencing advances. BioOptica World.
Santoro, S.W., Wang, L., Herberich, B., King, D.S., and Schultz, P.G. (2002). An efficient system for the evolution of aminoacyl-tRNA synthetase specificity. Nat. Biotechnol. 20: 1044–1048.
Umehara, T., Kim, J., Lee, S., Guo, L.-T., Söll, D., and Park, H.-S. (2012). N-Acetyl lysyl-tRNA synthetases evolved by a CcdB-based selection possess N-acetyl lysine specificity in vitro and in vivo. FEBS Lett. 586: 729–733.
Wang, L., Brock, A., Herberich, B., and Schultz, P.G. (2001). Expanding the Genetic Code of Escherichia coli. Science 292: 498–500.
Zhang, Y., Lamb, B.M., Feldman, A.W., Zhou, A.X., Lavergne, T., Li, L., and Romesberg, F.E. (2017). A semisynthetic organism engineered for the stable expansion of the genetic alphabet. Proc. Natl. Acad. Sci. 114: 1317–1322.
Zhao, H. and Arnold, F.H. (1997). Combinatorial protein design: strategies for screening protein libraries. Curr. Opin. Struct. Biol. 7: 480–485.