Team:Bielefeld-CeBiTec/Project/unnatural base pair/preservation system

Retention and Preservation System

Short Summary

The retention and preservation of unnatural nucleoside triphosphates and unnatural bases in the DNA requires a cellular system that acts on multiple levels. On the one hand, degradation of the unnatural nucleoside triphosphates by enzymes such as cytosine deaminase must be prevented so that they can be incorporated into the DNA efficiently. On the other hand, once the unnatural bases are incorporated into the DNA, their loss or alteration has to be prevented. We want to achieve this by adapting CRISPR/Cas9 to detect and prevent mutations of the unnatural bases.

Preservation system using Cas9

Tautomerisation of isoG and hydrolysis of isoC^m lead to loss of the unnatural base pair (UBP), so a system is needed to preserve the UBP on the plasmid. In 2017, Zhang et al. successfully deployed a CRISPR (clustered regularly interspaced short palindromic repeats)-Cas9 system for retention of a UBP. We adapted this conservation system to our UBP and thus used CRISPR/Cas9 to eliminate all plasmid DNA that had lost the UBP.
The nuclease Cas9 is part of the adaptive immune system of Streptococcus pyogenes, where it induces double strand breaks in invading foreign DNA. This enzyme is recruited by a CRISPR RNA (crRNA). The precursor crRNA array consists of direct repeats interspaced by variable sequences called spacers. These spacers are derived from foreign DNA, where the corresponding sequence is called the protospacer, and encode the Cas9 guiding sequence (guide RNA). An auxiliary trans-activating crRNA (tracrRNA) helps to process the precursor crRNA array into an active crRNA that contains the 20 nucleotide guide RNA. The guide RNA binds to the complementary target DNA sequence via Watson‑Crick base pairing. For this binding, the target sequence needs to be located directly upstream of the type II CRISPR-specific 5'-NGG-3' protospacer adjacent motif (PAM). A synthetic chimeric single guide RNA (sgRNA) was designed by fusing crRNA and tracrRNA. In the sgRNA, only the 20 nucleotide guiding sequence needs to be exchanged to target any DNA sequence that is followed by a PAM (Ran et al., 2013 a, b). In prokaryotic cells, the double strand break introduced by Cas9 leads to exonucleolytic degradation of the DNA (Simmon and Lederberg, 1972).
In our case, we envision a retention system in which Cas9 cleaves plasmids at sites where the UBP is absent. This works by using an sgRNA complementary to the DNA sequence without the UBP. In plasmids with the UBP present, the mismatch between isoG/isoC^m and the sgRNA greatly decreases Cas9 activity (Zhang et al., 2017). If the UBP is lost, the sgRNA binds perfectly to the mutated site and Cas9 activity is restored, which leads to degradation of the mutated plasmid. Consequently, the UBP is retained in the plasmid population.

Figure 1: UBP conservation system using Cas9.
sgRNAs are targeted against every DNA sequence that can emerge from UBP loss on a plasmid. A: Loss of the UBP leads to a point mutation. An sgRNA can now bind to the DNA target sequence. Cas9 is recruited and cleaves the plasmid, which is then degraded. B: Plasmids that contain a UBP in the DNA target sequence form a mismatch with every sgRNA. Cas9 does not cleave the plasmid, leading to retention of the UBP.
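
The selection logic of Figure 1 can be sketched in a few lines of Python. This is a minimal, idealised model and not the team's actual design tool: the symbol `X` for the unnatural base, the 20 nucleotide example site and both function names are assumptions made for illustration, and a single mismatch is treated as a complete block of cleavage although the text only states that it greatly decreases Cas9 activity.

```python
NATURAL_BASES = "ACGT"
UBP = "X"  # placeholder symbol for the unnatural base (assumption, not a standard code)


def loss_variants(target: str) -> list[str]:
    """Return every natural sequence that can arise when the UBP is replaced."""
    if target.count(UBP) != 1:
        raise ValueError("expected exactly one UBP position in the target")
    return [target.replace(UBP, base) for base in NATURAL_BASES]


def is_cleaved(plasmid_site: str, guides: list[str]) -> bool:
    """Idealised rule: Cas9 cuts only if some guide matches the site perfectly."""
    return plasmid_site in guides


# Hypothetical 20 nt target site carrying one unnatural base.
site_with_ubp = "GATTACAGATXACAGATTAC"
guides = loss_variants(site_with_ubp)  # one guide per possible point mutation

mutated_site = site_with_ubp.replace(UBP, "A")  # plasmid that lost the UBP
print(is_cleaved(site_with_ubp, guides))  # intact UBP: no perfect match
print(is_cleaved(mutated_site, guides))   # UBP lost: a guide matches
```

Guides are written here in the sense of the targeted DNA sequence, which keeps the comparison a plain string match.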

Because the iGEM competition is based on plasmids that can be submitted to the parts registry, we refrained from creating a novel strain for UBP retention. Consequently, our retention system consists of two plasmids. The first plasmid (BBa_K2201010) carries the nucleotide transporter gene PtNTT2 and cas9 on a pSB1K3 backbone. This plasmid was transformed into E. coli BL21(DE3). Chemically competent cells were then prepared from this strain and transformed with the second plasmid, which carries the UBP and the sgRNAs for recruiting Cas9 on a pSB3C5 backbone.
Since two large plasmids impose considerable metabolic stress on an organism, more efficient UBP retention could be achieved with a newly created E. coli strain that carries PtNTT2 and cas9 in its genomic DNA. We provide the parts required to create a repair template for a knock-in strategy using a CRISPR/Cas9 system. These parts were designed for genomic integration into E. coli BL21(DE3). The genomic knock-in can be done according to the protocol by Cobb et al. (2015) using the pCRISPomyces plasmid system. An sgRNA was designed by searching for a reverse sequence with the constraint N(16)R(4)NGG (Cobb et al., 2015) inside the coding sequence of the arsB gene (Zhang et al., 2017) of the E. coli genome. arsB codes for an arsenic efflux pump membrane protein. As a result, 5' TATTGTTCATAATAGAAGAG 3' turned out to be the suggested guiding sequence with the highest on-target activity score while being unique within the complete genome. The required repair template is designed with 1 kb long flanking sequences and two BBa_B0015 terminators flanking the gene of interest, for example cas9 (Figure 2). To be able to strictly regulate the expression of cas9, we rationally designed an optimized IPTG-inducible promoter, P_{lacO‑tight1}.
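
The N(16)R(4)NGG constraint can be expressed as a regular expression. The sketch below scans both strands of a sequence for such sites. It is only an illustration: the function names and the short toy context (including its TGG PAM) are assumptions, since neither the tool used for the search nor the genomic context of the guide is given here. On-target scoring and the genome-wide uniqueness check are not covered.

```python
import re

# N(16)R(4)NGG: 16 arbitrary bases, 4 purines (A or G), then the NGG PAM.
# The lookahead makes matches zero-width so overlapping sites are all reported.
SITE = re.compile(r"(?=([ACGT]{16}[AG]{4})[ACGT]GG)")


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def candidate_guides(seq: str) -> list[tuple[str, str, int]]:
    """Return (strand, guide, start) for every N16-R4-NGG site on both strands."""
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in SITE.finditer(s):
            hits.append((strand, m.group(1), m.start()))
    return hits


GUIDE = "TATTGTTCATAATAGAAGAG"  # guiding sequence reported in the text
toy = "AA" + GUIDE + "TGG" + "TT"  # hypothetical context; the real PAM is not given
print(candidate_guides(toy))
print(set(GUIDE[-4:]) <= set("AG"))  # the last four bases are purines
```

Start positions on the "-" strand are given relative to the reverse complement, which is enough for listing candidates but would need converting before mapping them back to genome coordinates.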

Figure 2: The repair template (BBa_K2201028) for genomic integration of cas9 into E. coli BL21(DE3).
The 1 kb left flanking sequence (LFS: BBa_K2201021) and right flanking sequence (RFS: BBa_K2201022) are taken from the genome of E. coli BL21(DE3). In the genome, LFS and RFS directly flank the coding sequence of arsB. The strong terminators from BBa_B0015 were used to avoid basal expression and to stop transcription right after cas9. There are composite parts that consist of LFS + terminator + P_{lacO‑tight1} (BBa_K2201024) and terminator + RFS (BBa_K2201025), which can be assembled with any coding sequence of interest to create a repair template. The coding sequence of cas9 is taken from the pCRISPomyces plasmid system by Cobb et al. (2015) and originates from S. pyogenes. It is negatively regulated by the IPTG‑inducible promoter P_{lacO‑tight1}.

The composite BioBrick from Figure 2 needs to be digested with NotI to separate the linear repair template from the pSB1C3 backbone. The repair template, the target plasmid containing the sgRNA and the pCRISPomyces plasmid containing cas9 were then co‑transformed into E. coli BL21(DE3). The genomic integration was verified by sequencing.
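
Before such a digest it is worth checking in silico that NotI cuts only at the sites flanking the insert. The following sketch locates NotI sites (GCGGCCGC, cut after the first GC) on a circular sequence and reports the fragment lengths. The toy plasmid and the function names are assumptions for illustration. The real BBa_K2201028 sequence is not reproduced here.

```python
NOTI = "GCGGCCGC"  # NotI recognition site; the enzyme cuts after the first GC


def noti_cut_positions(circular_seq: str) -> list[int]:
    """0-based index of the first base after each cut on a circular molecule."""
    seq = circular_seq.upper()
    n = len(seq)
    wrapped = seq + seq[: len(NOTI) - 1]  # lets sites spanning the origin be found
    cuts = []
    start = wrapped.find(NOTI)
    while start != -1 and start < n:
        cuts.append((start + 2) % n)
        start = wrapped.find(NOTI, start + 1)
    return sorted(cuts)


def fragment_lengths(circular_seq: str) -> list[int]:
    """Sorted lengths of the linear fragments released by a complete digest."""
    cuts = noti_cut_positions(circular_seq)
    n = len(circular_seq)
    if not cuts:
        return [n]  # no site: the circle stays uncut
    # Distance from each cut to the next one around the circle; a single cut
    # yields one linear fragment of full length.
    return sorted((cuts[(i + 1) % len(cuts)] - c) % n or n for i, c in enumerate(cuts))


# Hypothetical toy plasmid: 30 bp "backbone", 50 bp "insert", two flanking sites.
backbone = "AT" * 15
insert = "AC" * 25
plasmid = backbone + NOTI + insert + NOTI
print(noti_cut_positions(plasmid))
print(fragment_lengths(plasmid))
```

More than two fragments for a real construct would indicate an internal NotI site in the repair template.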

Optimization of the negatively regulating promoter P_lac

We designed a tightly repressed lac promoter called P_{lacO‑tight1} to achieve a low basal transcription rate. The lac operon has been studied extensively since the middle of the last century. This work revealed the role of the Lac repressor and the inducible nature of the lac promoter, which shows a 1000-fold increase in expression in its activated state (Müller et al., 1996). The wild type lac operon consists of the genes lacZ, lacY and lacA, which are transcribed from the lac promoter P_lac into a polycistronic mRNA (Figure 3). These genes code for the proteins β‑galactosidase, Lac permease and Lac transacetylase (Oehler et al., 1994).

Figure 3: DNA map of the wild type lac operon and its transcriptional and translational products (Oehler et al., 1994).

The transcription is constitutively activated by the CAP protein. lacI codes for the tetrameric Lac repressor and is expressed from the promoter P_i. The Lac repressor can bind to the lac operators O1, O2 or O3. O3 is located 92 bp upstream of O1, and O2 is located 401 bp downstream of O1. The inter‑operator distances are counted from the center of O1 to the center of the distal operator. By binding simultaneously to O1 and O2 or to O1 and O3, the Lac repressor forms a DNA loop that negatively controls expression from P_lac (Oehler et al., 1994).

Figure 4: The wild type lac operators and their regulatory structures.
The tetrameric Lac repressor can bind either the lac operators O1 and O3 or O1 and O2 to form a DNA loop. The DNA loops efficiently inhibit the CAP-activated transcription (Oehler et al., 1994).

In 1994, Oehler et al. showed that inactivating O2 at its natural position does not decrease repression by low amounts of tetrameric Lac repressor. Based on this result, we designed our P_{lacO‑tight1} without the lac operator O2. It was also shown that two weak operators result in tighter repression than a single strong operator. This can be explained thermodynamically: a second operator increases the local concentration of the Lac repressor at the neighboring operator. As a consequence, both operators are more likely to be occupied by the Lac repressor, which leads to tighter repression (Oehler et al., 1996). In 1983, Sadler et al. proposed an ideal lac operator, O_id, that binds the Lac repressor 10‑fold more tightly than the natural strong lac operator O1. O_id is an inverted repeat of the left half of O1 (Figure 5).

Figure 5: The lac operator O_id.
The inversion is indicated by the arrow for the perfectly symmetric lac operator O_id. It is the inverted repeat of the left half of O1 (Sadler et al. 1983).
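
The construction of O_id can be reproduced computationally. The sketch below builds a perfectly symmetric operator by joining the left half of O1 to its reverse complement. The 21 bp O1 sequence is the commonly cited one and is an assumption here, because this page does not print the operator sequences. Treating the first 10 bp as the left half is likewise an assumption.

```python
def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Commonly cited O1 sequence (assumption: not taken from this page).
O1 = "AATTGTGAGCGGATAACAATT"

LEFT_HALF = O1[:10]  # left half-site, excluding the central base pair of O1
O_ID = LEFT_HALF + reverse_complement(LEFT_HALF)

print(O_ID)
print(O_ID == reverse_complement(O_ID))  # a perfect inverted repeat reads the same on both strands
```

Under these assumptions the result is 20 bp long, one base pair shorter than O1, because the central base pair of the natural operator has no partner in a perfect inverted repeat.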

The operator‑DNA‑operator complex requires energy for the bending process to form a DNA loop. Additional energy for torsion is required when the two lac operators lie on opposite sides of the DNA helix. Therefore, DNA loop formation is energetically favoured for lac operators that are in phase (Müller et al., 1996). In 1996, Müller et al. investigated the strength of repression for inter‑operator distances between O_id and O1 from 57.5 bp up to 1493.5 bp. The repression values were compared to the repression by a single O1 at its natural position. A spacing shorter than 57.5 bp could not be examined because of the −35 box of the promoter. Phase dependency of the repression was observed for a spacing around 200 bp. This results in periodic maxima of the repression values (Figure 6).

Figure 6: Repression values dependent on inter‑operator distances between O1 and O_id.
The repression values refer to the repression of the chromosomal lacZ gene under the control of O1 at its natural position and O_id at the indicated position. With 50 tetrameric Lac repressors per cell, the repression value is calculated as the specific activity of β‑galactosidase in the absence of active Lac repressor divided by the specific activity of β‑galactosidase in the presence of active Lac repressor. The dashed line shows the repression value for a single natural O1 operator (Müller et al., 1996).
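
As a small worked example of this definition, the ratio can be computed as follows. The activity numbers are purely hypothetical and are not measurements from Müller et al. (1996). The function name is likewise an assumption.

```python
def repression_value(activity_without_repressor: float,
                     activity_with_repressor: float) -> float:
    """Specific β-galactosidase activity without active Lac repressor
    divided by the specific activity with active Lac repressor."""
    if activity_with_repressor <= 0:
        raise ValueError("activity with repressor must be positive")
    return activity_without_repressor / activity_with_repressor


# Hypothetical specific activities in arbitrary units.
unrepressed = 1000.0
repressed = 20.0
print(repression_value(unrepressed, repressed))
```

A higher value means tighter repression. A value of 1 would mean that the repressor has no effect.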

The distance of 70.5 bp showed the strongest repression value which is 50‑fold higher than the natural repression. Repression drops sharply to 15‑fold at a 150.5 bp spacing and to threefold at around 600 bp. All inter‑operator distances beyond 600 bp kept a twofold increased repression value (Müller et al., 1996).
According to these results, our P_{lacO‑tight1} consists of the auxiliary operator O_id with a 70.5 bp spacing to O1 at its natural position. The remaining sequence, including P_lac, was kept as in the natural lac operon of E. coli BL21(DE3) (Figure 7).

Figure 7: Tight lac operon P_{lacO‑tight1}.
The figure with its annotations was created with the software Geneious 10.0.8.

Deletion of codA

To retain the unnatural base pair and keep a sufficient level of unnatural nucleoside triphosphates in the cell, degradation of the unnatural nucleoside triphosphates must be minimized. In E. coli, the gene codA codes for cytosine deaminase, an enzyme of pyrimidine metabolism. Cytosine deaminase catalyzes the conversion of cytosine to uracil. Furthermore, it can catalyze the deamination of isoguanosine and isocytosine. Isoguanosine is formed during oxidative stress by reaction of the reactive oxygen species (ROS) •OH with adenine. Other products of reactions of adenine with ROS include 8-oxoadenine and 6-N-hydroxyaminopurine.

Figure 8: Reactions catalyzed by the cytosine deaminase.
A) The conversion of cytosine to uracil is a normal reaction step within the pyrimidine metabolism. B) Isocytosine is converted into uracil by the cytosine deaminase. C) Isoguanine is formed during oxidative stress. Cytosine deaminase catalyzes the reaction to the non-mutagenic xanthine.

In E. coli, the codA gene is part of the codBA operon. codB codes for cytosine permease, while codA encodes cytosine deaminase. Cytosine deaminase converts cytosine to uracil and ammonia by hydrolytic deamination. This reaction is the only way in which cytosine can be metabolized in E. coli (Danielsen et al., 1992). When isoguanosine and isocytosine are provided to wild type cells, they are deaminated by cytosine deaminase as well (Figure 8). Therefore, it is necessary to knock out or delete the codA gene to ensure the stable availability of isoguanosine and isocytosine.

Figure 9: Arrangement of the codBA operon.
codA is located downstream of codB and is 1.3 kb in size. Both genes overlap by 11 bases. codB codes for cytosine permease, while codA codes for cytosine deaminase.

codA is 1281 bp in size, resulting in a protein composed of 427 amino acids after translation. The protein CodA is located in the cytosol and has a molecular mass of 47.5 kDa.
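
These numbers can be cross-checked with simple arithmetic. The sketch assumes that the 1281 bp do not include the stop codon and uses the rule-of-thumb average of roughly 110 Da per amino acid residue, so the mass is only a rough estimate and not a replacement for the value computed from the actual sequence.

```python
GENE_LENGTH_BP = 1281         # coding sequence length stated in the text
AVG_RESIDUE_MASS_KDA = 0.110  # rule-of-thumb average per residue (assumption)


def residue_count(cds_length_bp: int) -> int:
    """Number of codons in a coding sequence, assuming no stop codon is included."""
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return cds_length_bp // 3


residues = residue_count(GENE_LENGTH_BP)
approx_mass_kda = residues * AVG_RESIDUE_MASS_KDA
print(residues, round(approx_mass_kda, 1))
```

The estimate of about 47 kDa is consistent with the stated 47.5 kDa.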

Figure 10: Crystal structure of cytosine deaminase from Escherichia coli complexed with zinc and phosphono-cytosine.
The structure was determined by X-Ray crystallography with a resolution of 1.71 Å (Hall et al., 2011).

References

Cobb, R.E., Wang, Y., and Zhao, H. (2015). High-efficiency multiplex genome editing of Streptomyces species using an engineered CRISPR/Cas system. ACS Synth. Biol. 4: 723–8.

Danielsen, S., Kilstrup, M., Barilla, K., Jochimsen, B., and Neuhard, J. (1992). Characterization of the Escherichia coli codBA operon encoding cytosine permease and cytosine deaminase. Mol. Microbiol. 6: 1335–1344.

Hall, R.S., Fedorov, A.A., Xu, C., Fedorov, E.V., Almo, S.C., and Raushel, F.M. (2011). Three-Dimensional Structure and Catalytic Mechanism of Cytosine Deaminase. Biochemistry 50: 5077–5085.

Müller, J., Oehler, S., and Müller-Hill, B. (1996). Repression of lac Promoter as a Function of Distance, Phase and Quality of an Auxiliary lac Operator. J. Mol. Biol. 257: 21–29.

Oehler, S., Amouyal, M., Kolkhof, P., von Wilcken-Bergmann, B., and Müller-Hill, B. (1994). Quality and position of the three lac operators of E. coli define efficiency of repression. EMBO J. 13: 3348–3355.

Ran, F.A., Hsu, P.D., Lin, C., Gootenberg, J.S., Konermann, S., Trevino, A.E., Scott, D.A., Inoue, A., Matoba, S., Zhang, Y., and Zhang, F. (2013a). Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell 154: 1380–9.

Ran, F.A., Hsu, P.D., Wright, J., Agarwala, V., Scott, D.A., and Zhang, F. (2013b). Genome engineering using the CRISPR-Cas9 system. Nat. Protoc. 8: 2281–2308.

Sadler, J.R., Sasmor, H., and Betz, J.L. (1983). A perfectly symmetric lac operator binds the lac repressor very tightly. Proc. Natl. Acad. Sci. U. S. A. 80: 6785–9.

Simmon, V.F. and Lederberg, S. (1972). Degradation of bacteriophage lambda deoxyribonucleic acid after restriction by Escherichia coli K-12. J. Bacteriol. 112: 161–9.