Difference between revisions of "Team:Toronto/Protein-Modelling"

(Created page with "<html> <!-- ####################################################### --> <!-- # This html was produced by the igemwiki generator # --> <!-- # https://github.com/igemuoftATG...")
 

Revision as of 18:53, 1 November 2017

Protein Modelling

Introduction

LacILOV is a fusion protein that joins the N-terminal region of E. coli LacI (residues 1-58) to a Light-Oxygen-Voltage (LOV) sensing domain. The helix-turn-helix (HTH) motif of LacI, which functions as its DNA-binding domain, is fused to the N-terminal end of the LOVII domain. This novel protein is released from the lac operon upon exposure to blue light.

The Problem

In the lab, we used a Photo-Reporter assay to measure gene expression under the control of LacILOV. A single colony was picked from a plate and grown overnight in medium in the dark. The overnight culture was then diluted into six tubes of fresh medium, three of which were kept in the dark while the other three were exposed to light. After induction, the cultures were grown for 12 hours before we began measuring fluorescence. According to this assay, our LacILOV gene expression system requires 12 hours of blue-light stimulation before a difference in fluorescence can be detected between cultures grown in the light and in the dark. In other words, the blue light had to stay on for at least 12 hours before LacILOV finally released from the lac operon. We therefore identified a need to engineer a version of LacILOV that does not bind as strongly, in order to fully optimize the function of our switch, and we decided to approach this problem through computational protein modelling.

Using PyRosetta to model LacILOV

To model the structure of LacILOV, we first used PyRosetta, a Python library for in silico protein modelling, to perform protein folding with our own custom scripts. PyRosetta exposes Rosetta’s sampling and scoring functions, so custom structure-prediction protocols can combine protein structure manipulation with energy calculations in Monte Carlo-based simulations. [2] As a validation step, we wrote a script to predict the structure of the LOVII domain, whose crystal structure has already been solved and published in the Protein Data Bank (PDB; PDB ID: 2V1A), the central repository for 3D structures of large biological molecules, including proteins. The structures generated by our scripts were not comparable to the published crystal structure. We therefore decided to use I-TASSER, a well-established protein structure prediction server developed by the Zhang lab. [3][4]
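
One way such a validation comparison could be quantified is with a distance RMSD (dRMSD) between the predicted model and the crystal structure. The sketch below is only an illustration and not the script we used: the function names and the toy C-alpha coordinates are hypothetical, and dRMSD is chosen here because it needs no superposition of the two structures.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return the distances between all pairs of points (i < j)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def distance_rmsd(model_ca, reference_ca):
    """Distance RMSD between two equal-length lists of C-alpha coordinates.

    Compares internal distances only, so the result does not depend on how
    the two structures are positioned or rotated in space.
    """
    if len(model_ca) != len(reference_ca):
        raise ValueError("structures must contain the same number of residues")
    if len(model_ca) < 2:
        raise ValueError("need at least two residues")
    d_model = pairwise_distances(model_ca)
    d_ref = pairwise_distances(reference_ca)
    squared = [(m - r) ** 2 for m, r in zip(d_model, d_ref)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical three-residue toy coordinates (in angstroms), not real data.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 10.0, y - 2.0, z + 5.0) for x, y, z in reference]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

print(distance_rmsd(reference, shifted))  # translation only, so close to zero
print(distance_rmsd(reference, bent))     # different shape, so clearly non-zero
```

Because dRMSD only compares internal distances, it cannot distinguish a structure from its mirror image; a superposition-based C-alpha RMSD would be the usual complement.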

Using I-TASSER to model LacILOV

I-TASSER predicts a structure in three main steps.

In the first step, I-TASSER threads the submitted amino acid sequence through the PDB to retrieve template proteins with similar folds. Threading works by aligning each amino acid of the submitted sequence to a position in a template structure and assessing how well that amino acid fits the template.

In the second step, fragments from the threading-aligned regions are taken from the template structures and assembled into full-length models using Monte Carlo-based simulations, while threading-unaligned regions, and any sequences for which no template structure is found, are built by ab initio modelling. Structure assembly is the most time-consuming step of I-TASSER. It is guided by a composite energy function with three terms: a statistical energy term derived from experimentally solved crystal structures, a template-based energy term from template structures in the PDB, and an optional user-specified restraint. Together these terms give the total energy of one structure. Thousands of structures are generated from the same submitted sequence and are then clustered by structural similarity.

In the third step, the cluster centroids, which are obtained by averaging the coordinates of all structures in a cluster, undergo a second round of structure assembly to refine local geometries and remove steric clashes. The resulting structures are clustered again, and the lowest-energy structures, as scored by the composite energy function described above, are selected. The final full-atom models are then built from the selected structures by adding atomic detail and optimizing the hydrogen-bonding network.
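
The role of the composite energy function in a Monte Carlo search can be sketched in a few lines of Python. This is a deliberately simplified illustration, not I-TASSER's implementation: it uses a plain Metropolis acceptance rule on a one-dimensional toy "conformation", and the weights, energy terms and move function are all hypothetical.

```python
import math
import random


def composite_energy(statistical, template, restraint=0.0, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three energy terms (weights are hypothetical)."""
    w_stat, w_tmpl, w_rest = weights
    return w_stat * statistical + w_tmpl * template + w_rest * restraint


def metropolis_accept(e_old, e_new, temperature, rng):
    """Always accept downhill moves; accept uphill moves with Boltzmann probability."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)


def run_monte_carlo(energy_fn, propose_fn, start, steps, temperature, seed=0):
    """Run a Metropolis search and return the lowest-energy state visited."""
    rng = random.Random(seed)
    current, e_current = start, energy_fn(start)
    best, e_best = current, e_current
    for _ in range(steps):
        candidate = propose_fn(current, rng)
        e_candidate = energy_fn(candidate)
        if metropolis_accept(e_current, e_candidate, temperature, rng):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best


def toy_energy(x):
    """Toy stand-in for a structure's energy; the minimum is at x = 2."""
    return composite_energy(statistical=(x - 2.0) ** 2, template=abs(x - 2.0))


def toy_move(x, rng):
    """Propose a small random perturbation of the current state."""
    return x + rng.uniform(-0.5, 0.5)


best_x, best_e = run_monte_carlo(toy_energy, toy_move, start=10.0,
                                 steps=5000, temperature=0.5)
print(best_x, best_e)
```

In a real assembly simulation the state would be a full set of residue coordinates and each move would change part of the chain, but the accept-or-reject logic has the same shape.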

To evaluate the accuracy of the generated models, I-TASSER calculates a C-score (confidence score) from the structure assembly simulations. The C-score typically ranges from -5 to 2, and a higher C-score signifies a model with higher confidence. A benchmark test has also reported residue-level accuracy, with an average error of less than 1.5 Å relative to X-ray crystallography data, for models with a C-score greater than -1.5. The I-TASSER server has also participated in CASP (Critical Assessment of Techniques for Protein Structure Prediction), a community-wide experiment that evaluates the efficacy of current protein structure prediction techniques. In the past four CASPs, I-TASSER has consistently ranked first in the Server section of the competition [5]. In light of this accuracy and performance relative to other techniques, our team decided to use I-TASSER to predict the structure of LacILOV.
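
As a small illustration of how C-scores can be used to pick a model, the Python sketch below ranks candidate models by C-score and flags values outside the typical range quoted above. The model names and scores are hypothetical placeholders, not results from our runs.

```python
# Typical C-score range quoted above; values outside it are unusual, not invalid.
TYPICAL_RANGE = (-5.0, 2.0)


def rank_models(c_scores):
    """Return (name, score) pairs sorted from highest to lowest C-score."""
    return sorted(c_scores.items(), key=lambda item: item[1], reverse=True)


def outside_typical_range(score):
    """True if a C-score falls outside the typical -5 to 2 range."""
    low, high = TYPICAL_RANGE
    return not (low <= score <= high)


# Hypothetical scores for illustration only.
example_scores = {"model1": -0.62, "model2": -2.41, "model3": -3.05}

ranking = rank_models(example_scores)
best_name, best_score = ranking[0]
print(best_name, best_score)
```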

Figure 1: I-TASSER predicted structure of LacILOV without mutations (left) compared to the crystal structure (2V1A) of the LOVII domain (right). The comparison illustrates the accuracy of I-TASSER in predicting 3D structure.

Important Structures in LacI and LOVII domain

To determine the basis for LacILOV’s unexpectedly high affinity for its promoter, we examined the structures of its two components: the DNA-binding domain of LacI and the LOVII domain. LacI contributes its DNA-binding region, the helix-turn-helix (HTH) motif, to LacILOV. When the HTH motif is bound to DNA, it makes base-specific contacts in the major groove, while α-helices known as “hinge helices” bind deeply in the minor groove. [1] When unbound, the HTH domain is unstructured. The LOVII domain, meanwhile, contains secondary structures that are critical to its response to light. In the dark, the LOVII domain has a long (33 Å) amphipathic Jα helix in its C-terminal region, which is anchored to the core LOVII domain. Upon illumination with blue light, structural changes in the core domain are insignificant, but prominent displacements occur in the middle part of the Jα helix. [6] This disruption of the Jα helix is the primary influence on the overall conformation of the LOVII domain in the presence of blue light.

Although the crystal structures of LacI and the LOVII domain have been solved individually, the structure of our fusion protein, LacILOV, has not yet been elucidated. We therefore began by predicting its structure with I-TASSER. Interestingly, in our best model of LacILOV (the one with the highest C-score), we noticed a novel α-helix spanning the junction of the two components. We speculated that this α-helix is responsible for increasing the affinity of LacILOV for its promoter sequence, so that substantial blue-light exposure is required to release LacILOV from the promoter and initiate transcription of the downstream reporter gene.

Using Foldit Standalone to explore LacILOV

To explore the predicted structure of LacILOV, we used Foldit Standalone, which provides an interactive 3D graphical interface for examining biochemical properties of a protein structure such as energy levels, side chains and, importantly, hydrogen-bond interactions. Intended for research use, Foldit Standalone allows protein structures to be manipulated through configurable visualizations built on the Rosetta molecular modelling package, and it gives access to other Rosetta features such as its energy scoring and sampling functions and support for RosettaScripts. [7]

Using Foldit Standalone, we identified six residues within the α-helix that form hydrogen bonds and thereby contribute to its stability. We then generated a series of single point mutations of the selected residues, including substitutions and deletions, and submitted the mutant sequences to I-TASSER to check whether any of these mutations changed the stability of the α-helix and whether they offered any insight into the function of LacILOV. A library of mutant LacILOV sequences can be found here: [provide link to all the structures that will be uploaded].
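
A library of single point mutants like the one described above could be enumerated with a short Python script. The sketch below is an assumption about how this might be done, not our actual workflow: the helper names are hypothetical and the seven-residue sequence is a toy placeholder rather than the LacILOV sequence.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def substitute(sequence, position, new_residue):
    """Replace the residue at a 1-based position."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]


def delete(sequence, position):
    """Remove the residue at a 1-based position."""
    index = position - 1
    return sequence[:index] + sequence[index + 1:]


def single_point_mutants(sequence, positions):
    """Map labels such as 'G3A' or 'G3del' to the corresponding mutant sequence."""
    mutants = {}
    for position in positions:
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the sequence")
        original = sequence[position - 1]
        for residue in AMINO_ACIDS:
            if residue != original:
                label = f"{original}{position}{residue}"
                mutants[label] = substitute(sequence, position, residue)
        mutants[f"{original}{position}del"] = delete(sequence, position)
    return mutants


# Toy placeholder sequence, not the real LacILOV sequence.
toy_sequence = "MKGTTLE"
library = single_point_mutants(toy_sequence, [3, 4])
print(len(library))
print(library["G3del"])
```

Note that deleting either residue of an identical adjacent pair (such as the two T residues in the toy sequence) yields the same sequence under two different labels, so a real library would benefit from de-duplication before submission.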

Mutated LacILOV Sequences

After assessing the different mutants with I-TASSER, we determined that G58 and T60-T61 are the key residues for stabilizing the α-helix. When either G58 or T60-T61 was deleted from the LacILOV sequence, I-TASSER predicted a disrupted α-helix. Interestingly, these deletions also resulted in a more disordered DNA-binding domain (Figure 2).

Collectively, we propose that G58 and T60-T61 contribute to the drastic increase in the DNA-binding affinity of LacILOV. Our findings suggest that deleting these key residues could weaken LacILOV’s binding strength, yielding a more practical LacILOV that is more sensitive to light and therefore a more efficient system for the Photo-Reporter assay.

Figure 2: I-TASSER predicted structure of LacILOV without mutations (left) compared to LacILOV with G58 deletion (middle) and LacILOV with T60-T61 deletions (right). Note the disruption of the α-helical structure.
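
The visual comparison in Figure 2 could also be expressed numerically, for example by comparing the helical content of a window of residues between the unmutated and mutant models. The Python sketch below assumes per-residue secondary-structure strings (with 'H' marking helix, in the style of a DSSP assignment) are available for each model; the strings, window and threshold shown are hypothetical and are not our results.

```python
def helix_fraction(ss, start, end):
    """Fraction of residues labelled 'H' in the 1-based inclusive window [start, end]."""
    if not 1 <= start <= end <= len(ss):
        raise ValueError("window must lie within the secondary-structure string")
    window = ss[start - 1:end]
    return window.count("H") / len(window)


def helix_disrupted(wild_type_ss, mutant_ss, start, end, min_drop=0.5):
    """True if helical content in the window drops by at least min_drop.

    The same window is applied to both strings; for deletion mutants the
    window may need shifting to account for the changed residue numbering.
    """
    drop = helix_fraction(wild_type_ss, start, end) - helix_fraction(mutant_ss, start, end)
    return drop >= min_drop


# Hypothetical secondary-structure strings for illustration only.
wild_type = "CCHHHHHHHHCC"
mutant = "CCHHCCCCCCC"

print(helix_fraction(wild_type, 3, 10))
print(helix_fraction(mutant, 3, 10))
print(helix_disrupted(wild_type, mutant, 3, 10))
```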

Future Directions

In the future, knowing which residues are important within LacILOV’s structure, we can modify its current sequence to generate new sequences carrying either the deletion of residue 58 or the deletion of residues 60-61. We can then perform Photo-Reporter assays on these mutated sequences and compare the results with the original to validate whether these mutations actually improve the sensitivity of LacILOV, and whether these residues are indeed critical to LacILOV stability.

Conclusion

Our use of computational modelling supports the improvement and optimization of LacILOV, as it allowed us to gather information on which structures and residues are implicated in LacILOV’s function. We used the computational tools I-TASSER and Foldit Standalone to achieve this. I-TASSER allowed us to predict a reliably accurate structure of LacILOV, which we then examined for significant changes relative to the original LacI and LOVII crystal structures in the PDB. After identifying a novel α-helix that emerges where the helix-turn-helix (HTH) domain of LacI is fused to LOVII, we used Foldit Standalone to directly manipulate LacILOV’s structure and observe which residues contribute to the formation of this new α-helix. We found that residues 58 and 60-61 were specifically implicated, as deleting these residues disrupted the new α-helix. We now have mutated sequences of LacILOV that can be tested in assays to see whether they result in a better, more sensitive LacILOV that is less stable when bound to the promoter region. Overall, our use of computational tools allowed us to design candidate versions of LacILOV that are intended to be better optimized for its critical function in our project’s light-activated CRISPR switch.

References

  1. Schumacher MA, Choi KY, Zalkin H, Brennan RG. 1994. Crystal structure of LacI member, PurR, bound to DNA: minor groove binding by alpha helices. Science. 266(5186): 763-770. doi.org/10.1126/science.7973627
  2. Chaudhury S, Lyskov S, Gray JJ. 2010. PyRosetta: a script-based interface for implementing molecular modelling algorithms using Rosetta. Bioinformatics. 26(5): 689–691. doi.org/10.1093/bioinformatics/btq007
  3. Zhang Y. 2008. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 9: 40. doi.org/10.1186/1471-2105-9-40
  4. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nature Methods. 12(1): 7-8. doi.org/10.1038/nmeth.3213
  5. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. 2015. The I-TASSER Suite: protein structure and function prediction. Nature Methods. 12(1): 7-8. doi.org/10.1038/nmeth.3213
  6. Halavaty AS, Moffat K. 2007. N- and C-terminal flanking regions modulate light-induced signal transduction in the LOV2 domain of the blue light sensor phototropin 1 from Avena sativa. Biochemistry. 46(49): 14001-14009. doi.org/10.1021/bi701543e
  7. Kleffner R, Flatten J, Leaver-Fay A, Baker D, Siegel JB, Khatib F, Cooper S. 2017. Foldit Standalone: a video game-derived protein structure manipulation interface using Rosetta. Bioinformatics. 33(17): 2765-2767. doi.org/10.1093/bioinformatics/btx283