|
|
Line 131: |
Line 131: |
| <div class="third double"> | | <div class="third double"> |
| <div class ="article"> | | <div class ="article"> |
− | As part of our iGEM project, we are faced with the challenge of adapting the tRNA synthetase to non-canonical amino acids. For this purpose, modelled possible candidates for synthetases as a preparation for carrying out a positive-negative selection according to Schulz [] in the laboratory. | + | As part of our iGEM project, we are faced with the challenge of adapting the tRNA synthetase to non-canonical amino acids. For this purpose, we modelled possible candidates for synthetases in preparation for carrying out a positive-negative selection in the laboratory according to Liu <i>et al.</i> (2007). |
| | | |
| Due to the rapid development in the field of protein and molecular structure analysis, there has been an increase in the availability of molecular 3D structure data. These data are organized in publicly available databases which provide a foundation for the modeling and simulation of chemical-biological processes in bioinformatics. As we synthesized our non-canonical amino acid ourselves, no such comprehensive information is available for it yet. However, information on similarly structured amino acids can potentially serve as a basis for our modeling. | | Due to the rapid development in the field of protein and molecular structure analysis, there has been an increase in the availability of molecular 3D structure data. These data are organized in publicly available databases which provide a foundation for the modeling and simulation of chemical-biological processes in bioinformatics. As we synthesized our non-canonical amino acid ourselves, no such comprehensive information is available for it yet. However, information on similarly structured amino acids can potentially serve as a basis for our modeling. |
Line 144: |
Line 144: |
| <h4>Method</h4> | | <h4>Method</h4> |
| <div class ="article"> | | <div class ="article"> |
− | We used the open-source software "Rosetta" for the main part of our modeling project, which was introduced at the University of Washington by David Baker in 1997, initially in the context of protein structure prediction. Since then, Rosetta has grown to include numerous modules and is currently widely used in research. In our application, we focus on the Rosetta module called the "Rosetta Enzyme Design Protocol" | + | We used the open-source software <a href="https://www.rosettacommons.org/docs/latest/getting_started/Getting-Started">"Rosetta"</a> for the main part of our modeling project. Rosetta was introduced at the University of Washington by David Baker in 1997 (Simons <i>et al.</i>, 1997), initially in the context of protein structure prediction. Since then, Rosetta has grown to include numerous modules and is currently widely used in research. In our application, we focus on the Rosetta module called the "Rosetta Enzyme Design Protocol".
| | | |
| </div> | | </div> |
Line 199: |
Line 199: |
| <article class="hidden-block"> | | <article class="hidden-block"> |
| <ul> | | <ul> |
− | <li> “.params”-file: </br> | + | <li> <a href="https://www.rosettacommons.org/demos/latest/tutorials/prepare_ligand/prepare_ligand_tutorial">“.params”-file</a>: </br> |
− | A conformer ensemble has to be generated using information about the ligand, as the non-canonical amino acids are not generally available in databases like PDB, making it necessary to build them manually using tools like pymol, Avogadro or Chemdraw. Using these tools, files can be saved in the desired format. The ligand needs to be specified in the “.sdf”, “.mol” or “.mol2” file format. Such a file can be obtained automatically by converting the relevant information from a “.pdb” file, if available. This conversion process usually also involves augmenting the data with hydrogen atoms in case they are missing from the “.pdb” file. Alternatively, the ligand can be designed using SMILES or manually using tools such as Avogadro, as we did. In the next step, the ligand file is used to create a conformer ensemble that is in turn used to create a Rosetta parameter (“.params”) file. In addition to the specific names of all atoms present in the ligand, this parameter file also stores all bonds between the individual atoms, including the binding angles and binding distances. Rosetta cannot generate the conformer ensemble by itself, so an additional tool is needed. Different tools are capable of creating the conformer ensemble automatically, but it is best to manually define constraints for the chi1, chi2 and backbone psi torsion angles that define the orientation of the ligand in the binding pocket. For this, we know of three tools: The first is OpenEye Omega, but the full license is very costly and the free version is hard to obtain. The second tool is Accelrys Discovery Studio, but Accerlys does not provide a free license. The third tool is TINKER, which is free, but poorly documented and depends on a specific keyfile, which requires a high amount of chemical expertise to generate. Conformers might also be generated without constrains, for which different tools are available, in our case, we used ConFlex. 
Conformers need to be stored in one file (“.sdf”, “.mol”, or “.mol2”). | + | A conformer ensemble has to be generated using information about the ligand, as the non-canonical amino acids are not generally available in databases like <a href="http://www.rcsb.org/pdb/home/home.do">PDB</a>, making it necessary to build them manually
| + | using tools like <a href="https://pymol.org/2/">PyMOL</a>, <a href="https://avogadro.cc/">Avogadro</a> or <a href="http://www.cambridgesoft.com/software/overview.aspx">ChemDraw</a>. Using these tools, files can be saved in the desired format. The ligand needs to be specified in the “.sdf”, “.mol” or “.mol2” file format. Such a |
| + | file can be obtained automatically by converting the relevant information from a “.pdb” file, if available. This conversion process usually also involves augmenting the data with hydrogen atoms |
| + | in case they are missing from the “.pdb” file. Alternatively, the ligand can be designed using SMILES or manually using tools such as Avogadro, as we did. In the next step, the ligand file is used |
| + | to create a conformer ensemble that is in turn used to create a Rosetta parameter (“.params”) file. In addition to the specific names of all atoms present in the ligand, this parameter file also |
| + | stores all bonds between the individual atoms, including the bond angles and bond lengths. Rosetta cannot generate the conformer ensemble by itself, so an additional tool is needed. |
| + | Different tools are capable of creating the conformer ensemble automatically, but it is best to manually define constraints for the chi1, chi2 and backbone psi torsion angles that define the |
| + | orientation of the ligand in the binding pocket. For this, we know of three tools: The first is OpenEye Omega, but the full license is very costly and the free version is hard to obtain. |
| + | The second tool is Accelrys Discovery Studio, but Accelrys does not provide a free license. The third tool is TINKER, which is free, but poorly documented and depends on a specific keyfile, |
| + | which requires a high amount of chemical expertise to generate. Conformers can also be generated without constraints, for which different tools are available; in our case, we used ConFlex. |
| + | Conformers need to be stored in one file (“.sdf”, “.mol”, or “.mol2”). |
| <li> “.pdb”-file: </br> | | <li> “.pdb”-file: </br> |
− | The input-file for the scaffold, in our case the tRNA synthetase, can be downloaded in PDB format from Protein Data Bank (PDB). It is then necessary to delete the natural ligand from the PDB-file, as we need to incorporate our own aaRS. and,Additionally, it is advised to relax the preferably, the structure should be relaxedin order to allow for flexibility with regards to the simulation outcomes. For further details, see the (documentation: https://www.rosettacommons.org/docs/latest/application_documentation/structure_prediction/relax.) | + | The input file for the scaffold, in our case the tRNA synthetase, can be downloaded in PDB format from the Protein Data Bank (PDB). It is then necessary to delete the natural ligand from the PDB file,
| + | as we need to incorporate our own ligand into the aaRS. Additionally, it is advisable to relax the structure in order to allow for flexibility with regard to the simulation outcomes.
| + | For further details, see the documentation (https://www.rosettacommons.org/docs/latest/application_documentation/structure_prediction/relax).
| <li> “.cst”-file: </br> | | <li> “.cst”-file: </br> |
− | The .cst-file defines the potential hydrogen bonds between the ligand and the amino acid. For example, the code block characterized by the tags “CST::BEGIN” and “CST::END”, specifies the orientation or catalytic function of the enzyme. </br> | + | The .cst-file defines the potential hydrogen bonds between the ligand and the amino acid. For example, the code block delimited by the tags “CST::BEGIN” and “CST::END” specifies the orientation or
− | More specifically, the first record of the block begins with “TEMPLATE::ATOM_MAP”, followed by either “atom_name” or “atom_type”, depending on whether a specific residue or a specific type of residue is provided. In the latter case, it is not important to choose specific atoms. Instead, a catalytic residue of the amino acid such as “OH” or “Nhis” is specified. The next lines of the TEMPLATE::ATOM_MAP record define the residues using one-letter or three-letter-codes that are prefixed by “residue1” or “residue3”, respectively. | + | catalytic function of the enzyme. </br> |
− | The second record, beginning with the tag “CONSTRAINT”, contains all relevant distance, angle and torsion constraints for the matching. Each constraint is described with five parameters. In the case of the distance constraint, the first parameter describes the optimal distance “x0” between the chosen residues, the second parameter describes the tolerance “xtol”, the third parameter defines the strength “k” and the fourth parameter specifies the type of bond (1 for a covalent bond, 0 otherwise). If the modulus of the difference between the actual distance “x” and the specified optimal distance is smaller than the tolerance, then the penality score is zero. Otherwise, the constraint consists of the term | + | More specifically, the first record of the block begins with “TEMPLATE::ATOM_MAP”, followed by either “atom_name” or “atom_type”, depending on whether a specific residue or a specific type of residue |
| + | is provided. In the latter case, it is not important to choose specific atoms. Instead, a catalytic residue of the amino acid such as “OH” or “Nhis” is specified. The next lines of the TEMPLATE::ATOM_MAP |
| + | record define the residues using one-letter or three-letter-codes that are prefixed by “residue1” or “residue3”, respectively. |
| + | The second record, beginning with the tag “CONSTRAINT”, contains all relevant distance, angle and torsion constraints for the matching. Each constraint is described with five parameters. |
| + | In the case of the distance constraint, the first parameter describes the optimal distance “x0” between the chosen residues, the second parameter describes the tolerance “xtol”, |
| + | the third parameter defines the strength “k” and the fourth parameter specifies the type of bond (1 for a covalent bond, 0 otherwise). If the modulus of the difference between the actual distance “x” |
| + | and the specified optimal distance is smaller than the tolerance, then the penalty score is zero. Otherwise, the constraint contributes the term
| k* ( |x - x0| - xtol ) | | k* ( |x - x0| - xtol )
| to the penalty score. For the angle and torsion constraints, the description is similar. | | to the penalty score. For the angle and torsion constraints, the description is similar. |
| If necessary, additional hydrogen bonds to other atoms of the ligand are specified in terms of additional blocks, using the tag “VARIABLE::CST”. | | If necessary, additional hydrogen bonds to other atoms of the ligand are specified in terms of additional blocks, using the tag “VARIABLE::CST”. |
− | Finally, most of the blocks described above can be optionally followed by an “ALGORITHM_INFO” record that stores details of the matching algorithm by parameter values. We refer to the Rosetta documentation for further details. | + | Finally, most of the blocks described above can be optionally followed by an “ALGORITHM_INFO” record that stores details of the matching algorithm by parameter values. |
| + | We refer to the Rosetta documentation for further details. |
| <li> “.pos”-file: </br> | | <li> “.pos”-file: </br> |
| The “.pos” file contains the allowed locations in the scaffold for the chosen catalytic residues in each constraint block of the “.cst” file. | | The “.pos” file contains the allowed locations in the scaffold for the chosen catalytic residues in each constraint block of the “.cst” file. |
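To make the distance-constraint penalty described for the “.cst” file more concrete, the following is a minimal sketch in Python. It is not part of Rosetta and not the code we ran; the names `DistanceConstraint`, `parse_distance_constraint` and `penalty` are hypothetical, and the `distanceAB:` keyword and the example values are assumptions used for illustration only. Only the first four parameters described above (x0, xtol, k and the bond type) are read.

```python
from dataclasses import dataclass


@dataclass
class DistanceConstraint:
    x0: float       # optimal distance between the chosen atoms
    xtol: float     # tolerance around the optimal distance
    k: float        # strength of the constraint
    covalent: bool  # bond type: True for 1 (covalent), False for 0


def parse_distance_constraint(line: str) -> DistanceConstraint:
    """Parse one line such as 'CONSTRAINT:: distanceAB: 2.00 0.50 100.00 1'.

    The line layout is an assumption for this sketch; any fields after
    the fourth number are ignored.
    """
    fields = line.split()
    if len(fields) < 6 or fields[0] != "CONSTRAINT::" or fields[1] != "distanceAB:":
        raise ValueError(f"not a distance constraint line: {line!r}")
    x0, xtol, k = (float(value) for value in fields[2:5])
    return DistanceConstraint(x0, xtol, k, covalent=(fields[5] == "1"))


def penalty(constraint: DistanceConstraint, x: float) -> float:
    """Return 0 inside the tolerance, otherwise k * (|x - x0| - xtol)."""
    excess = abs(x - constraint.x0) - constraint.xtol
    return constraint.k * excess if excess > 0 else 0.0


if __name__ == "__main__":
    cst = parse_distance_constraint("CONSTRAINT:: distanceAB: 2.00 0.50 100.00 1")
    for distance in (2.2, 3.0):
        print(distance, penalty(cst, distance))
```

A measured distance within the tolerance of the optimum gives a penalty of zero, while larger deviations are penalised linearly with the strength k.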
Line 355: |
Line 374: |
| <h3> References </h3> | | <h3> References </h3> |
| | | |
− | <!--<b>Liu, W., Brock, A., Chen, S., Chen, S., & Schultz, P. G. </b>,(2007). Genetic incorporation of unnatural amino acids into proteins in mammalian cells. Nature methods,<b> 4(3)</b>, 239-244.--> | + | <b>Liu, W., Brock, A., Chen, S., Chen, S., Schultz, P. G. </b>(2007). Genetic incorporation of unnatural amino acids into proteins in mammalian cells. Nature methods,<b> 4(3)</b>, 239-244.<br> |
− | <b>Richter, F., Leaver-Fay, A., Khare, S. D., Bjelic, S., Baker, D. </b>(2011). De novo enzyme design using Rosetta3. PloS one,<b> 6(5)</b>: e19230. | + | <b>Richter, F., Leaver-Fay, A., Khare, S. D., Bjelic, S., Baker, D. </b>(2011). De novo enzyme design using Rosetta3. PloS one,<b> 6(5)</b>: e19230.<br> |
| + | <b>Simons, K. T., Kooperberg, C., Huang, E., Baker, D.</b> (1997). Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. Journal of molecular biology, <b>268(1)</b>, 209-225.<br> |
| </div> | | </div> |
| <div class="bevel bl"></div> | | <div class="bevel bl"></div> |