The design step mutates the binding pocket and its immediate environment and repacks the remaining scaffold. In addition, a badness-of-fit score is generated, which indicates how well the mutated pocket fits the amino acid. For every file from the matching step, a scored model and a “.pdb”-file are generated. The “.pdb”-file contains the sequence and the 3D structure, which can then be analyzed. Notably, the structure of the amino acid can be extracted separately.
The following section describes the structure of the design step. Further details on each step are listed under TECHNICAL DETAILS.
1. Optimizing the catalytic interactions
The residues to be designed or repacked can be specified in two ways: with a “.res”-file or by automatic detection.
For the first alternative, the “.res”-file can either follow the Rosetta standard or be created manually. For more details, we refer to the Rosetta documentation (link: https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d1/d97/resfiles.html).
For the latter alternative, residues are categorized automatically by the position of their Calpha atom relative to the ligand.
TECHNICAL DETAILS
- residues that have their Calpha within a distance cut1 (in angstroms) of any ligand heavy atom are set to designable
- residues that have their Calpha within a distance cut2 of any ligand heavy atom and their Cbeta closer to that ligand atom than the Calpha are set to designable. cut2 has to be larger than cut1
- residues that have their Calpha within a distance cut3 of any ligand heavy atom are set to repackable. cut3 has to be larger than cut2
- residues that have their Calpha within a distance cut4 of any ligand heavy atom and their Cbeta closer to that ligand atom than the Calpha are set to repackable. cut4 has to be larger than cut3
- all residues not in any of the above 4 groups are kept static.
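The four rules above can be sketched in a few lines of Python. This is only an illustration of the categorization logic as described here, not Rosetta's own implementation. The function name, the coordinate tuples and the treatment of residues without a Cbeta (for example glycine, which here never counts as pointing towards the ligand) are assumptions made for this example.

```python
import math

def classify_residue(ca, cb, ligand_heavy_atoms, cuts=(6.0, 8.0, 10.0, 12.0)):
    """Return 'designable', 'repackable' or 'static' for one residue.

    ca, cb: (x, y, z) coordinates of Calpha and Cbeta (cb may be None).
    ligand_heavy_atoms: list of (x, y, z) coordinates of ligand heavy atoms.
    cuts: (cut1, cut2, cut3, cut4) in angstroms, strictly increasing.
    """
    cut1, cut2, cut3, cut4 = cuts
    if not cut1 < cut2 < cut3 < cut4:
        raise ValueError("cuts must satisfy cut1 < cut2 < cut3 < cut4")

    # For every ligand heavy atom, store the Calpha distance and whether
    # the Cbeta is closer to that atom than the Calpha.
    pairs = []
    for atom in ligand_heavy_atoms:
        d_ca = math.dist(ca, atom)
        points_in = cb is not None and math.dist(cb, atom) < d_ca
        pairs.append((d_ca, points_in))

    # Rules 1 and 2: designable.
    if any(d <= cut1 or (d <= cut2 and p) for d, p in pairs):
        return "designable"
    # Rules 3 and 4: repackable.
    if any(d <= cut3 or (d <= cut4 and p) for d, p in pairs):
        return "repackable"
    # Rule 5: everything else stays fixed.
    return "static"
```

With a single ligand atom at the origin, a residue whose Calpha is 7 angstroms away is designable only if its Cbeta points towards the ligand. Otherwise it falls into the repackable group.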
2. Cycles of sequence design and minimization within constraints
To optimize the structure, we applied an iterative optimization algorithm. This algorithm mutates all residues that are not part of the catalytic center to alanine, and a short energy minimization then places the ligand in an optimal position relative to the backbone.
For this approach, bb_min and chi_min allow for backbone flexibility and rotation of the torsion angles. An alternative to this minimization step is Monte Carlo rigid-body ligand sampling. For further information on this method, we refer to the Rosetta documentation (https://www.rosettacommons.org/manuals/archive/rosetta3.5_user_guide/d6/dbc/enzyme_design.html).
Design step inputs
The following input files are relevant for the design procedure:
- “.pdb”-file generated in the matching step
- “.cst”-file for the ligand
- “.params”-file for the ligand and the scaffold
- “.flags”-file to coordinate the inputs
For further information on these files, please refer to step 2 above.
Design step outputs
The output for the design step is a “.pdb”-file containing the mutated scaffold and a “.score”-file.
For every PDB-file, a line in the score-file is generated, so it is easy to evaluate the given structure.
The first score in the file is the total score of the model. After that, the number of hydrogen bonds in the protein as a whole and in the constraints is listed, followed by the number of buried unsatisfied polar atoms in the catalytic residues as well as in the whole protein and in the constraints.
See the technical details below for a full overview of the output information.
TECHNICAL DETAILS
total_score: energy (excluding the constraint energy)
fa_rep: full atom repulsive energy
hbond_sc: hbond sidechain energy
all_cst: all constraint energy
tot_pstat_pm: pack statistics, 0-1, 1 = fully packed
tot_nlpstat_pm: pack statistics without the ligand present
tot_burunsat_pm: buried unsatisfied polar residues, higher = more buried unsat polars (just a count)
tot_hbond_pm: total number of hbonds
tot_NLconts_pm: total number of non-local contacts (two residues form a non-local contact if they are farther than 8 residues apart in sequence but interact with a Rosetta score lower than -1.0)
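Because each model contributes one line to the score-file, the models can be ranked with a short script. The following Python sketch assumes that the score-file is a whitespace-separated table whose first line holds the column names and whose last column, here called description, names the model. The model names and all numbers in the sample are invented for illustration, so check the layout of your own file before using it.

```python
def parse_score_file(text):
    """Parse a whitespace-separated score table with one header line."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-empty line holds the column names
            continue
        row = {}
        for name, value in zip(header, fields):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value  # non-numeric column, e.g. the model name
        rows.append(row)
    return rows

def rank_models(rows, key="total_score"):
    """Sort models by a score column; lower Rosetta energies are better."""
    return sorted(rows, key=lambda row: row[key])

# Hypothetical sample data, not taken from our runs.
SAMPLE = """\
total_score all_cst tot_burunsat_pm description
-310.5 0.8 12 model_a
-342.1 1.3 9 model_b
-298.7 0.2 15 model_c
"""

ranked = rank_models(parse_score_file(SAMPLE))
best = ranked[0]["description"]
```

Sorting by a different column, for example all_cst, only requires passing another key to rank_models.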
We chose our synthetases on the basis of a good total score and a good ligand score. We inspected the corresponding PDB-files and rated the ligand and the binding pocket as satisfactory, meaning that the ligand presumably does not clash with residues in its immediate environment.
The total scores for CBT are not as good as the scores for NPA. However, the ligand scores are acceptable in both cases. A visual evaluation confirms that the ligand fits into the binding pocket.
Our results for this step
We used this algorithm to simulate the evolution of the tyrosyl-tRNA synthetase with the amino acids nitrophenylalanine (NPA) and CBT-ASP.
NPA simulation:
We created one .cst-file-block for the nitro group of NPA. Since the nitro group contains two oxygen atoms, we defined two atom name tags. Because several partners are plausible, we defined two possible constraint partners for the hydrogen bonds: the first is asparagine (N) or glutamine (Q), and the second is glycine (G). We set the distance to 2.8 Å, the optimal distance for hydrogen bonds, with a tolerance of 0.5 Å. We set the angles to 120° with a tolerance of 40°, as recommended by Florian Richter during our talk in Cologne. The torsion angles were set to 180° with a tolerance of 180° and a penalty of 0, so that they can rotate completely freely.
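To make these numbers concrete, the following Python sketch checks whether a measured geometry lies inside the tolerance windows given above (2.8 ± 0.5 Å, 120° ± 40°, 180° ± 180°). It only illustrates the windows and does not reproduce Rosetta's constraint scoring. The function names are hypothetical.

```python
def angular_deviation(value_deg, target_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(value_deg - target_deg) % 360.0
    return min(diff, 360.0 - diff)

def distance_ok(distance, target=2.8, tolerance=0.5):
    """True if a hydrogen-bond distance (in angstroms) is within tolerance."""
    return abs(distance - target) <= tolerance

def angle_ok(angle_deg, target=120.0, tolerance=40.0):
    """True if an angle (in degrees) is within tolerance of the target."""
    return angular_deviation(angle_deg, target) <= tolerance

def torsion_ok(torsion_deg, target=180.0, tolerance=180.0):
    """True if a torsion is within tolerance of the target.

    With a tolerance of 180 degrees every torsion is accepted, which is
    what "rotate completely freely" means in practice.
    """
    return angular_deviation(torsion_deg, target) <= tolerance
```

A distance of 3.2 Å and an angle of 150° pass these checks, whereas 3.4 Å or 170° do not. Every torsion value passes.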
CBT-ASP simulation:
CBT-ASP can form hydrogen bonds in two ways. The first is a weak hydrogen bond at the sulfur atom, and the second is a normal hydrogen bond at the nitrogen (N2) after the C-gamma. We wrote three .cst-files: one for a possible bond with sulfur, one for a possible bond with nitrogen, and one for both bonds. As possible partner amino acids, we chose serine, threonine, tyrosine, asparagine, glutamine, and glycine.
We recommend writing a “.flags”-file, because several input parameters have to be defined, but it is also possible to pass them on the command line.
For the categorization of the scaffold, we chose the automatic determination and set the following cuts: cut1: 6 Å, cut2: 8 Å, cut3: 10 Å and cut4: 12 Å, as commonly used by the Baker lab.
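As an illustration, a “.flags”-file with these cut values could be assembled with a small Python helper. The flag names below (-s, -extra_res_fa, -enzdes::cstfile, -enzdes::detect_design_interface and -enzdes::cut1 to cut4) follow our reading of the Rosetta 3.5 enzyme design documentation and should be checked against the Rosetta version in use. The helper function and all file names are hypothetical.

```python
def build_design_flags(pdb_file, cst_file, params_file,
                       cuts=(6.0, 8.0, 10.0, 12.0)):
    """Return the text of a flags file for automatic interface detection.

    Flag names are assumptions based on the Rosetta 3.5 documentation;
    verify them before running a real design job.
    """
    cut1, cut2, cut3, cut4 = cuts
    if not cut1 < cut2 < cut3 < cut4:
        raise ValueError("cuts must satisfy cut1 < cut2 < cut3 < cut4")
    lines = [
        f"-s {pdb_file}",                      # scaffold from the matching step
        f"-extra_res_fa {params_file}",        # ligand parameters
        f"-enzdes::cstfile {cst_file}",        # constraint file
        "-enzdes::detect_design_interface",    # automatic categorization
        f"-enzdes::cut1 {cut1}",
        f"-enzdes::cut2 {cut2}",
        f"-enzdes::cut3 {cut3}",
        f"-enzdes::cut4 {cut4}",
    ]
    return "\n".join(lines) + "\n"

# Hypothetical file names, used only for this example.
flags_text = build_design_flags("match_1.pdb", "npa.cst", "NPA.params")
```

Writing flags_text to a file keeps all parameters of a run in one place, which makes the run easier to repeat than a long command line.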