Team:Bielefeld-CeBiTec/Model

Modeling

Organization of our modeling projects

On this page, we describe our main modeling project, which was integral to our whole project. Besides this complex model, we also applied several straightforward stochastic and statistical models to support and guide our laboratory work. These smaller modeling projects are briefly described here; for further information, please follow the corresponding link to the part of our project that each model belongs to.
Discriminant function model for the ICG prediction:
We conducted a discriminant function analysis to recognize which base, natural or unnatural, is present at a specified position of a base sequence. This model is part of our ICG model software, found here.
Calculation of an effective library size for the selection system:
We used a combination of combinatorics and statistics to calculate the library size required for the selection process, such that the library is expected to contain every possible sequence mutation, and therefore also every possible resulting amino acid, at least once. This calculation is part of the translational system, found here.
Comparison of mRFP production for the positive selection system (BBa_K2201373):
We modeled and visually compared the mRFP production over time for the normal signaling and the enhanced signaling circuit of the positive selection system. The system and plot can be found here.
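The library size calculation mentioned above can be illustrated with a minimal sketch. It is not the calculation we used in the linked part of the project; it assumes that all variants are equally likely, that each randomized position offers 32 codons (as in an NNK scheme, which is an assumption here), and it uses the standard Poisson approximation for the probability that every variant is present. All function names are hypothetical.

```python
import math


def codon_variants(randomized_positions: int, codons_per_position: int = 32) -> int:
    """Number of distinct DNA variants (assumption: 32 codons per position, as in NNK)."""
    return codons_per_position ** randomized_positions


def expected_coverage(library_size: int, variants: int) -> float:
    """Expected fraction of variants present at least once, for equiprobable variants."""
    return 1.0 - math.exp(library_size * math.log1p(-1.0 / variants))


def size_for_coverage(variants: int, coverage: float) -> int:
    """Smallest library size whose expected coverage reaches `coverage`."""
    return math.ceil(math.log(1.0 - coverage) / math.log1p(-1.0 / variants))


def size_for_full_library(variants: int, confidence: float) -> int:
    """Library size at which all variants are present with probability of about `confidence`.

    Poisson approximation: P(all present) ~ exp(-V * exp(-N / V)),
    solved for N gives N = V * ln(V / -ln(confidence)).
    """
    return math.ceil(variants * math.log(variants / -math.log(confidence)))


variants = codon_variants(2)  # two randomized positions
n_95 = size_for_coverage(variants, 0.95)
n_full = size_for_full_library(variants, 0.95)
print(f"{variants} variants: {n_95} clones for 95 % expected coverage, "
      f"{n_full} clones to contain every variant with about 95 % probability")
```

The sketch shows why a library that should contain every variant at least once has to be several times larger than the number of variants itself.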

Short Summary

As our project explores the possibilities of an expanded genetic code via unnatural bases and non-canonical amino acids, we set out to complement our lab work by modeling novel aminoacyl-tRNA synthetases (aaRS) for a non-canonical amino acid that we synthesized in the lab. To incorporate a non-canonical amino acid into proteins during translation, the aaRS has to attach the amino acid to the respective tRNA. Thus, we designed aaRS sequences intended to link our own non-canonical amino acid to a fitting tRNA. As a result, we obtained a couple of candidate aaRS sequences, which we evaluated based on their ROSETTA scores and ordered via gene synthesis. In practice, our modeling consisted of the following steps:

Step	Software/Method	Meaning
1. Ligand Preparation	Manually via Avogadro	Due to the novelty of our amino acid, no information on the ligand is available in databases. Therefore, we provided all information manually and then generated a conformer ensemble, which contains, for example, all energetically favorable arrangements of the atoms within the molecule.
2. Scaffold categorization	ROSETTA protocol	The scaffold describes the rough layout of the synthetase. We downloaded the scaffold 1j1u, the aaRS of Methanococcus jannaschii, as a template and then relaxed its structure to improve the outcome of the ROSETTA algorithm.
3. Set simulation constraints	Manually via ROSETTA	Constraints on the possible mutations of the synthetase ensure that the generated sequences fit the amino acid. For example, we constrained the distances between certain atoms and the angles between them to ranges optimal for hydrogen bonds.
4. Enzyme Matching	ROSETTA protocol	ROSETTA combines the information about the ligand and the constraints to find possible hydrogen bonding partners and to propose the shape of the scaffold within the set constraints.
5. Enzyme Design	ROSETTA protocol	An algorithm uses the information from the previous step and information on the ligand to simulate the mutation process and generate sequences for optimized scaffolds with corresponding scores as measures of fit.
6. Evaluate results in silico	Manually	We evaluate the visual output and the score values and order the sequences with the most promising results via gene synthesis.
7. Evaluate results in vivo	Manually	The synthetases are validated in the lab with the corresponding ncAA via a positive-negative selection system.

Figure A describes our modeling project as a whole.
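The in silico evaluation in step 6 can be sketched as a simple ranking of designs by score. The snippet below is only an illustration: it assumes a score file in the common Rosetta layout, where every relevant line starts with `SCORE:` and the first such line names the columns, and it assumes the columns `total_score` and `description`. The design names and score values are made up for the example. Lower Rosetta scores indicate more favorable designs.

```python
def parse_score_lines(lines):
    """Parse 'SCORE:'-prefixed, whitespace-separated lines into a list of dicts."""
    header = None
    rows = []
    for line in lines:
        if not line.startswith("SCORE:"):
            continue  # skip comments and other lines
        fields = line.split()[1:]
        if header is None:
            header = fields  # the first SCORE: line holds the column names
            continue
        if len(fields) != len(header):
            continue  # skip malformed rows
        rows.append(dict(zip(header, fields)))
    return rows


def rank_designs(rows, score_column="total_score", top=5):
    """Return the `top` designs as (score, name) pairs, lowest (best) score first."""
    scored = [(float(row[score_column]), row["description"]) for row in rows]
    return sorted(scored)[:top]


# Hypothetical example data, not results from our project.
EXAMPLE = """\
SCORE: total_score description
SCORE: -512.3 design_0001
SCORE: -530.9 design_0002
SCORE: -498.0 design_0003
""".splitlines()

for score, name in rank_designs(parse_score_lines(EXAMPLE), top=2):
    print(f"{name}\t{score}")
```

A ranking like this only narrows down the candidates; the visual inspection of the structures described in step 6 remains a manual step.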

Introduction

Overview

As part of our iGEM project, we faced the challenge of adapting a tRNA synthetase to non-canonical amino acids. For this purpose, we modeled possible synthetase candidates in preparation for a positive-negative selection according to Schulz [] in the laboratory. Due to the rapid development in the field of protein and molecular structure analysis, the availability of molecular 3D structure data has increased. These data are organized in publicly available databases, which provide a foundation for the modeling and simulation of chemical-biological processes in bioinformatics. As we synthesized our non-canonical amino acid ourselves, no such comprehensive information is available for it yet. However, information on similarly structured amino acids can potentially serve as a basis for our modeling. As evaluating an expanded genetic code is a complex task, the practical laboratory work of our project is supplemented by a theoretical approach involving modeling, simulation, and evaluation in silico. Specifically, we focused on simulations aimed at designing an aaRS for the new non-canonical amino acid CBT-ASP. In addition to CBT-ASP, we also simulated the evolution process for the non-canonical amino acid NPA as a validation of our modeling procedure: since synthetases for this ncAA are already known, we can compare them with our in silico results. Our core challenge was to evolve the binding pocket so that the synthetase effectively charges the tRNA with the amino acid and recognizes this amino acid specifically.

Method

For the main part of our modeling project, we used the open-source software "Rosetta", which was introduced at the University of Washington by David Baker in 1997, initially in the context of protein structure prediction. Since then, Rosetta has grown to include numerous modules and is now widely used in research. In our application, we focus on the Rosetta module called the "Rosetta Enzyme Design Protocol".

ROSETTA Enzyme Design

Overview

Since the non-canonical amino acid synthesized in the laboratory is completely novel, there is no corresponding tRNA synthetase yet that can load it onto the tRNA. For this reason, we used the enzyme design protocol to design the binding pocket in a way that makes the synthetase effective and specific. The protocol consists of two main steps: matching and designing. The enzyme design algorithm is summarized in Fig. B.

Figure (2): Flowchart of the Enzyme Design Protocol
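The matching step depends on geometric constraints such as the hydrogen bond distances and angles mentioned in the summary table. The following sketch shows what such a check amounts to for a single donor, hydrogen, and acceptor. It is not Rosetta code and does not use the Rosetta constraint file format; the cutoffs (donor-acceptor distance of at most 3.5 Å, donor-hydrogen-acceptor angle of at least 120°) are common textbook values chosen for illustration, not the values we set, and the coordinates are invented.

```python
import math


def angle_deg(a, b, c):
    """Angle at point b, in degrees, formed by the points a-b-c."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def satisfies_hbond(donor, hydrogen, acceptor, max_distance=3.5, min_angle=120.0):
    """Check illustrative hydrogen bond geometry (assumed cutoffs, coordinates in Angstrom)."""
    close_enough = math.dist(donor, acceptor) <= max_distance
    straight_enough = angle_deg(donor, hydrogen, acceptor) >= min_angle
    return close_enough and straight_enough


# Hypothetical coordinates in Angstrom.
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
print(satisfies_hbond(donor, hydrogen, (2.9, 0.0, 0.0)))  # linear and close
print(satisfies_hbond(donor, hydrogen, (1.0, 2.5, 0.0)))  # close, but bent to 90 degrees
```

In the actual protocol, ranges like these are given to ROSETTA as constraints, and the matcher searches the scaffold for residue placements that satisfy them.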
