Team:CSMU NCHU Taiwan/Modeling

Modeling

Overview

Our project studies MSMEG5998, an enzyme that can break down aflatoxin.
To produce correctly folded MSMEG5998, we fused the enzyme to another protein, Thioredoxin, which can improve the folding of its fusion partner, in our case MSMEG5998. With this fusion protein, we aim to create a protein that degrades aflatoxin with high efficiency.
The modeling project is divided into two parts: protein structure modeling and docking simulation.
First, we built a 3D model that predicts the structure of the fusion protein and indicates whether it is likely to misfold.
Since the active sites of MSMEG5998 toward aflatoxin (the ligand) have not been studied, we predicted where the enzyme binds aflatoxin. We then used the 3D model to simulate this binding position, which helps guide the wet lab experiments on the fusion protein.

Our modeling work covers two aspects:
1. Building the 3D model of the fusion protein
2. Creating a docking simulation of the fusion protein, covering the active sites of Thioredoxin and the binding position of MSMEG5998 with aflatoxin

Protein Structure Modeling

A. Overview
1. The fusion protein combines two functional proteins, MSMEG5998 and Thioredoxin, joined by a linker.
2. The first challenge is that no structure exists for this protein. We only know that MSMEG5998 belongs to the FDR-A family (2). Since there is no experimental structure of MSMEG5998, we built a model for the following purposes:
i. To visualize the three-dimensional structure of the two proteins
ii. To make sure that there is no bonding between the two proteins, which could cause misfolding.

B. First, we used NCBI to retrieve the protein sequences we needed.

C. Next, we inserted a linker between the two proteins.
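The linker step can be sketched in a few lines of Python. This is only an illustration: the sequences below are short placeholders (not the real Thioredoxin, linker, or MSMEG5998 sequences), the function names are hypothetical, and the order (Thioredoxin first, then linker, then MSMEG5998) is an assumption.

```python
def build_fusion(n_term: str, linker: str, c_term: str) -> str:
    """Join two protein sequences with a linker in between (one-letter codes)."""
    for label, seq in (("n_term", n_term), ("linker", linker), ("c_term", c_term)):
        if not seq.isalpha():
            raise ValueError(f"{label} must be a non-empty string of letters")
    return (n_term + linker + c_term).upper()


def domain_ranges(n_term: str, linker: str, c_term: str) -> dict:
    """Return the 1-based, inclusive residue range of each part in the fusion."""
    end_n = len(n_term)
    end_linker = end_n + len(linker)
    end_c = end_linker + len(c_term)
    return {
        "n_term": (1, end_n),
        "linker": (end_n + 1, end_linker),
        "c_term": (end_linker + 1, end_c),
    }


# Placeholder sequences, for illustration only.
TRX_PLACEHOLDER = "MSDK"
LINKER_PLACEHOLDER = "GGGGS"
ENZYME_PLACEHOLDER = "MTEI"

fusion = build_fusion(TRX_PLACEHOLDER, LINKER_PLACEHOLDER, ENZYME_PLACEHOLDER)
ranges = domain_ranges(TRX_PLACEHOLDER, LINKER_PLACEHOLDER, ENZYME_PLACEHOLDER)
print(fusion)
print(ranges)
```

Keeping track of the residue ranges is useful later, because positions in the second protein shift by the length of the first protein plus the linker.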

D. Visualize the fusion protein model
1. Using RaptorX, the structure predicted from the protein sequence can be exported as a PDB file.

2. Visualize the structure using PyMOL.

Docking Modeling

A. Overview
Simulate the binding position of aflatoxin and the fusion protein
To check that our fusion protein is functional, or even performs better, as expected, the team identified the possible active sites of the proteins in our project and then simulated the docking process. By doing so, we expect to observe the performance of the fusion protein and, more importantly, to examine how the new protein improves on the original ones.

Please note that the fusion protein merges two different proteins, MSMEG5998 and Thioredoxin. Therefore, in the following discussion, the docking simulation contains two different models: the “Thioredoxin-Fusion protein” model and the “MSMEG5998-Aflatoxin B2” model.

B. The docking simulation of “Thioredoxin-Fusion protein”
1. Since the structure of Thioredoxin has been studied, we could pin down its active sites using UniProt. The team found two active sites, at positions No. 33 and No. 36 of the sequence.
2. Using NCBI BLAST, the team compared the sequence of the fusion protein with that of Thioredoxin. The team confirmed that the active sites of the fusion protein corresponding to those of Thioredoxin are No. 33 and No. 36, both cysteine (C).

3. The team then constructed a 3D model of the fusion protein and labelled the active sites using PyMOL. The model suggests why Thioredoxin is helpful for protein folding: its active sites do not face away from MSMEG5998.
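The residue check in step 2 can be illustrated with a small Python sketch. It assumes 1-based numbering and uses a made-up toy sequence with cysteines placed at positions 33 and 36; it is not the team's actual sequence or the BLAST comparison itself, and the function names are hypothetical.

```python
def residues_at(sequence: str, positions) -> dict:
    """Return {position: residue} for 1-based positions in a protein sequence."""
    found = {}
    for pos in positions:
        if not 1 <= pos <= len(sequence):
            raise IndexError(f"position {pos} is outside the sequence")
        found[pos] = sequence[pos - 1]  # convert 1-based to 0-based index
    return found


def all_cysteine(sequence: str, positions) -> bool:
    """True if every listed position holds a cysteine (C)."""
    return all(res == "C" for res in residues_at(sequence, positions).values())


# Toy sequence (placeholder): 32 alanines, then C-G-P-C, then 10 alanines.
toy_fusion = "A" * 32 + "CGPC" + "A" * 10

print(residues_at(toy_fusion, [33, 36]))
print(all_cysteine(toy_fusion, [33, 36]))
```

With the real fusion sequence in place of the toy one, the same check confirms whether the expected active-site residues kept their positions after the fusion.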

C. The structure of the fusion protein (MSMEG5998 part):
1. Although the structure of MSMEG5998 remains unknown, the team predicted a model from similar proteins using the software tool Swiss Model.
2. To choose the model of MSMEG5998, Swiss Model compared the amino acid sequence against its database of protein sequences. The candidate templates can be ranked by two main factors, coverage or identity, which lead to two different models. The team chose the protein sequence with the highest coverage as our model, named “MSMEG5998 Swiss model”.

3. The sequence of the MSMEG5998 Swiss model was compared with that of the fusion protein using UniProt. The team found three similar groups, labeled below, which are likely active sites.

4. The three loci, numbered on the fusion protein sequence, are:
i. 189, Arginine, R
ii. 214, Glutamine, Q
iii. 246, Alanine, A
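The template choice in step 2 (ranking by coverage or by identity) can be sketched as follows. The template names and numbers are invented for illustration and are not results from Swiss Model; the `Template` class and `pick_template` function are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    name: str
    coverage: float  # fraction of the target sequence covered, 0 to 1
    identity: float  # percent sequence identity, 0 to 100


def pick_template(templates, criterion: str = "coverage") -> Template:
    """Return the template with the highest value of the chosen criterion."""
    if criterion not in ("coverage", "identity"):
        raise ValueError("criterion must be 'coverage' or 'identity'")
    if not templates:
        raise ValueError("no templates to choose from")
    return max(templates, key=lambda t: getattr(t, criterion))


# Invented candidates: one covers more of the target, the other is more similar.
candidates = [
    Template("template_A", coverage=0.95, identity=28.0),
    Template("template_B", coverage=0.70, identity=41.0),
]

print(pick_template(candidates, "coverage").name)
print(pick_template(candidates, "identity").name)
```

The two criteria can pick different templates, which is why they lead to two different models; the team's choice corresponds to the coverage criterion.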

D. Further preparation of the structure before the protein docking simulation on MSMEG5998:
1. Since the .pdb files produced by RaptorX could not display the hydrogen bonds of the compound, the team used PMViewer v1.5.7 to add hydrogen bonds and negative charges. (The following pictures show the compound before and after this step.)

E. Adding the ligand to the docking simulation of MSMEG5998-Aflatoxin B2
1. Search PubChem for the ligand, in this case Aflatoxin B2, and download it in SDF format.
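The same download can also be scripted. The sketch below only builds a request URL for PubChem's PUG REST service, looking the compound up by name and asking for SDF output. The helper name is hypothetical, and requesting a 3D record is an assumption about what the docking step needs; whether a 3D conformer is available should be checked on PubChem.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_sdf_url(compound_name: str, three_d: bool = True) -> str:
    """Build a PUG REST URL that returns a compound, found by name, as SDF."""
    url = f"{PUG_REST}/compound/name/{quote(compound_name)}/SDF"
    if three_d:
        url += "?record_type=3d"  # ask for a 3D conformer instead of the 2D record
    return url


url = pubchem_sdf_url("Aflatoxin B2")
print(url)

# To actually download the file (needs network access), one could run:
#   from urllib.request import urlretrieve
#   urlretrieve(url, "aflatoxin_b2.sdf")
```

The download call is left as a comment so that the example runs without network access.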

F. The docking of MSMEG5998 to Aflatoxin B2:
1. The settings for Aflatoxin B2 before docking:
i. Minimize the energy to obtain a stabilized compound, which is easier to run through the docking simulation.

2. Select the docking function to proceed.

G. The autodocking area is limited to the three active sites of MSMEG5998 mentioned earlier, which can increase the model’s accuracy. After autodocking, we visualized the result using PyMOL to create a 3D docking model. The three active sites for docking are listed below:
1. 189, Arginine, R

2. 214, Glutamine, Q

3. 246, Alanine, A
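One way to restrict a docking search to chosen residues is to place a box around them. The sketch below is a simplified illustration, not the team's actual docking setup: it reads the alpha-carbon (CA) coordinates of the selected residues from PDB-format ATOM lines and computes a box center and size with some padding. The coordinates, the padding value, and all function names are made up for the example.

```python
def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Format a minimal PDB ATOM record with fields in the standard columns."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, resname, chain, resseq, x, y, z)


def ca_coords(pdb_lines, residue_numbers):
    """Return {residue number: (x, y, z)} for the CA atoms of the given residues."""
    wanted = set(residue_numbers)
    coords = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":  # atom name, columns 13-16
            continue
        resseq = int(line[22:26])        # residue number, columns 23-26
        if resseq in wanted:
            coords[resseq] = (float(line[30:38]),   # x, columns 31-38
                              float(line[38:46]),   # y, columns 39-46
                              float(line[46:54]))   # z, columns 47-54
    return coords


def search_box(points, padding=8.0):
    """Return (center, size) of an axis-aligned box around points, plus padding."""
    axes = list(zip(*points))  # [(x values), (y values), (z values)]
    center = tuple((max(a) + min(a)) / 2 for a in axes)
    size = tuple((max(a) - min(a)) + 2 * padding for a in axes)
    return center, size


# Made-up coordinates for the three candidate sites, plus one non-CA atom.
demo_pdb = [
    atom_line(1, "CA", "ARG", "A", 189, 0.0, 0.0, 0.0),
    atom_line(2, "CB", "ARG", "A", 189, 1.5, 0.0, 0.0),
    atom_line(3, "CA", "GLN", "A", 214, 10.0, 4.0, -2.0),
    atom_line(4, "CA", "ALA", "A", 246, 6.0, -6.0, 8.0),
]

sites = ca_coords(demo_pdb, [189, 214, 246])
center, size = search_box(list(sites.values()), padding=8.0)
print(center, size)
```

A smaller box focuses the search on the candidate sites, at the cost of missing any binding pose outside it.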

Discussion and Conclusion

A. Results and discussion
1. The docking simulation model: Thioredoxin-fusion protein
i. The active site of Thioredoxin partially faces inward toward MSMEG5998, so we speculate that Thioredoxin can assist the protein folding process to some degree.
2. The docking simulation model: MSMEG5998-Aflatoxin B2
i. Of the three speculated active sites (No. 189, 214 and 246), No. 214 is the most likely binding site of Aflatoxin B2, based on the Swiss model and the autodocking results.
ii. The binding site (No. 214) is not blocked by the rest of the structure (Thioredoxin).
iii. The binding site (No. 214) is located on the surface of the protein and faces outward, which makes it readily accessible to Aflatoxin B2.

B. Conclusion
1. Using protein modeling techniques, the team predicted a multifunctional fusion protein in which neither part inhibits the other or causes structural failure.
2. With the software tools, the team predicted an enhanced fusion protein (MSMEG5998 combined with Thioredoxin) that is expected to perform better than the original protein (MSMEG5998).
3. In cooperation with the wet lab projects, the team is able to confirm the results of the prediction.
