iGEM Peshawar

Modeling & Simulation

Welcome to our Modeling page. Modeling a system before building it is valuable because it offers insights into design and execution. This year, our modeling approach was multi-dimensional. We developed a theory regarding reporter expression times (IDDP Theory), estimated the contraction that takes place in two of our promoters during transcription factor binding, and conducted homology modeling to predict the structure of the chromoproteins we intend to use. You can read about all of these in detail below:

IDDP Theory

Introduction

IDDP stands for Increase Device and Decrease Plasmid. Our theory is that if two cells, cell A and cell B, contain the same number of copies of a particular gene, but cell A has 10 plasmids with one gene per plasmid while cell B has 5 plasmids with 2 genes (of the same type) per plasmid, then cell B will express the gene in less time than cell A, even though the gene count is the same.

Mathematically: Combined polymerase time = (P1_J × C) + (P2_J × C) + (P3_J × C)

Where:

  • P1_J, P2_J and P3_J are the numbers of jumps each of the three RNA polymerases requires to reach the promoter of the gene embedded in a plasmid.
  • C is the complex dissociation time, which is 3 sec in our hypothetical case.
  • Proved

    To test the concept, our team built a software program in which the conditions of both cell A and cell B can be mimicked and compared. The software accepts any number of RNA polymerases at any position in either cell.

    Anyone can use the software to test this theory.

    Input to our project

    Our project is based on the expression of chromoproteins for the detection of heavy metals. Increasing the number of devices per plasmid should decrease the expression time and therefore give quicker detection. Simply put, the faster the expression of the chromoprotein, the faster the detection.

    The IDDP (Increase Device and Decrease Plasmid) Theory states that if we halve the number of plasmids but double the number of devices per plasmid, we will get detectable expression in less time.

    For example, suppose cell A contains 300 plasmids with 1 device each, and cell B contains half the number of plasmids (150) but double the number of devices per plasmid (2). Both cells carry 300 devices, yet detectable expression in cell B will take less time than in cell A.

    Let's zoom into both cells, look at a single plasmid from each at one instant in time, and analyze that snapshot to see the effect of doubling the devices per plasmid. Let's suppose:

    The plasmid of cell A is 2070 base pairs long, and its operon starts at position 10.

    Number of nucleotides per plasmid = 2070
    Device 1 position = 10

    Three RNA polymerases sit at three different locations on nonspecific DNA, jumping in search of the operon: at positions 75, 1300 and 2003, moving clockwise, anticlockwise and anticlockwise respectively.

    RNA polymerase 1 position = 75
    RNA polymerase 1 direction = clockwise
    RNA polymerase 2 position = 1300
    RNA polymerase 2 direction = anticlockwise
    RNA polymerase 3 position = 2003
    RNA polymerase 3 direction = anticlockwise

    The time these three RNA polymerases collectively require to reach the operon can only be found from the association constant and the dissociation time, which we assume to be:

    Association constant of RNA polymerase with nonspecific DNA = 5 × 10^6 M^-1
    Complex dissociation time = 3 sec

    Since the time for a single jump is given, we can calculate the total number of jumps needed to reach the operon and, in turn, the total time required by the RNA polymerases to reach the operon.

    The same applies to the plasmid of cell B, except that in this case there are two devices:

    Number of devices per plasmid = 2
    Device 1 position = 10
    Device 2 position = 1500

    The time required by these three polymerases is given by:

    Combined polymerase time = (P1_J × C) + (P2_J × C) + (P3_J × C)
  • P1_J, P2_J and P3_J are the numbers of jumps each of the three RNA polymerases requires to reach the promoter of the gene embedded in a plasmid.
  • C is the complex dissociation time, which is 3 sec, as given above.

    The difficulty is that the plasmid is circular, so the number of jumps depends on whether each polymerase is moving clockwise or anticlockwise, and this can only be handled with conditional logic. We therefore designed an algorithm and implemented it as a C++ program. The program accepts any polymerase positions, device positions and plasmid length you choose, and outputs the comparative time required by the polymerases of cell A and cell B to reach the operon within a single plasmid.
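    The direction-dependent calculation described above can be sketched roughly as follows. This is not the team's program, only a minimal illustration under stated assumptions: position numbers increase in the clockwise direction, each polymerase keeps its direction and stops at the first device it meets, and each jump covers a fixed number of base pairs (`bpPerJump`, a hypothetical parameter that the text does not specify). All function and type names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Assumption: clockwise means moving toward higher position numbers.
enum class Direction { Clockwise, Anticlockwise };

struct Polymerase {
    int position;
    Direction direction;
};

// Base pairs a polymerase must cover to reach `device`, moving only in its
// own direction and wrapping around the circular plasmid.
int distanceToDevice(const Polymerase& p, int device, int plasmidLength) {
    int diff = (p.direction == Direction::Clockwise) ? device - p.position
                                                     : p.position - device;
    return ((diff % plasmidLength) + plasmidLength) % plasmidLength;
}

// Jumps needed to reach the nearest device in the direction of travel.
// `bpPerJump` is a hypothetical jump length; partial jumps are rounded up.
int jumpsToNearestDevice(const Polymerase& p, const std::vector<int>& devices,
                         int plasmidLength, int bpPerJump) {
    assert(!devices.empty() && bpPerJump > 0);
    int best = std::numeric_limits<int>::max();
    for (int d : devices) {
        best = std::min(best, distanceToDevice(p, d, plasmidLength));
    }
    return (best + bpPerJump - 1) / bpPerJump;
}

// Combined polymerase time = sum over all polymerases of (jumps * C),
// where C is the complex dissociation time in seconds.
double combinedTime(const std::vector<Polymerase>& polymerases,
                    const std::vector<int>& devices, int plasmidLength,
                    int bpPerJump, double C) {
    double total = 0.0;
    for (const Polymerase& p : polymerases) {
        total += jumpsToNearestDevice(p, devices, plasmidLength, bpPerJump) * C;
    }
    return total;
}
```

    Under these assumptions, for a 2070 bp plasmid with polymerases at 75 (clockwise), 1300 and 2003 (both anticlockwise), `bpPerJump = 1` and `C = 3.0`, `combinedTime` returns 15864 s with a single device at position 10 and 9654 s with devices at positions 10 and 1500. The absolute numbers depend entirely on the assumed jump length; only the comparison between the two cells is meaningful.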

    Parameters

    For cell A:
    Number of plasmids = 300
    Number of devices per plasmid = 1
    Number of nucleotides per plasmid = 2070
    Association constant of RNA polymerase with nonspecific DNA = 5 × 10^6 M^-1
    Complex dissociation time = 3 sec
    RNA polymerase 1 position = 75
    RNA polymerase 1 direction = clockwise
    RNA polymerase 2 position = 1300
    RNA polymerase 2 direction = anticlockwise
    RNA polymerase 3 position = 2003
    RNA polymerase 3 direction = anticlockwise
    Device 1 position = 10

    For cell B:
    Number of plasmids = 150
    Number of devices per plasmid = 2
    Number of nucleotides per plasmid = 2070
    Association constant of RNA polymerase with nonspecific DNA = 5 × 10^6 M^-1
    Complex dissociation time = 3 sec
    RNA polymerase 1 position = 75
    RNA polymerase 1 direction = clockwise
    RNA polymerase 2 position = 1300
    RNA polymerase 2 direction = anticlockwise
    RNA polymerase 3 position = 2003
    RNA polymerase 3 direction = anticlockwise
    Device 1 position = 10
    Device 2 position = 1500

    Result

    When the parameters are entered into the C++ program, the results show a considerable decrease in the combined time for the 3 RNA polymerase molecules to reach the operon. This means the time required for expression of our reporter gene should decrease as the number of devices per plasmid increases.

    Hence, in this model, if we halve the number of plasmids but double the number of devices per plasmid, we get detectable expression in less time than in cells that contain plasmids with one device.

    Input to our project

    We need to use two devices per plasmid, instead of one, for our arsenic and cadmium devices, so that their chromoproteins are expressed much earlier and detection is achieved much earlier.

    The faster the expression of the chromoprotein, the faster the detection.

  • IDDP Theory Experimental Proof

    Introduction

    This theory states that cells with an increased number of devices on a single plasmid will have a shorter expression time than an equal number of cells with double the number of plasmids, each carrying a single device.

    PROOF OF CONCEPT

    To test this concept, we carried out an experiment in the lab. We chose to work with one of the most used parts from the iGEM registry, the RFP-coding device BBa_J04450. This device is also commonly used for testing competent cells and is recommended as a positive control for transformations. It gives a light purplish-red color under visible light after about 18 hours and bright pinkish-red under

    PLASMID reconstruction

    We first joined two RFP-coding devices together and then inserted them into the pSB1C3 plasmid backbone.

    Growth after transformations of cells having plasmids with double RFP devices.

    For comparison, we also ligated RFP-coding devices into two different plasmid backbones, pSB1C3 and pSB1A3, and transformed them both together.

    Transformed colonies of cells carrying two plasmids, each with a single device.

    To justify the theory, two different measurements were taken into account:

    1: TIME vs OD MEASUREMENT

    We took the following four samples and compared their results.

  • Control (no bacteria)
  • RFP (bacteria with one RFP gene in pSB1C3)
  • RFP 2 (bacteria with 2 RFP genes in pSB1C3)
  • D RFP (bacteria with 1 RFP gene in pSB1C3 and one in pSB1A3)

    For each sample, 5 mL of fresh LB broth, 5 µL of antibiotic and 200 µL of the overnight culture were added to a 50 mL Falcon tube, and the tubes were placed in the shaking incubator.

    Picture taken after one hour of incubation

    After placing the tubes in the incubator, optical density was measured at the following times:

    Time A = 0 min, time B = 30 min, time C = 60 min, time D = 90 min, time E = 120 min, time F = 150 min, time G = 180 min and time H = 210 min.

    This graph shows that, even with almost the same number of cells, RFP expression was faster than in all other samples.

    2: RFP EXPRESSION MEASUREMENT

    It is clear from the graph that RFP 2 and D RFP have almost the same number of cells, with O.D. values of 1.335 and 1.350 respectively. But when it comes to color change, the bacteria carrying RFP 2 changed color earlier than D RFP, which means that the bacteria carrying RFP 2 have a shorter expression time.

    Color change appeared in RFP2 after 5 hours of incubation

    The gradual difference in the colors of all the cultures is also shown below:

    Pictures A, B, C and D were taken after 7 hrs, 12 hrs, 17 hrs and 24 hrs of incubation.

    DNA Contraction

    Introduction

    As cadmium and arsenic bind to the Mer-R and Ars-R transcription factors respectively, these TFs cause the promoter region to contract. To determine these conformational changes, our team quantified them with a mathematical equation.

    Proved

    Change in the promoter = A × C
    Where:
    A is the rise per base pair of the DNA helix and C is the number of base pairs changed.
    Hence,
    Change in the promoter = 6.8 Angstrom contraction.

    Input to our project

    In several of our biosensing devices, such as the arsenic and cadmium devices, transcription factors like ArsR and MerR are important elements that cause changes in the promoter regions. Through our mathematical equations, we can further annotate our research data, i.e. characterize the promoters of our own devices.

    What exactly is the quantitative value of the change that causes the gene to be expressed and perform its function?

    Introduction

    As cadmium and arsenic bind to Mer-R and Ars-R, changes occur in the promoter of the operon. What those changes are, their molecular basis, and their quantitative measurement are explored here.

    Discussion

    Our cadmium and arsenic devices are each made up of a promoter with a repressor, one RBS, a fluorescent protein and a terminator. The Mer-R and Ars-R proteins each have two ends: the C terminal, which binds the metal, and the N terminal, which binds DNA. When cadmium or arsenic binds to Mer-R or Ars-R, a conformational change occurs in the C terminal and is passed to the DNA via the N terminal. The DNA in turn undergoes a conformational change, i.e. it contracts, so that sigma 70 fits onto it; RNA polymerase follows sigma 70 and binds, and transcription starts.

    Calculations

    This is the molecular basis of the changes that occur in the DNA. To measure those changes, we need to adopt a quantitative approach.

    The distance between the -35 and -10 elements is 19 base pairs, i.e.

  • X = 19 bp

    Sigma 70 needs a spacing of 17 base pairs to fit perfectly and start transcription.

  • Y = 17 bp

    So, the contraction in the promoter region of the DNA, in base pairs, is given by:

  • C = X - Y = 2 bp

    The rise per base pair of the DNA helix is 3.4 Angstroms.

  • A = 3.4 Angstroms

    So, the contraction will be:

    Contraction in the promoter = A × C
    Contraction in the promoter = 3.4 × 2
    Contraction in the promoter = 6.8 Angstrom
    Where:
    A = 3.4 Angstrom
    C = X - Y
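
    The same arithmetic can be written as a small helper, which makes it easy to repeat for a promoter with a different spacer length. This is only an illustrative sketch; the function and constant names are hypothetical, and the value of A is the 3.4 Angstrom figure used in the text.

```cpp
#include <cassert>

// A: length of DNA per base pair along the helix axis, in Angstroms
// (the 3.4 Angstrom value used in the calculation above).
constexpr double kAngstromPerBp = 3.4;

// Contraction = A * (X - Y), where X is the actual spacer length in base
// pairs and Y is the spacing that sigma 70 needs.
double promoterContractionAngstrom(int spacerBp, int requiredBp) {
    assert(spacerBp >= requiredBp);
    return kAngstromPerBp * (spacerBp - requiredBp);
}
```

    With X = 19 and Y = 17, `promoterContractionAngstrom(19, 17)` gives 6.8 Angstrom, matching the hand calculation.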

    Homology Modeling of CP Used in Cadmium Device

    Introduction

    Since our project relies on reporter genes, we used basic homology modeling to predict the 3D structure of two of the chromoproteins used in our project for detection. One, BBa_K592009 (AmilCP Blue), is a commonly used reporter in iGEM; the other, BBa_K2518000, is a GFP-like lilac-colored chromoprotein from Goniopora tenuidens.

    Model

    Methodology

    The following steps were used to predict the 3D structure of AmilCP Blue.

    1. Identify the related structural templates of the target sequence.
    2. Select the best template.
    3. Align the target sequence with the template structure.
    4. Build a model for the target.
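
    Step 2 of the list above, selecting the best template, relies on the sequence identity between target and template. As a rough illustration, and not the modeling server's actual procedure, percent identity can be computed from the two rows of a pairwise alignment. The definition used here (identical pairs divided by columns where both rows have a residue) is an assumption, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Percent identity between two rows of a pairwise alignment.
// Both rows must have equal length; '-' marks a gap.
// Assumption: columns containing a gap in either row are ignored.
double percentIdentity(const std::string& target, const std::string& templ) {
    assert(target.size() == templ.size());
    int aligned = 0;
    int identical = 0;
    for (std::size_t i = 0; i < target.size(); ++i) {
        if (target[i] == '-' || templ[i] == '-') continue;
        ++aligned;
        if (target[i] == templ[i]) ++identical;
    }
    return aligned == 0 ? 0.0 : 100.0 * identical / aligned;
}
```

    For the toy alignment rows `"MSVI-K"` and `"MSLIAK"`, five columns are aligned and four are identical, so the function returns 80. The template with the highest value across all candidates would then be taken as the best match.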

    Sequence search

    Our target protein sequence was searched against the sequences of proteins whose structures have already been solved.

    Following is the list of templates found similar to our target sequence:

    Template selection

    It is evident from the image above that template 3vic.1.A is the best match for the target sequence, with 95.89% identity. The 3D structure of the selected template 3vic.1.A is given below:

    Aligning the Target sequence with template structure

    Below is the model-template alignment:

    Predicted Structure

    This is the 3D structure we built by homology modeling from the template structure and the target sequence.

    Result and Discussion

    GMQE (Global Model Quality Estimation):

    GMQE (Global Model Quality Estimation) is a parameter for checking the quality of the target-template alignment and the template search. Its value ranges from 0 to 1, and higher values indicate a more reliable alignment and search. The GMQE score for our model is 0.99, which is a good result.

    QMEAN:

    QMEAN is a scoring function that provides global as well as local quality estimates for the residues in the homology model. Scores around zero indicate good agreement between the model and experimental structures of similar size, while scores of -4.0 or below indicate a low-quality model. Our QMEAN score is 0.48, which is a good score.

    LocalQuality:

    Local quality estimates, for each residue of the homology model, its expected similarity to the native structure. Residues scoring below 0.6 are expected to be of low quality; our results come close to 1, which is a good result for our model.

    Comparison plot:

    In the comparison plot, the x axis shows the length of the modeled protein in residues and the y axis shows the QMEAN score. Every dot in the graph represents a non-redundant 3D protein structure. The dots fall into three groups according to their QMEAN scores. In the graph below, our model (red star) lies in the dark gray region, which is a good score and in turn indicates that the homology model we built is of good quality.

    The homology analysis above indicates that the model built for our chromoprotein is of good quality.
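
    The cut-offs quoted in this section can be collected into one simple check. This is only a sketch of how the stated thresholds combine, not part of any modeling tool; the struct, its field names and the per-residue values passed in are hypothetical, and no GMQE cut-off is applied beyond the 0 to 1 range because the text does not give one.

```cpp
#include <vector>

// Quality scores reported for one homology model (hypothetical container).
struct ModelScores {
    double gmqe;                       // expected range: 0 to 1
    double qmean;                      // around 0 is good, -4.0 or below is poor
    std::vector<double> localQuality;  // per-residue scores, 0 to 1
};

// Returns true when the scores satisfy the thresholds quoted in the text.
bool passesThresholds(const ModelScores& s) {
    if (s.gmqe < 0.0 || s.gmqe > 1.0) return false;  // outside valid range
    if (s.qmean <= -4.0) return false;               // low-quality model
    for (double q : s.localQuality) {
        if (q < 0.6) return false;                   // low-quality residue
    }
    return true;
}
```

    For example, a model with GMQE 0.99 and QMEAN 0.48 passes as long as every per-residue score supplied is at least 0.6.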

    References

    • Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Cassarino TG, Bertoni M, Bordoli L, Schwede T (2014). SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 42(W1), W252-W258.
    • Kiefer F, Arnold K, Künzli M, Bordoli L, Schwede T (2009). The SWISS-MODEL Repository and associated resources. Nucleic Acids Res. 37, D387-D392.
    • Arnold K, Bordoli L, Kopp J, Schwede T (2006). The SWISS-MODEL Workspace: a web-based environment for protein structure homology modelling. Bioinformatics 22, 195-201.
    • Guex N, Peitsch MC, Schwede T (2009). Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: a historical perspective. Electrophoresis 30(S1), S162-S173.

    Homology Modeling of CP Used in Arsenic Device

    Methodology

    The following steps were used to predict the 3D structure and some other properties of the chromoprotein used in the arsenic device.

    1. Identify the related structural templates of the target sequence.
    2. Select the best template.
    3. Align the target sequence with the template structure.
    4. Build a model for the target.

    Sequence search

    Following is the list of templates found similar to our target sequence:

    Template selection

    It is evident from the image above that template 1mov.1.A is the best match for the target sequence, with 95.89% identity. The 3D structure of the selected template 1mov.1.A is given below:

    Aligning the Target sequence with template structure

    Below is the model-template alignment:

    Predicted Structure

    This is the 3D structure we built by homology modeling from the template structure and the target sequence.

    Result and Discussion

    GMQE (Global Model Quality Estimation):

    GMQE (Global Model Quality Estimation) is a parameter for checking the quality of the target-template alignment and the template search. Its value ranges from 0 to 1, and higher values indicate a more reliable alignment and search. The GMQE score for our homology model is 0.98, which is a good result and indicates that the model is of good quality.

    QMEAN:

    QMEAN is a scoring function that provides global as well as local quality estimates for the residues in the homology model. Scores around zero indicate good agreement between the model and experimental structures of similar size, while scores of -4.0 or below indicate a low-quality model. Our QMEAN score is -0.13, which is a good result and shows that the model is of good quality.

    LocalQuality:

    Local quality estimates, for each residue of the homology model, its expected similarity to the native structure. Residues scoring below 0.6 are expected to be of low quality; our results come close to 1, which is a good result for our model.

    Comparison plot:

    In the comparison plot, the x axis shows the length of the modeled protein in residues and the y axis shows the QMEAN score. Every dot in the graph represents a non-redundant 3D protein structure. The dots fall into three groups according to their QMEAN scores. In the graph below, our model (red star) lies in the dark gray region, which is a good score and in turn indicates that the homology model we built for our chromoprotein is of good quality.

    The homology analysis above indicates that the model built for our chromoprotein is of good quality.

    Dynamic Software to prove IDDP Theory

    Introduction

    We have built a dynamic, console-based C++ program that not only performs the IDDP Theory calculations but also serves as a general test of IDDP, thanks to its dynamic nature. You can enter a plasmid of any length and place the RNA polymerases and the devices wherever you want, and it will give you results consistent with the IDDP Theory.