Line 93: | Line 93: | ||
<div class="content" style="width: 74%"> | <div class="content" style="width: 74%"> | ||
− | + | ||
− | <img width="100%" src="https://static.igem.org/mediawiki/2017/3/36/T-NUDT_CHINA-MODEL5.jpg" alt=""> | + | <img width="100%" src="https://static.igem.org/mediawiki/2017/3/36/T-NUDT_CHINA-MODEL5.jpg" alt=""> |
Revision as of 18:48, 1 November 2017
Model
Theory design
In order to further optimize the structure of the default locker in the project, we assert that the binding probability of base complementary pairing should be reduced in the sequence of the functional module when the locker formed the secondary structure, thus increasing the binding probability of base complementary pairing between the functional module and the corresponding microRNA.
Figure 1. the predicted structure of Locker 2
The two pairs of overhangs (The four pale gray segments in Figure 1) that we design to be added between the complementary pairing supporting module and the functional module can open the port where the supporting module is connected to the functional module, but different sequences may have a potential impact on the secondary structure of the locker. So we design to build an overhang candidate library in order to find all possible sequences and predict the secondary structure of each sequence, compare them and select the relative optimal structure as the object of our experiment.
Reducing the probability of base complementary pairing occurred in the sequence of functional module in the formation of the secondary structure of locker will result in the reduction of the number of the paired bases of entire lockers. Therefore, we believe that for different sequences, when the number of base pairs reaches the minimum, functional module unmatched base number is the largest, the probability of corresponding to microRNA through the base complementary pairing is maximum and the efficiency is highest.
Methods
1.Building an overhang alternative library
Each overhang has 4 bases, so the number of all possible structures is :
For an overhang A, A2(which is paired with A1 ), A3( which is paired with A ) and the reverse sequence A1 are all included in the possible structures.
Take AGGC for example:
Given the one-to-one relationship between them, we consider that The ratio of the number of A, A1, A2 A3 is 1: 1: 1: 1. Accordingly, after removing all A1 and A3, the number of overhangs left in the library is:
For some of the A and A2, they have the same sequence so we have to subtract the corresponding number.
This means that the sequence itself is inversely complementary and the first two base pairs correspond one-to-one with the latter two base pairs, so the number of possible scenarios is:
The number of the rest is:
we use MATLAB to screen the above two situations and keep the remaining overhangs in the overhang alternative library.
2. Stitching sequence
Since the number of overhangs in the overhang library is 112 and we have to select two of them, which is the same as simple arrangement combination. Thus the number of possible locker is:
It takes a lot of time to accomplish the secondary structure prediction of so many sequences, so we consider completing the sequence splicing and secondary structure prediction using the computer.
3. Predicting the structure of Locker 2
We mimic the method of calculating the RNA sequence and use the minimum free energy algorithm to calculate and predict the secondary structure of the single - chain DNA. The RNA-related parameters are replaced by DNA parameters (Matthews model, 2004). The specific algorithm is described as follows:
Because the formation of base pairing can reduce the energy of DNA molecules and the structure is more stable, the minimum free energy algorithm considers that at certain temperature, RNA molecules can achieve a certain thermodynamic balance through the conformational adjustment, the free energy can be reduced to the minimum and the most stable structure will be formed, then the secondary structure is considered to be the true secondary structure of DNA. Its computational object is not the number of base pais, but a complex set of energy parameters.
Figure 2. Plain structure drawing of the single - chain DNA
Assume that a1,a2,a3...an is a given DNA sequence. Throughout this section, we let Ei,j denotes the minimum free energy of ai,ai+1...aj, which is computed and stored in arrays by a dynamic programming algorithm corresponding to the following recursions.
we use the notation αij to denote the free energy of a stacked base pair (ai,aj),βk to denote the free energy of a bulge,γk+1 to denote the free energy of an internal loop,εk+1-j'-i' to denote the free energy of a multiloop containing, while the free energy δj-1 for a hairpin.
Once E1,n is computed, then the minimum free energy structure can be computed by tracebacks.
4. Selecting the structure with the largest/minimum number of paired bases
In order to facilitate the comparison of the number of bases in each sequence, we directly read the number of the points in dot-bracket notation, and the sequence with the largest number of points means that the number of unpaired bases is the largest and the sequence pairing base number is the least In the case of the length of all the locks is equal, which means structure is relatively optimal.
Results
By calculating we can get structures of the selected lockers with the maximum number of pairs of bases, corresponding to the relatively worst and optimal structure.
The sequence of the optimal secondary structure:
AAGGGTAAGGATAATCGCACATGTTCTGCGGACCACTAAATCCTTACCCTTCGTAATCCTTGGCTACTGCCTGTCTGTGCCTGCTGTTCGGAAGGATTACG
The optimal secondary structure in dot-bracket notation:
((((((((((((...((((.......))))............))))))))))))(((((((((((((....)))....................))))))))))
Plain structure drawing of the optimal secondary structure:
Figure 3. Plain structure drawing of the optimal secondary structure
The sequence of the worst secondary structure:
AAGGGTAAGGATAACGGCACATGTTCTGCGGCCCACCAGCAAATCCTTACCCTTCGTAATCCTTACGGACTGCCTGTCTGTGCCTGCTGTGGCAAAGGATTACG
The worst secondary structure in dot-bracket notation:
((((((((((((((((.....)))).(((((....)).))).))))))))))))((((((((((((((((.....))))))...(((....)))))))))))))
Plain structure drawing of the worst secondary structure:
Figure 4. Plain structure drawing of the worst secondary structure
References
[1] Zuker M., Sankoff D. RNA secondary structures and their prediction, Bulletin of Mathematical Biology, 1984; 46: 591-621
[2] Zuker M., Mathews D.H.Turner D. H. Algorithms and thermodynamics for RNA secondary structure prediction: A practical guide. In RNA Biochemistry and Biotechnology. Boston: Kluwer Academic Publishers, 1999: 11-43