Team:TCFSH Taiwan/Model

Model Introduction

In our opinion, modelling has always played an important role in every subject, even beyond science. In our project, it works with real data and thus makes biological theories easier to realize and observe. Carl Gauss said that “Mathematics is the queen of the sciences.” A mathematical proposition, once proven, is reliable and indisputable, whereas theories in other sciences are always at risk of being overthrown. Modelling has a good reputation and a certain status because it puts scientific phenomena into mathematical form and makes them more trustworthy.

By modelling, we can build a reasonable first draft that estimates a possible solution to a difficult problem. However, the reaction series or the operating mechanism behind an unknown equation has to be reasonably presumed, and this is the most difficult part of the whole process. Once better theories emerge, we can amend our hypotheses and build another model. The main technique we used in our modelling is the differential equation (DE): we use derivatives to describe the change of a variable within a very short time. Some of the equations we met were too complicated to solve by hand, so we used MATLAB to help calculate the results.

What are we modeling?


- The growth of E. coli

- The Expression of Different Color

- The Concentration Function f:(substance,time)→concentration

- Math Is Long, Life Is Short: Math in Our Life

Model

I. The growth of E. coli

At first, we assume that E. coli proliferate and die at constant rates over time, and that the difference between the two rates is the growth rate ($\mu_g$). Under this assumption, the number of bacteria $N$ satisfies the differential equation:

$\frac{dN}{dt} = \mu_g N$

Substituting the boundary condition ($t = 0$, $N = N_0$), we then have $e^{C_2 - C_1} = N_0$, where $C_1$ and $C_2$ are the integration constants. Thus, the equation that expresses the relation between the number of bacteria and time is:

$N = N_0 e^{\mu_g t}$
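The integration behind this result can be written out step by step. This is a sketch that assumes the growth law $dN/dt = \mu_g N$ implied by the stated solution; $C_1$ and $C_2$ are the integration constants that appear in the boundary condition above.

```latex
\begin{align*}
\frac{dN}{dt} &= \mu_g N \\
\int \frac{dN}{N} &= \int \mu_g \, dt \\
\ln N + C_1 &= \mu_g t + C_2 \\
N &= e^{C_2 - C_1} \, e^{\mu_g t} \\
N(0) = N_0 \;&\Rightarrow\; e^{C_2 - C_1} = N_0 \\
N &= N_0 \, e^{\mu_g t}
\end{align*}
```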

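What’s more, needless to say, E. coli consume their “food”, LB, all the time. If each cell consumes LB at a steady rate, the LB consumption rate is proportional to $N$, and we can write down the equation:

$\frac{dn_{LB}}{dt} = -k_{con} N = -k_{con} N_0 e^{\mu_g t}$

By integrating and substituting the boundary condition ($t = 0$, $n_{LB} = n_{LB0}$), we then have

$C = -\frac{n_{LB0}}{k_{con}} - \frac{N_0}{\mu_g}$

So the relation between $n_{LB}$ and $t$ is:

$n_{LB} = n_{LB0} - \frac{k_{con} N_0}{\mu_g}\left(e^{\mu_g t} - 1\right)$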
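A minimal numerical sketch of the two relations in this section. It assumes the consumption law $dn_{LB}/dt = -k_{con} N$, which integrates to $n_{LB}(t) = n_{LB0} - \frac{k_{con} N_0}{\mu_g}(e^{\mu_g t} - 1)$ and is consistent with the constant $C$ above. The function names and all parameter values are illustrative placeholders, not measurements from our experiments.

```python
import math


def population(t, n0, mu_g):
    """Bacterial count N(t) = N0 * exp(mu_g * t)."""
    return n0 * math.exp(mu_g * t)


def lb_remaining(t, n_lb0, n0, mu_g, k_con):
    """LB amount n_LB(t), assuming dn_LB/dt = -k_con * N(t)."""
    consumed = k_con * n0 / mu_g * (math.exp(mu_g * t) - 1.0)
    return n_lb0 - consumed


def depletion_time(n_lb0, n0, mu_g, k_con):
    """Time at which n_LB reaches zero under the same assumption."""
    return math.log(1.0 + n_lb0 * mu_g / (k_con * n0)) / mu_g


if __name__ == "__main__":
    # Placeholder values chosen only to make the example run.
    n0, mu_g, k_con, n_lb0 = 1.0e6, 0.02, 1.0e-9, 1.0
    for t in (0, 60, 120, 180):
        print(t, population(t, n0, mu_g), lb_remaining(t, n_lb0, n0, mu_g, k_con))
    print("LB depleted at t =", depletion_time(n_lb0, n0, mu_g, k_con))
```

In this simple model the population keeps growing exponentially until the LB runs out, so the model is only meaningful for times before `depletion_time`.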

II. The Expression of Different Color

Assumption

1. To keep the equations simple, we assume that every chemical reaction rate is proportional to the concentration of each reagent (e.g. for the reaction A + B + C → D + E, the forward rate is $r_+ = k_+[A][B][C]$).

2. For every substance produced by a BioBrick, we assume that its production rate = φ[CoPB], where
[CoPB] = the concentration of the promoted BioBrick

φ = the product of the rate constant and a correction coefficient (since a BioBrick is different from a reactant), with dimension $T^{-1}$

Equations & Solutions

According to the picture (Figure 1), we can write down 3 equations as follows:

P.S. φ = the product of the rate constant and a correction coefficient (since a promoter is different from a reactant), with dimension $T^{-1}$

Solving these 3 equations, the solutions expressed in terms of φ, k and [Pa] are as follows:

When the concentration of each activated promoter reaches its steady state, we can simplify the equations as follows:

Besides, $\lim_{t \to \infty}\left(1 - e^{-kt}\right) = 1$, so the function has a horizontal asymptote. And $\frac{d}{dt}\left(1 - e^{-kt}\right) = k e^{-kt} > 0$ for $t \in [0, \infty)$, so it is a strictly increasing function.
So, this is a strictly increasing and convergent function with an upper bound of 1.
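To illustrate where a factor of this kind comes from, consider a single product X with production rate φ[Pa] and first-order degradation with rate constant k. This generic form, the symbol X, the constant [Pa] and the initial condition [X](0) = 0 are assumptions made for the sketch, chosen to be consistent with the factor $1 - e^{-kt}$ discussed above; it is not a reproduction of the three equations of this section.

```latex
\begin{align*}
\frac{d[X]}{dt} &= \varphi [P_a] - k [X], \qquad [X](0) = 0 \\
[X](t) &= \frac{\varphi [P_a]}{k} \left(1 - e^{-kt}\right) \\
\lim_{t \to \infty} [X](t) &= \frac{\varphi [P_a]}{k}
\end{align*}
```

Under these assumptions the concentration rises monotonically and levels off at φ[Pa]/k, the point where production and degradation balance.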

As a result, the upper limit of the concentration is:

Degradation Rate Constant Calculation

The other variable in the solutions, the degradation rate constant, can also be obtained from a differential equation. Since degradation is a first-order reaction, the equation can be written as follows:

$\frac{dM}{dt} = -k_d M$

Then, after solving the equation and substituting the boundary condition ($t = 0 \Rightarrow M = M_0$), the solution is:

$M = M_0 e^{-k_d t}$

According to the projects of 2008 iGEM KULeuven and 2014 iGEM Edinburgh, both GFP-LVA and RFP-LVA degrade to half of their amount within 50 to 60 minutes, so we assume that cjblue behaves the same. The RFP and BFP references are as follows (the latter degrades to half of its amount in about 50 minutes, while the former takes about 3 hours). From these half-lives we can get the degradation rate constants.
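The conversion from a half-life to a rate constant can be sketched as follows. For first-order decay, $dM/dt = -k_d M$, the amount falls as $M_0 e^{-k_d t}$, so setting $M = M_0/2$ gives $k_d = \ln 2 / t_{1/2}$. The function names are illustrative, and the half-lives are the approximate values quoted above (50 to 60 minutes, and about 3 hours).

```python
import math


def degradation_constant(half_life):
    """First-order rate constant k_d = ln(2) / t_half.

    The result is in reciprocal units of `half_life` (e.g. per minute).
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return math.log(2.0) / half_life


def remaining_fraction(t, k_d):
    """Fraction M/M0 left after time t for dM/dt = -k_d * M."""
    return math.exp(-k_d * t)


if __name__ == "__main__":
    for label, t_half_min in (("50 min", 50.0), ("60 min", 60.0), ("3 h", 180.0)):
        k_d = degradation_constant(t_half_min)
        print(f"half-life {label}: k_d = {k_d:.5f} per minute")
```

Because the quoted half-lives are approximate, the resulting constants should be read as estimates of the same precision.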

From these degradation rate constants and the relation between concentration and time, the “[cjblue],[RFP],[BFP]-t Diagram” is as follows:

III. Step 1: Crawler

In the beginning, we searched UniProtKB/Swiss-Prot (http://www.uniprot.org/), the manually annotated and reviewed section of UniProtKB, a freely accessible database of protein sequence and functional information. By searching with the keyword “insecticidal NOT crystal”, we wanted to find all proteins that have insecticidal activity, excluding the crystal proteins of Bacillus thuringiensis, and we got 216 proteins as results.

Using these results, we established our Pantide database by crawling 11 fields of protein information from UniProt. The fields are as follows.

  • The name of the protein
  • The description of protein function
  • The organisms/source of the protein sequence
  • The length of amino acids
  • The number of disulfide bonds
  • Propeptide & signal peptide: if a protein has an N-terminal signal peptide or propeptide, that part of the protein is cleaved during maturation or activation.
  • UniProt entry & ArachnoServer id: the accession numbers of the protein in UniProtKB and ArachnoServer*.

*ArachnoServer is a manually curated database of protein toxins derived from spider venom (http://www.arachnoserver.org/).

We also crawled seven other fields recorded by ArachnoServer: molecular target, taxon, ED50, LD50, PD50, qualitative information, and protein sequence. The molecular target is the site of action of a toxin peptide, such as voltage-gated ion channels, GABA receptors and so on. Taxon, ED50, LD50, PD50 and the qualitative information describe the toxicity against each taxon that has been tested experimentally. The protein sequences from the two databases are identical.

We used the BeautifulSoup 4.4.0, sqlite3 and gevent modules in Python 3.5 to develop our crawler, and we have submitted the code to GitHub.
(Link: https://github.com/chengchingwen/iGEM/blob/master/crawler.py)
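The parse-and-store step of such a crawler can be sketched as below. This is not our crawler: to stay self-contained it swaps in the standard-library `html.parser` in place of BeautifulSoup and leaves out the network requests and the gevent concurrency. The HTML fragment, the table schema and the accession `X00000` are hypothetical and do not reflect the real UniProt markup.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment; real UniProt pages are structured differently.
SAMPLE_HTML = """
<table>
  <tr><td class="field">name</td><td class="value">Example toxin</td></tr>
  <tr><td class="field">organism</td><td class="value">Example spider</td></tr>
  <tr><td class="field">length</td><td class="value">37</td></tr>
  <tr><td class="field">disulfide_bonds</td><td class="value">3</td></tr>
</table>
"""


class FieldParser(HTMLParser):
    """Collects field/value pairs from <td class="field"> and <td class="value"> cells."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._cell = None   # class attribute of the <td> currently open
        self._field = None  # most recent field name seen

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._cell = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._cell == "field":
            self._field = text
        elif self._cell == "value" and self._field is not None:
            self.record[self._field] = text


def parse_record(html):
    """Return a dict of the fields found in one page fragment."""
    parser = FieldParser()
    parser.feed(html)
    parser.close()
    return parser.record


def store_record(conn, accession, record):
    """Insert one parsed record into a (hypothetical) proteins table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS proteins ("
        "accession TEXT PRIMARY KEY, name TEXT, organism TEXT, "
        "length INTEGER, disulfide_bonds INTEGER)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO proteins VALUES (?, ?, ?, ?, ?)",
        (accession, record["name"], record["organism"],
         int(record["length"]), int(record["disulfide_bonds"])),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_record(conn, "X00000", parse_record(SAMPLE_HTML))
    print(conn.execute("SELECT * FROM proteins").fetchall())
```

Storing numeric fields as integers at crawl time keeps later SQL comparisons, such as filtering on the number of disulfide bonds, straightforward.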

IV. Step 2: Filter

After crawling the data, we used DB Browser for SQLite to browse our Pantide database and SQL to process it. We then built a filter to find peptides suitable for use as Pantide.

According to previous articles, around 90% of spider venom toxin peptides contain the inhibitor cystine knot (ICK) structure, the most important domain, which reacts specifically with the voltage-gated ion channels of insects and some other receptors. [2]

Therefore, to find these spider venom toxin peptides in the Pantide database, we could start by searching for ICK peptides, whose mass is between 1 and 10 kDa and which contain at least three disulfide bonds. [2] So we set a filter with three conditions.

  • The organism must be a spider or tarantula.
  • The coding sequence is between 27 and 271 base pairs long (1 kDa of protein corresponds to about nine amino acids on average, encoded by 27 base pairs).
  • The number of disulfide bonds is greater than or equal to 3.

After filtering with these three conditions, 113 peptides remained. Next, we set another filter to find insecticidal peptides.

  • Molecular target contains “invertebrate”; peptides without data are also kept.

We kept the peptides without data because they may still be effective. At this stage, we had 63 candidates.

For the efficacy experiment of Pantide, we chose Spodoptera litura as the target insect. Our database contains 14 distinct taxa, including four Lepidoptera species. We therefore set another filter to find peptides active against Lepidoptera:

  • Taxon contains at least one of Spodoptera litura, Heliothis virescens, Manduca sexta and Spodoptera exigua; peptides without data are also kept.

In addition, we planned to produce Pantide in E. coli, in which proteins containing disulfide bonds are difficult to express. We chose the E. coli Rosetta-gami strain for enhanced disulfide bond formation, but expressing a protein with more than four disulfide bonds is still a heavy load. So we finally filtered out the peptides containing too many disulfide bonds.

  • The number of disulfide bonds is less than or equal to four.

As a result, we obtained 46 peptides that could be used as Pantide in proof-of-concept experiments, and all of them target insects’ voltage-gated ion channels (excluding NULL entries).
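The filter conditions above can be combined into a single SQL query. The sketch below runs one against an in-memory SQLite database; the table name, column names and all rows are hypothetical and synthetic, since the actual schema of the Pantide database is not shown here. Rows with NULL in the molecular target or taxon columns are kept, as described above.

```python
import sqlite3

LEPIDOPTERA = ("Spodoptera litura", "Heliothis virescens",
               "Manduca sexta", "Spodoptera exigua")

# Synthetic rows for illustration only; not entries from the Pantide database.
# Columns: name, organism_group, length_bp, disulfide_bonds, molecular_target, taxon
ROWS = [
    ("toxin_a", "spider", 111, 3, "invertebrate voltage-gated sodium channel", "Spodoptera litura"),
    ("toxin_b", "spider", 120, 4, None, None),
    ("toxin_c", "spider", 150, 5, "invertebrate voltage-gated calcium channel", "Manduca sexta"),
    ("toxin_d", "scorpion", 100, 3, "invertebrate voltage-gated sodium channel", "Heliothis virescens"),
    ("toxin_e", "spider", 300, 3, "invertebrate voltage-gated sodium channel", "Spodoptera exigua"),
    ("toxin_f", "spider", 90, 3, "vertebrate voltage-gated sodium channel", None),
]


def build_demo_db():
    """Create an in-memory database holding the synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE peptides (name TEXT, organism_group TEXT, length_bp INTEGER, "
        "disulfide_bonds INTEGER, molecular_target TEXT, taxon TEXT)"
    )
    conn.executemany("INSERT INTO peptides VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def filter_candidates(conn):
    """Apply every filter condition at once and return the surviving names."""
    taxon_clause = " OR ".join("taxon LIKE ?" for _ in LEPIDOPTERA)
    query = f"""
        SELECT name FROM peptides
        WHERE organism_group IN ('spider', 'tarantula')
          AND length_bp BETWEEN 27 AND 271
          AND disulfide_bonds BETWEEN 3 AND 4
          AND (molecular_target IS NULL OR molecular_target LIKE '%invertebrate%')
          AND (taxon IS NULL OR {taxon_clause})
        ORDER BY name
    """
    params = [f"%{species}%" for species in LEPIDOPTERA]
    return [row[0] for row in conn.execute(query, params)]


if __name__ == "__main__":
    print(filter_candidates(build_demo_db()))
```

Applying the conditions one at a time, as we did, gives the intermediate counts at each stage; a combined query like this one only returns the final candidates.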

References

[1] King, G.F.; Gentz, M.C.; Escoubas, P.; Nicholson, G.M. A rational nomenclature for naming peptide toxins from spiders and other venomous animals. Toxicon 2008, 52, 264–276.

[2] Windley, M.J.; Herzig, V.; Dziemborowicz, S.A.; Hardy, M.C.; King, G.F.; Nicholson, G.M. Spider-venom peptides as bioinsecticides. Toxins 2012, 4, 191–227.