Team:IISER-Pune-India/Model

MathJax TeX Test Page
Modelling

Introduction

Mathematical modelling often gives useful insights into the behaviour of complex systems. We model the synthetic gene oscillator circuit made by Stricker et al.(2008), where the protein product of the first gene (AraC) activates the expression of both the genes and the protein product of the second gene (LacI) inhibits expression of both the genes. It is expected that such a model would help test what modifications can be done in the mentioned oscillator, to couple it with other genes to achieve oscillations in their protein products, which are essential in the cell cycle.

The oscillator described by Stricker et al. (2008) comprises of two genes AraC and LacI, placed downstream of a $p_{lac/ara-1} $hybrid promoter, which is activated by the binding of AraC protein in the presence of arabinose and repressed by LacI protein in the absence of IPTG. This forms a dual feedback loop that causes the levels of these proteins to oscillate periodically.

The model

The reactions in the AraC-LacI oscillatory system are as follows- \[P^{a/r}_{0,j} \ + \ a_2 \xrightleftharpoons{k_1} \ P^{a/r}_{1,j} \] \[P^{a/r}_{i,0} \ + \ r_4 \xrightleftharpoons{2k_2} \ P^{a/r}_{i,1} \] \[P^{a/r}_{i,1} \ + \ r_4 \xrightleftharpoons{0.5 k_2} \ P^{a/r}_{i,2} \] Where $a$ is the Concentration of AraC protein, $r$ is the Concentration of LacI protein. $P^{a/r}_{i,j}$ represent the states of promoters on the (a)ctivator/(r)epressor plasmids with i number of AraC dimers $(a_2)$ bound and $ j \ \ \epsilon \ (0, 1, 2) $ LacI tetramers $(r_4)$ bound. The equilibrium constants $k_1,k_2$ are dependent on IPTG and Arabinose concentration as given by- \[k_1 = \frac{[ara]^2}{[ara]^2 + (2.5^2)} .\frac{1}{1 + (\frac{[iptg]}{1.8})^2}\] \[k_2 = 2*(0.19 . \frac{1}{1 + (\frac{[iptg]}{0.035})^2} + 0.01)\] Transcription reactions- \[ P^{a/r}_{0,0} \xrightarrow{b_a} \ P^{a/r}_{0,0} + m_{a/r}\] \[ P^{a/r}_{1,0} \xrightarrow{\alpha b_a} \ P^{a/r}_{0,1} + m_{a/r}\]

where $m_{a/r}$ represents the number of mRNA molecules of AraC/LacI genes. When an AraC dimer is bound too the promoter, i.e. when promoter is in state $P^{a/r}_{1,0}$, transcription rate is $\alpha$ times that in unbound state. When any number of LacI tetramers are bound, i.e $ P^{a/r}_{i,1}$ or $P^{a/r}_{i,2}$, there is no transcription.

Translation and Protein folding reactions- \[ m_a \xrightarrow{t_a} m_a + a_{uf}\] \[ m_r \xrightarrow{t_r} m_r + r_{uf}\] \[a_{uf} \xrightarrow{k_{fa}} a\] \[r_{uf} \xrightarrow{k_{fa}} r\] $a_{uf}$ and $r_{uf}$ are the unfolded monomeric versions of the activator and repressor. Dimerisation and Tetramerisation- \[a \ + \ a \xrightleftharpoons{k_{da}} \ a_2 \] \[r \ + \ r \xrightleftharpoons{k_{dr1}} \ r_2 \] \[r_2 \ + \ r_2 \xrightleftharpoons{k_{dr2}} \ r_4 \]

a and r are the folded monomeric versions of activator and repressor; $a_2$ and $r_2$ are the dimeric versions of activator and repressor; and $r_4$ represents the tetrameric version of the repressor. $k_{da}, k_{dr1}, k_{dr2}$ are the equilibrium constants of dimerisation and tetramerisation

Degradation- \[a \xrightarrow{k_a} \phi \] \[r \xrightarrow{k_r} \phi \] \[m_a \xrightarrow{\gamma_{ma}} \phi \] \[m_r \xrightarrow{\gamma_{mr}} \phi \] \[a_{uf} \xrightarrow{k_a} \phi \] \[r_{uf} \xrightarrow{k_r} \phi \] We ignored protein looping, which is considered by Stricer et al. (2008) in their model. The reaction rates and other parameter values considered are given in the following table- $$ \begin{array}{|c|c|c|c|c|} \hline \text{Sr.} & \text{Parameter} & \text{Description} & \text{value} & \text{units} \\ \hline 1& \gamma_{ma}, \gamma_{mr} & \text{Degradation rate of AraC/LacI mRNA} & 0.54 & min^{-1} \\ \hline 2 & b_a,b_r & \text{Transcription rate of AraC, LacI genes} & 0.36 & min^{-1} \\ \hline 3 & \alpha & \text{Increased transcription rate Due to AraC binding} & 20 & \\ \hline 4 & t_{a} & \text{rate of transcription of AraC} & 90 & min^{-1} \\ \hline 5 & t_{r} & \text{rate of transcription of LacI} & 90 & min^{-1} \\ \hline 6 & k_{a} & \text{Rate of degradation of AraC protein} & 0.41 & min^{-1} \\ \hline 7 & k_{r} & \text{Rate of degradation of LacI protein} & 0.163 & min^{-1} \\ \hline 8 & k_{da} & \text{AraC dimerisation constant} & 100 & molecules^{-1} \\ \hline 9 & k_{dr1} & \text{LacI dimerisation constant} & 100 & molecules^{-1} \\ \hline 10 & k_{dr2} & \text{LacI tetramerisation constant} & 100 & molecules^{-1} \\ \hline 11 & k_{fa}, k_{fr} & \text{Rate of folding of proteins} & 0.9 & min^{-1} \\ \hline 12 & n_a & \text{copy number of Plasmid having AraC} & 50 & molecules \\\hline 13 & n_r & \text{copy number of Plasmid having LacI} & 25 & molecules \\ \hline \end{array} $$

When we solved the differential equations corresponding to each reaction using Simbiology in MATLAB, however, we were unable to reproduce the results reported by Stricker et al. (2008). This was probably because of the complexity of the model. As a solution, a bottom-up approach of modelling was chosen, in which a simple model is built first and equations, parameters are added as required. The simple model, in this case, was made after the model of the cI-Lac oscillator described by Hasty et al.(2001). We fisrt reproduced the results of the cI-Lac model. The two first order coupled differential equations were solved using Python 3.6 in Spyder 2.3.2 environment, a range of parameter values was scanned to obtain oscillations. The main difference between the two oscillators is that the promoter in cI-Lac oscillator has two binding sites for cI protein and one for LacI tetramer, whereas the promoter in Ara-Lac oscillator has one binding site for AraC protein and two sites for LacI binding. We then modified the model to fit the ara-lac system and incorporated IPTG, arabinose inducibility into the model. This model did not include the intermediate reactions of transcription and translation and also assumed that the reactions are dimerisation and tetramerisation are fast, and equilibrium is attained instantaneously. These assumptions were relaxed by adding four more differential equations, that of mRNA production and protein unfolding for both AraC and LacI; and also by adding a delay in the differential equation of mRNA.

The production of mRNA is governed by the differential equation- \begin{align} \frac{dm_a}{dt} = b_a(P^{a}_{0,0} + \alpha P^{a}_{1,0}) - \gamma_{ma} m_a \\ \frac{dm_r}{dt} = b_r(P^{r}_{0,0} + \alpha P^{r}_{1,0}) - \gamma_{mr} m_r \end{align}

Here, $P^{a/r}_{0,0}, P^{a/r}_{1,0}$ are also variables, and the differential equation above is coupled to the rate equation governing promoter binding reactions for LacI and AraC. We assume that these reactions are fast reactions and attain equilibrium very quickly. Thus, we can write- \begin{align} P^{a/r}_{1,0} = k_1 a_2 P^{a/r}_{0,0} \\ P^{a/r}_{i,1} = 2k_2 r_4 P^{a/r}_{i,0} \\ P^{a/r}_{i,2} = k_2 r_4 P^{a/r}_{i,1} \\ \end{align} The dimerisation and tetramerisation reactions give us- \begin{align} \begin{split} a_2 & = k_{da} a^2 \\ r_2 & = k_{dr1} r^2 \\ r_4 & = k_{dr2} r^2_2 \\ & = k_{dr2} k^2_{dr1} r^4\\ \end{split} \end{align} We define $c_a := k_1 k_{da}$ and $c_R := k_2 k_{dr2} k^2_{dr1}$. Which gives us - \begin{align} \begin{split} P^{a/r}_{1,0} & = k_1 k_{da} a^2 P^{a/r}_{0,0} \\ & = c_a a^2 P^{a/r}_{0,0} \\ P^{a/r}_{0,1} & = 2 k_1 k_{dr2} k^2_{dr1} r^4 P^{a/r}_{0,0} \\ & = 2c_r r^4 P^{a/r}_{0,0} \\ P^{a/r}_{0,2} & = k_1 k_{dr2} k^2_{dr1} r^4 P^{a/r}_{0,1}\\ & = 2 k^2_1 k^2_{dr2} k^4_{dr1} r^8 P^{a/r}_{0,0} \\ & = c^2_r r^8 P^{a/r}_{0,0} \\ P^{a/r}_{1,1} & = 2c_r r^4 P^{a/r}_{1,0} \\ & = 2c_a c_r a^2 r^4 P^{a/r}_{0,0} \\ P^{a/r}_{1,2} & = 2c_r r^4 P^{a/r}_{1,1} \\ & = c_a c^2_r a^2 r^8 P^{a/r}_{0,0} \\ \end{split} \end{align} Substituting this in above equations, we get \begin{align} \frac{dm_a}{dt} = b_a P^{r}_{0,0}(1 + \alpha c_a a^2 ) - \gamma_{ma} m_a \\ \frac{dm_r}{dt} = b_r P^{r}_{0,0}(1 + \alpha c_a a^2 ) - \gamma_{mr} m_r \\ \end{align} Also, The total number of promoters in all states remains constant in the cell as time progresses. So we impose the condition- \begin{align} n_a = \sum_{i =0,1} \sum_{j=0,1,2}P^{a}_{i,j} \\ n_r = \sum_{i =0,1} \sum_{j=0,1,2} P^{a}_{i,j} \\ \end{align} Rewriting $P^{a/r}_{i,j}$ in terms of $P^{a/r}_{0,0}$, factorising, simplifying and rearranging - \begin{align} P^{a}_{0,0} = \frac{n_a} {(1 + c_a a^2)(1+ 2c_r r^4 + {c_r}^2 r^8) }\\ P^{r}_{0,0} = \frac{n_r} {(1 + c_a a^2)(1+ 2c_r r^4 + {c_r}^2 r^8) } \\ \end{align}

We substitue this in the above equations, to obtain two nonlinear equations for mRNA production. \begin{align} \frac{dm_a}{dt} &= \frac{n_a b_a (1+\alpha c_a a^2)}{(1 + c_a a^2)(1+ 2c_r r^4 + {c_r}^2 r^8)} - \gamma_{m_a} m_a \\ \frac{dm_r}{dt} &= \frac{n_r b_r(1+\alpha c_a a^2)}{(1 + c_a a^2)(1+2c_rr^4 + c_r^2 r^8)} - \gamma_{m_r} m_r \\ \end{align} However, these equations are not sufficient to model the entire system of the oscillator. As remarked by Stricker et al. (2008) in their paper, delay in intermediate steps is an important factor in getting sustained oscillations. One important assumption in our model till now has been that the dimerisation and tetramerisation reactions attain equilibrium instanteneously. We relax this assumption by adding a delay term. We assume that the dimerisation of AraC takes time $\tau_a = 2 min$ and tetramerisation of LacI takes time $\tau_r = 3 min$. The justification for differential delay is that tetramerisation of LacI takes more time than dimerisation of AraC. Hence, mRNA production at time t depends on time $t - \tau_{a/r}$. So, In the end, we have six coupled first order delay differential equations-

\begin{align} \frac{dm_a}{dt} &= \frac{n_a b_a (1+\alpha c_a a^2(t - \tau_a))}{(1 + c_a a^2(t - \tau_a))(1+ 2c_r r^4(t - \tau_r) + {c_r}^2 r^8(t - \tau_r))} - \gamma_{m_a} m_a \\ \frac{dm_r}{dt} &= \frac{n_r b_r(1+\alpha c_a a^2(t - \tau_a))}{(1 + c_a a^2(t - \tau_a))(1+2c_rr^4(t - \tau_a) + c_r^2 r^8(t - \tau_r))} - \gamma_{m_r} m_r \\ \frac{da_{uf}}{dt} &= t_a m_a - k_{fa} a_{uf} - \gamma_{a_{uf}} a_{uf} \\ \frac{dr_{uf}}{dt} &= t_r m_r - k_{fr} r_{uf} - \gamma_{r_{uf}} r_{uf} \\ \frac{da}{dt} &= k_{fa} a_{uf} - k_a a \\ \frac{dr}{dt} &= k_{fr} r_{uf} - k_r r \\ \end{align}

We solved these equations using the PyDDE package in Python3.6 and scanned the IPTG and Arabinose concentrations. Time period and Amplitude was found by using a custom written peak finding algorithm.

Cases tested

  • Concentration of IPTG was scanned from 0 mM to 30 mM. It repeated for different Arabinose concetrations ranging from 0% to 3%.
  • To check if our oscillator would work with both the genes on a single plasmid, we changed the plasmid copy number for LacI gene to $n_r = 50$. Concentrations of Arabinose and IPTG were kept constant at 0.7% and 3mM repectively.
  • To check whether a diffenece in RBS efficiency could compansate for equal copy numbers, we multiplied transcription factor by RBS efficiency. In our case, RBS efficiency ratio of AraC gene to LacI gene is $5:3$, which gives $t_a = 150$ min-1 and $t_a = 90$ min-1, Concentrations of Arabinose and IPTG were kept constant at 0.7% and 3mM repectively.
  • To couple GFP to our oscillator, we added three more differential equations- \begin{align} \frac{dm_a}{dt} &= \frac{n_a b_a (1+\alpha c_a a^2(t - \tau_a))}{(1 + c_a a^2(t - \tau_a))(1+ 2c_r r^4(t - \tau_r) + {c_r}^2 r^8(t - \tau_r))} - \gamma_{m_a} m_g \\ \frac{dg_{uf}}{dt} &= t_g m_g - k_{fa} g_{uf} - \gamma_{g_{uf}} g_{uf} \\ \frac{da}{dt} &= k_{fg} g_{uf} - k_g g \\ \end{align} For GFP we have taken following values- $$ \begin{array}{|c|c|c|c|c|} \hline \text{Sr.} & \text{Parameter} & \text{Description} & \text{value} & \text{units} \\ \hline 1 & k_{g} & \text{Rate of degradation of GFP} & 0.18 & min^{-1} \\ \hline 2 & k_{fg} & \text{Rate of folding of GFP} & 0.9 & min^{-1} \\ \hline 3 & t_{g} & \text{Rate of transcription of GFP} & 200 & min^{-1} \\ \hline \end{array} $$

Results

As can be seen from fig.1, our model gives stable oscillations in the concentration of AraC and LacI proteins. The period of the oscillations follows a curve given by fig.2, which matches well with the curve reported by Stricker et al.(2008).

Figure 1: Oscillations in AraC concentrations.

Figure 2: change in time period vs IPTG concentrations.



Our model predicts that we will still get oscllations if the plasmid copy number is same for $n_a = n_r = 50$. However, their Amplitude is less compared to the case of $n_a = 50, n_r = 25$. Specifically, the amplitude in AraC concentration is $1.7 \times 10^3$ molecules per cell as compared to $1.3 \times 10^4$ molecules in case of 1:2 copy number ratio.

When we change the RBS efficiency to 5:3, our model gives oscillations with the amplitude of $0.95 \times 10^4$ molecules

Figure 3:Equal copy number dampens the oscillations. Change in RBS efficiency restores the high amplitude

We obtained oscillations in GFP concentration as well.

Figure 4: Oscillations in GFP coupled to the oscillator



References

  1. Hasty, J., Dolnik, M., Rottschäfer, V., and Collins, J. J. (2002). Synthetic gene network for entraining and amplifying cellular oscillations.Phys. Rev. Lett., 88:148101
  2. Stricker, J., Cookson, S., Bennett, M. R., Mather, W. H., Tsimring,L. S., and Hasty, J. (2008). A fast, robust and tunable synthetic gene oscillator.Nature, 456(7221):516–519
  3. https://2013.igem.org/Team:HUST-China/Modelling/DDE_Model