Team:Lund/Modeling/ParameterEstimation

Modeling

Overview

In this section we provide the parameters used in the model. The values were estimated using tabulated kinetic parameters, iGEM registry data and experimental results from published literature. An overview can be found in table 1.

Parameters

Table 1: List of all parameters used in the model. For a description, see the corresponding description section further down.
Parameter Value Unit Description Reference
$\ce{C_\text{p}}$ $40$ - Copy number [1]
$\ce{V}$ $8 \cdot 10^{-16}$ L E. coli cell volume [2]
$\ce{k_\text{MR}}$ $0.11$ $\text{min}^{-1}$ MR transcription rate constant [2]a)
$\ce{k_\text{R}}$ $15$ $\ce{min}^{-1}$ R translation rate constant [2]
$\ce{k_\text{2R}}$ $50$ $\ce{\text{nM}^{-1} min^{-1}}$ R dimerization rate constant [2]
$\ce{k_\text{-2R}}$ $10^{-3}$ $\ce{min}^{-1}$ R$_2$ dissociation rate constant [2]
$\ce{k_\text{r}}$ $960$ $\text{nM}^{-1} \text{min}^{-1}$ Operator repression rate constant [2]
$\ce{k_\text{-r}}$ $2.4$ $\text{min}^{-1}$ Operator derepression rate constant [2]
$\ce{k_\text{dr1}}$ $3\cdot 10^{-7}$ $\text{nM}^{-2}\text{min}^{-1}$ Free R$_2$ repression rate constant [2]
$\ce{k_\text{-dr1}}$ $12$ $\text{min}^{-1}$ Free R$_2$ derepression rate constant [2]
$\ce{k_\text{dr2}}$ $3\cdot 10^{-7}$ $\text{nM}^{-2} \text{min}^{-1}$ Operator bound R$_2$ repression rate constant [2]
$\ce{k_\text{-dr2}}$ $4.8 \cdot 10^{3}$ $\text{nM}^{-1} \text{min}^{-1}$ Operator bound R$_2$ derepression rate constant [2]
$\ce{k_\text{MN}}$ $3.09$ $\text{min}^{-1}$ $\ce{M_R}$ transcription rate constant [3]b)
$\ce{k_\text{ME}}$ $3.18$ $\text{min}^{-1}$ $\ce{M_E}$ transcription rate constant [3]b)
$\ce{k_\text{MG}}$ $0.168$ $\text{min}^{-1}$ $\ce{M_G}$ transcription rate constant Estimatedg)
$\ce{k_\text{N}}$ $3.0$ $\text{min}^{-1}$ NahR translation rate constant [3]c)
$\ce{k_\text{E}}$ $2.86$ $\text{min}^{-1}$ GFP10-ER-GFP11 translation rate constant [3]c)
$\ce{k_\text{G}}$ $4.59$ $\text{min}^{-1}$ GFP1-9 translation rate constant [3]c)
$\ce{k_\text{aEEs}}$ $0.078$ $\text{nM}^{-1}\text{min}^{-1}$ GFP10-ER-GFP11, estrogen association rate constant [4]e)
$\ce{k_\text{dEEs}}$ $0.072$ $\text{min}^{-1}$ GFP10-ER-GFP11, estrogen dissociation rate constant [4]e)
$\ce{k_\text{aNP}}$ $4.7\cdot 10^{-3}$ $\text{nM}^{-1}\text{min}^{-1}$ NahR-Psal association rate constant [5]f)
$\ce{k_\text{dNP}}$ $11.46$ $\text{min}^{-1}$ NahR-Psal dissociation rate constant [5]f)
$\ce{k_\text{aSNP}}$ $4.61 \cdot 10^{-4}$ $\text{nM}^{-1}\text{min}^{-1}$ Salicylate-[NahR-Psal] association rate constant Estimatedg)
$\ce{k_\text{dSNP}}$ $3.81 \cdot 10^{-6}$ $\text{min}^{-1}$ Salicylate-[NahR-Psal] dissociation rate constant Estimatedg)
$\ce{k_\text{aGFP}}$ $1.8 \cdot 10^{-5}$ $\ce{\text{n}M^{-1}min^{-1}}$ Tripartite GFP association rate constant [6]h)
$\ce{k_\text{dGFP}}$ $0.0$ $\ce{min^{-1}}$ Tripartite GFP dissociation rate constant [6]h)
$\ce{k_\text{dI}}$ $0.92$ $\text{min}^{-1}$ IPTG diffusion constant [2]
$\ce{k_\text{dS}}$ $0.281$ $\text{min}^{-1}$ Salicylate diffusion rate constant [7]i)
$\ce{k_\text{dE}}$ $1.44$ $\text{min}^{-1}$ Estrogen diffusion rate constant [8]j)
$\ce{\lambda_\text{MR}}$ $0.462$ $\ce{min^{-1}}$ MR degradation rate constant [2]
$\ce{\lambda_\text{MN}}$ $0.2$ $\text{min}^{-1}$ $\ce{M_N}$ degradation rate constant [3]k)
$\ce{\lambda_\text{ME}}$ $0.2$ $\text{min}^{-1}$ $\ce{M_E}$ degradation rate constant [3]k)
$\ce{\lambda_\text{MG}}$ $0.2$ $\text{min}^{-1}$ $\ce{M_G}$ degradation rate constant [3]k)
$\ce{\lambda_\text{R}}$ $0.2$ $\ce{min}^{-1}$ R degradation rate constant [2]
$\ce{\lambda_{\text{R}_2}}$ $0.2$ $\ce{min}^{-1}$ R$_2$ degradation rate constant [2]
$\ce{\lambda_{\text{I}_2\text{R}_2}}$ $0.2$ $\ce{min}^{-1}$ I$_2$R$_2$ degradation rate constant [2]
$\ce{\lambda_\text{N}}$ $0.00578$ $\text{min}^{-1}$ NahR degradation rate constant Assumedl)
$\ce{\lambda_\text{E}}$ $0.00578$ $\text{min}^{-1}$ GFP10-ER-GFP11 degradation rate constant [3, 9]m)
$\ce{\lambda_\text{EEs}}$ $0.00578$ $\text{min}^{-1}$ Ligand bound GFP10-ER-GFP11 degradation rate constant [3, 9]m)
$\ce{\lambda_\text{G}}$ $0.0347$ $\text{min}^{-1}$ GFP1-9 degradation rate constant [Bionumbers]n)
$\ce{\lambda_\text{GFP}}$ $0.0347$ $\text{min}^{-1}$ GFP degradation rate constant [Bionumbers]n)

Description

a) The rate constant is 0.23 nM/min for an operator content of 2.08 nM [2]. This yields the rate constant $\ce{k_\text{MR} = 0.11}$ min-1

b) The average transcription rate is 50 nucleotides per second in E. coli [3]. The NahR transcript has approximately 900 nucleotides which yields $\ce{k_\text{MN}=3.33}$ min-1. The GFP10-ER-GFP11 transcript has approximately 1800 nucleotides which yields $\ce{k_\text{ME}=1.67}$ min-1. The GFP1-9 transcript has approximately 600 nucleotides which yields $\ce{k_\text{MG} = 5}$ min-1.

c) The average translation rate is 15 amino acids per second in E. coli [3]. NahR has approximately 300 amino acids which yields $\ce{k_\text{MN}=3}$ min-1. GFP10-ER-GFP11 has approximately 600 amino acids which yields $\ce{k_\text{ME}=1.5}$ min-1. The GFP1-9 transcript has approximately 200 amino acids which yields $\ce{k_\text{MG}=4.5}$ min-1.

e) The rate constants were taken for 17β-estradiol [4].

f) The NahR-Psal association and dissociation rates were taken as $k_\text{aNP} = 4.7 \cdot 10^{-3}$ nM-1min-1 and $k_\text{dNP}=11.46$ min-1 after unit conversions [5].

g) A dose-response scheme for the salicylate inducible promoter was taken from the iGEM parts registry documentation page, see figure 13. In the experimental setup, eight doses of salicylate were added every 30 minutes, activating the transcription of repressed GFP. The response was measured as an induction ratio of the fluorescence with respect to the fluorescence at the uninduced state. Fig. 1 illustrates the dose-response curve.

Figure 1: Dose response scheme for the NahR system taken from the iGEM parts registry documentation page. The data points are represented as red dots linearly interpolated with black lines.

Implementing the entire system with all parameters except for $\text{k}_\text{MG}$, $\text{k}_\text{aSNP}$ and $\text{k}_\text{dSNP}$ (see Model derivation for the equations and Implementation for implementation details), we fit the model to the dose-response scheme to obtain the parameters. In the setup we assume an induction base level of 100 nM which corresponds to approximately 100 proteins per cell at the uninduced state and assume a linearity in the concentration with respect to the induction ratio. We further assume that a maximal induction ratio is obtained at a dose of 0.001 M salicylate.

Doses of salicylate were added every 30 min and the parameters were fit as follows: we defined a penalty function $f(\boldsymbol{\theta})=f(k_\text{aSNP}, k_\text{dSNP}, k_\text{MG})$ corresponding to the root mean square difference between the models output of folded GFP1-9 and the corresponding response data. This function is then a measure of how good the model fits the measured data with respect to the three paramters $k_\text{aSNP}$, $k_\text{dSNP}$, and $k_\text{MG}$ given that all other parameters are set. The objective is then to minimize the penalty $f(\theta)$ under the stated conditions. This was done in MATLAB 2017a using LSQNONLIN which is a nonlinear least squares solver which uses the Levenberg-Marquardt algorithm. Fig. 2 shows the resulting fit and one can see that the model manages to catch the structure from the dose-response data. The resulting parameters were estimated as $\text{k}_\text{aSNP} = 4.61 \cdot 10^{-4}$ nm-1min-1, $\text{k}_\text{dSNP} = 3.81 \cdot 10^{-6}$ min-1 and $\text{k}_\text{MG} = 0.168$ min-1.

Figure 2: Fitting the model to the dose response scheme for the NahR system taken from the iGEM parts registry documentation page. As indicated by the figure, the model manages to capture the sigmoidal appeareance.

Regarding the validity of this estimation, it might seem ambiguous due to the ansatz of 100 nM induction base level. However, at the maximal induction level, this yields an expression of approximately 9000 nM of protein per cell. Assuming a molecular weight of 30kDa (a GFP has about 27 kDa) and a cell concentration of $2 \cdot 10^9$ cells per milliliter for an overnight culture [3], this would correspond to a yield of $0.432$ µg/mL, which is regarded as reasonable. In addition, the transcription rate constant was estimated as $\text{k}_\text{MG}=0.168$ min-1 which is slightly higher than the transcription rate $\text{k}_\text{MR}$ for the lac repressor but lower than the ones estimated for NahR and GFP10-ER-GFP11. This indicates that although the exact value might not be correct as the proposed model is coarse, it is likely to at least lie within the right order of magnitude.

To conclude, the ad hoc proposition of 100 nM base induction level seem reasonable as it yields plausible expression levels. In addition, the estimated transcription rate of the GFP1-9 transcript conforms with the other transcription rate constants. Finally, the model managed to fit the sigmoidal shape indicating that the reaction scheme proposed describe the system well.

h) In a paper from 2013, a tripartite split GFP-association was studied by complementing GFP1-9 with GFP10-SR-GFP11 (SR = sulfite reductase) fusion protein [6]. The construct was studied in vitro under equimolar amounts of GFP1-9 and GFP10-SR-GFP11 to yield a time-intensity fluorescence plot. We chose to estimate the association and dissociation rate parameters from the experiments carried out. Since the reactions were carried out in vitro, we assume that the time frame in which the measurement was carried out (12 hours) is too short for the proteins to decompose. In addition, we assume that once a GFP has been associated, it cannot dissociate back into it's respective fragments (this is reasonable as one does not expect a GFP under mild conditions to spontaneously dissociate into three parts). Also assuming that no side reactions occur and that everything that can react eventually reacts, the following relation for the concentration $c(t)$ of the complemented protein was derived (see appendix), $$c(t) = \begin{cases} G_0 - \frac{G_0}{1 + G_0kt}, & \mbox{if } G_0=E_0 \\ \frac{G_0E_0 - G_0E_0e^{k(G_0-E_0)t}}{E_0 - G_0e^{k(G_0-E_0)t}}, & \mbox{if } G_0\ne E_0 \end{cases} \tag{1}$$ where $G_0$ and $E_0$ are the concentrations of GFP1-9 and GFP10-SR-GFP11 at time $t=0$. At figure 3 in the article, a steady state defined as 95 % of the asymptotic concentration is reached after approximately 9 hours. As equimolar amounts of proteins were used in the paper ($G_0 = E_0 = 3.5$ µM), we get the expression $$0.95G_0 = G_0 - \frac{G_0}{1 + 9G_0k}\tag{2}$$ which yields $\ce{k_\text{aGFP}=1.0 \cdot 10^{-5}}$ nM-1min-1.

i) The constant was estimated from the permeability $p$ using the relation $k = p A / V$ where $A$ is the surface area and $V$ the cell volume. We assume that the cell is spherical, yielding the constant $\ce{k_\text{dS}=0.281}$ min-1 from the permeability $\ce{p=9 \cdot 10^{-8}}$ cm s-1 [7].

j) The constant was estimated from the diffusion constant $D$ using the relation $k = D/A$ where $A$ is the surface area. Assuming that the cell is spherical, we obtained $\ce{k_\text{dE}=1.44}$ min-1 from the diffusion constant $\ce{D=10^{7}}$ Å2/s. [8]

k) We set the lifetime of the mRNAs to 10 min, which yields the degradation rate $\lambda=0.2$ [3].

l) Protein degardation rates in E. coli vary a lot [3] and to our knowledge there exist no estimated value of the NahR degradation rate. We choose to set the half-life of NahR to 2 hours so that it takes a similar degradation rate as the other species in the model.

m) The half-life for the estrogen receptor ligand binding domain in mammalian cells has been shown to be more than 3 hours in the unbound state and 1-3 hours for the ligand bound state. Usually, protein lifetimes in E. coli cells vary a lot [3] and in the absence of information and experimental data we chose a half-life of 2 hours for both the ligand bound and unbound state.

n) The half-life of GFPs in E. coli range from a couple of minutes to hours [Bionumbers]. We set the half-life to 20 minutes.

Initial values

As the solutions to first order ordinary differential equations are not uniquely quantified unless proper initial conditions is provided, these must be specified prior to solving the system. For our system we have chosen to set the concentrations of all species to zero except for the gene content which is set at 83.2 nM. This can be calculated by the relation $c = \frac{\text{C}_\text{P}}{\text{N}_\text{A}V}$ where $\text{N}_\text{A}$ is Avogadros number, $\text{C}_\text{P}$ the copy number and $V$ the cell volume. The latter two parameters can be found in table 1.

Math stuff

To derive relation (1) we start by setting up the differential equations. We have that two species $a$ and $b$ can react irreversibly to yield $c$. We get the following reactions: $$\ce{a + b ->[k] c}\tag{3}$$ $$\ce{a ->[\lambda_1] \phi}\tag{4}$$ $$\ce{b ->[\lambda_2] \phi}\tag{5}$$ $$\ce{c ->[\lambda_3] \phi}\tag{6}$$ where $\text{a}$ corresponds to the GFP1-9 fragment and $\text{b}$ to GFP10-SR-GFP11 (SR = sulfite reductase). The product $\text{c}$ is the associated GFP. Since the reactions were carried out in vitro, we assumed that no degradation occur, i.e $\lambda_i = 0$. By balancing the masses we get the following equations: $$\frac{dc}{dt} = -\frac{db}{dt} = -\frac{da}{dt} = kab\tag{7}$$ These can be reduced into a single equation by noticing that $\frac{dc}{dt} + \frac{db}{dt} = 0$ and $\frac{dc}{dt} + \frac{da}{dt} = 0$. It can now be seen that the sum of $c$ and $b$, and $c$ and $a$ is constant. These relations can be solved to yield $$b = b_0 - c\tag{8}$$ $$a = a_0 - c\tag{9}$$ $$\frac{dc}{dt} = k(a_0-c)(b_0-c)\tag{10}$$ Here, $a_0$ and $b_0$ are the initial concentrations of $a$ and $b$. This is a separable differential equation which can be solved by separating the time and concentration terms. There are two cases, either $a_0 = b_0$ or $a_0 \neq b_0$. Starting with the former case, one gets $$\frac{dc}{dt} = k(a_0-c)^2\tag{11}$$ which can be separated and solved via the following operations: $$\frac{dc}{k(a_0-c)^2} = dt\tag{12}$$ $$\int_{0}^{c} \frac{dx}{k(a_0-x)^2} = \int_{0}^{t}dx\tag{13}$$ $$\frac{1}{k}(\frac{1}{a_0-c}-\frac{1}{a_0}) = t\tag{14}$$ $$c(t) = a_0 - \frac{a_0}{a_0kt+1}\tag{15}$$ where we have set $c(0)=0$. Similarly, one can solve the case $a_0=b_0$ by separating the varibles, partial fraction decompose the expression and solving the integrals. One then yields: $$c(t) = \frac{a_0b_0 - a_0b_0e^{k(a_0-b_0)t}}{b_0 - a_0e^{k(a_0-b_0)t}}\tag{16}$$ The entire solution to the problem is then: $$c(t) = \begin{cases} a_0 - \frac{a_0}{1 + a_0kt}, & \mbox{if } a_0=b_0 \\ \frac{a_0b_0 - a_0b_0e^{k(a_0-b_0)t}}{b_0 - a_0e^{k(a_0-b_0)t}}, & \mbox{if } a_0\ne b_0 \end{cases}\tag{17}$$

References

  1. [1] Merck Millipore pETDuet-1 docuementation. Link here
  2. [2] Michaeil Stamatakis and Nikos V. Mantzaris. "Comparision of Determinstic and Stochastic Models of the lac Operon Genetic Network". Biophysical Journal Volume 96. Feb 2009 887-906
  3. [3] Milo, R., Phillips, R. and Orme, N. (2016). Cell biology by the numbers. New York, NY: Garland Science.
  4. [4] Rebecca L. Rich, a, Lise R. Hoth, a, Kieran F. Geoghegan, a, Thomas A. Brown, a, Peter K. LeMotte, a, Samuel P. Simons, a, Preston Hensley, a, & David G. Myszka, a 2002, 'Kinetic Analysis of Estrogen Receptor/Ligand Interactions', Proceedings Of The National Academy Of Sciences Of The United States Of America, 13, p. 8562.
  5. [5] Park, H., Lim, W. and Shin, H. (2005). In vitro binding of purified NahR regulatory protein with promoter Psal. Biochimica et Biophysica Acta (BBA) - General Subjects, 1725(2), pp.247-255.
  6. [6] Cabantous, S., Nguyen, H., Pedelacq, J., Koraïchi, F., Chaudhary, A., Ganguly, K., Lockard, M., Favre, G., Terwilliger, T. and Waldo, G. (2013). A New Protein-Protein Interaction Sensor Based on Tripartite Split-GFP Association. Scientific Reports, 3(1).
  7. [7] John Gutknecht. "Aspirin, acetaminophen and proton transport through phospholipid bilayers and mitochondrial membranes". Molecular and Cellular Biochemistry 114: 3-8, 1992
  8. [8] Idit Oren, Sarel J. Fleishman, Amit Kessel, Nir Ben-Tal. "Free Diffusion of Steroid Hormones Across Biomembranes: A Simplex Search with Implicit Solvent Model Calculations". Biophysical Journal Vol. 87, August 2004, p. 768-779.
  9. [9] Kocanova, S., Mazaheri, M., Caze-Subra, S. and Bystricky, K. (2010). Ligands specify estrogen receptor alpha nuclear localization and degradation. BMC Cell Biology, 11(1), p.98.
  10. [Bionumbers] The bionumbers databse. Milo et al. Nucl. Acids Res. (2010) 38 (suppl 1): D750-D753