# Modeling

### Introduction

In this section we derive a model for the whole cell biosensor based on a system of ordinary differential equations. We choose to model one single cell deterministic and consider the dynamics of larger cell populations the same as the dynamics of a single cell, but scaled to the population size. This is reasonable although stochastic effects are present at the cellular level if one assumes that each cell is independent. The choice of modeling can then be justified by the law of large numbers: if there is a large number of identical and independent random variables (in this case cells), their average behaviour tend to converge to an expected behaviour. Thus, as the expected behaviour is obtained from the deterministic scheme, this properly reflects our biosensor system as we can assume a very large population.

In addition, we present our reasoning behind the reactions and the corresponding assumptions. We start by modeling the regulation of the pETDuet-1 plasmid which consists of one gene encoding the lac repressor and two lac regulated genes which constitute the respective pathway in our AND-logic. The biomolecular network is then presented for each pathway followed by the final GFP complementation. At the end, a deterministic scheme representing the system in terms of ordinary differential equations is presented along with the associated assumptions and boundary conditions.

### Overview

A simplified description of the biochemical network for the implemented whole cell biosensor in the pETDuet-1 plasmid is illustrated in fig. 1. The vector has one lac operator regulating each T7-promoter and one corresponding lacI which constituitively expresses the lac repressor for both operators. Upon IPTG induction, the proteins NahR and GFP10-ER-GFP11 are expressed. These correspond to the respective pathway in the AND-logic. The former protein, NahR, is a transcription factor which binds to the Psal promoter located upstreams of GFP1-9. This activates the transcription of GFP1-9 under the presence of salicylate, which serves as the first input into the gate. The GFP10-ER-GFP11 fusion protein can undergo a conformational change in the presence of estrogen, the second input. This brings the GFP10 and GFP11 linker bound fragments into close proximity to allow for a subsequent association with GFP1-9. This complementation yields a fully restored chromophore which can be detected by fluorometric methods, consequently closing the circuit. See the design section for more information.

### lac regulation

The system starts with an IPTG induction of two lac operators. We choose to model this in a greater detail than the standard Michaelis-Menten scheme as we now have two operators being repressed by one lacI. Thus, we have chosen to include the repressor dynamics with the operators but also chosen an intermediate route where we dont take into account of leak transcriptions or other second order effects. Here we provide a description of the derivation based on a study by M. Stamatakis and N. V. Mantzaris who used mass action and stochastic kinetic models to study the autoregulation and bistability of the lac operon [2].

Starting with the transcriptional pathway of the lac repressor, the lacI gene $\ce{O_R}$ which codes for the dimeric monomer $\ce{R}$ is first transcribed to yield the mRNA $\ce{M_R}$. The transcript is then translated into the protein $R$ which dimerises into the repressor $R_2$ in a reversible reaction.

$$\ce{O_R ->[k_\text{MR}] O_R + M_R}\tag{1}$$ $$\ce{M_R ->[k_\text{R}] M_R + R}\tag{2}$$ $$\ce{2R <-->[k_\text{2R}][k_\text{-2R}] R_2}\tag{3}$$

The repressor can subsequently bind to the operators $\ce{O_N}$ and $\ce{O_E}$ which regulates the expression of NahR and GFP10-ER-GFP11 to yield the corresponding repressed operators $\ce{R_2O_N}$ and $\ce{R_2O_E}$. This is modeled as two reversible reactions where the repressed operator can dissociate back into the repressor and free operator. Note that the reaction rates are equal for both equilibriums as we assume that the dynamics is the same for both operators.

$$\ce{R_2 + O_N <-->[k_r][k_{-r}] R_2O_N}\tag{4}$$ $$\ce{R_2 + O_E <-->[k_r][k_{-r}] R_2O_E}\tag{5}$$

The induction of the repressed operons proceeds via the addition of an inducer $\ce{I}$ which binds to the repressor, allowing transcription to proceed. This is modeled as a third order reaction where two entities of $\ce{I}$ binds to $\ce{R_2}$ to yield $\ce{I_2R_2}$. The reaction can take place between both free and operator bound repressors and the complex $\ce{I_2R_2}$ can dissociate back and rerepress the operators.

$$\ce{2I + R_2 <-->[k_{dr1}][k_{-dr1}] I_2R_2}\tag{6}$$ $$\ce{2I + R_2O_N <-->[k_{dr2}][k_{-dr2}] I_2R_2 + O_N}\tag{7}$$ $$\ce{2I + R_2O_E <-->[k_{dr2}][k_{-dr2}] I_2R_2 + O_E}\tag{8}$$

The inducer $\ce{I}$ is distinguished from the extracellular inducer $\ce{I_{ex}}$. The flux over the cell membrane is modeled as free diffusion.

$$\ce{I_{ex} <-->[k_{dI}][k_{dI}] I}\tag{9}$$

The transcription and translations are modeled as catalytic first order reactions. The genes regulated by $\ce{O_N}$ and $\ce{O_E}$ yield the NahR and GFP10-ER-GFP11 transcripts $\ce{M_N}$ and $\ce{M_E}$ respectively. These are further translated into the respective proteins, $\ce{N}$ and $\ce{E}$.

$$\ce{O_N ->[k_\text{MN}] O_N + M_N}\tag{10}$$ $$\ce{O_E ->[k_\text{ME}] O_E + M_E}\tag{11}$$ $$\ce{M_N ->[k_\text{N}] M_N + N}\tag{12}$$ $$\ce{M_E ->[k_\text{E}] M_E + E}\tag{13}$$

All degradations are modeled as first order reactions. It is assumed that all mRNA:s and proteins degrade with the exception of the operator bound repressor. Note that it is also assumed that the inducer does not decompose.

$$\ce{M_R ->[\lambda_\text{MR}] \phi}\tag{14}$$ $$\ce{M_N ->[\lambda_\text{MN}] \phi}\tag{15}$$ $$\ce{M_E ->[\lambda_\text{ME}] \phi}\tag{16}$$ $$\ce{R ->[\lambda_\text{R}] \phi}\tag{17}$$ $$\ce{R_2 ->[\lambda_{\text{R}_2}] \phi}\tag{18}$$ $$\ce{N ->[\lambda_\text{N}] \phi}\tag{19}$$ $$\ce{E ->[\lambda_\text{E}] \phi}\tag{20}$$ $$\ce{I_2R_2 ->[\lambda_{\text{I}_2\text{R}_2}] 2I}\tag{21}$$

### NahR

It has been shown that NahR binds to the Psal promoter as a transcription factor regardless of the presence of salicylate. The binding affinity has previously been thought to be unaffected of salicylate, but recent in vitro studies with surface plasmon resonance indicate that it in fact has a slight inhibiting effect on the binding to Psal. There has also been indications that NahR binds as a tetramer to the promoter. However, it has also recently been shown that NahR binds sequentially as a monomer to Psal and then undergoes multimerization on site. The degree of multimerization is to our knowledge not known and seems to vary with the concentration of NahR. It has been shown that immobilized Psal can bind up to eight monomers for high concentrations of NahR (> 6.8 µM), while at low concentrations (< 0.42 µM) only one monomer is bound [1]. Furthermore, many aspects of the underlying dynamics remains unclear. For example, we do not know whether one salicylate has to bind to each NahR before transcription can occur or whether the transcription rate is cumulative. Neither do we know how the transcription factor affects the promoter with respect to the amount of monomers bound. Taking all the intermediate steps into account in the model would lead to a complicated system with many species, yielding a large set of equations with a big parameter space. Consequently, as the amount of data needed to estimate a model increases exponentially with the amount of parameters, we choose to model the regulation in two steps.

The reactions proceed by NahR binding to Psal followed by salicylate binding to the NahR-Psal complex in two reversible second order reactions, allowing subsequent transcription of GFP1-9. Thus, we have chosen to not take into account of the effect of salicylate on the binding of NahR to Psal. Furthermore, we have not taken into account of the binding stoichiometries or possible transcriptionally active intermediates and instead estimated the reactions as being of second order between the fully active and inactive states. Worth noting is that despite these assumptions, the model manages to fit experimental dose-response data obtained from the iGEM parts registry page, see the parameter estimation page. Thus, the assumptions seem sufficient, at least for a high level description. By denoting the promoter as $\ce{\text{Psal}}$, salicylate as $\ce{S}$ and the induced promoter $\ce{\text{SNPsal}}$, we get the following set of reactions:

$$\ce{N + P\text{sal} <-->[k_\text{aNP}][k_\text{dNP}] NP\text{sal}}\tag{22}$$ $$\ce{S + NPsal <-->[k_\text{aSNP}][k_\text{dSNP}] SNPsal}\tag{23}$$

Once the the promoter is induced, the transciption of the GFP1-9 can start. The transcript is denoted as $\text{M}_\text{G}$. This is followed by a translation which yields the GFP1-9 fragment denoted as $\text{G}$. $$\ce{SNPsal ->[k_\text{MG}] SNPsal + M_\text{G}}\tag{24}$$ $$\ce{M_\text{G} ->[k_\text{G}] M_\text{G} + G}\tag{25}$$

In addition, we distinguish intracellular salicylate from extracellular and model the flux as free difussion over the cell membrane.

$$\ce{S_\text{ex} <-->[k_{dS}][k_{dS}]S}\tag{26}$$

All degradations are of first order and we assume that salicylate does not decompose.

$$\ce{M_G ->[\lambda_\text{MG}] \phi}\tag{27}$$ $$\ce{G ->[\lambda_\text{G}] \phi}\tag{28}$$

### GFP10-ER-GFP11

We assume equivalency between the kinetic mechanism of antagonistic and agonistic binding to the estrogen receptor. For sake of simplicity, antiestrogen is used interchangeably with estrogen throughout the text. The binding of estrogen to the GFP10-ER-GFP11 fusion protein is modeled as a reversible second order reaction where a ligand bound fusion protein is generated. By denoting estrogen as $\ce{Es}$ and the ligand bound protein $\ce{EEs}$, we get the following reaction:

$$\ce{E + Es <-->[k_{aEEs}][k_{dEEs}] EE\text{s}}\tag{29}$$

We distinguish intracellular estrogen from extracellular and model the flux as free diffusion over the cell membrane. In addition, the ligand bound estrogen receptor fusion protein degrades in a first order reaction (note that we already defined the degradation for $\ce{E}$).

$$\ce{E\text{s} <-->[k_\text{dEs}][k_\text{dEs}] E\text{s}_\text{ex}}\tag{30}$$ $$\ce{EE\text{s} ->[\lambda_E] \phi}\tag{31}$$

### GFP association

There are many ways in which one could model the tripartite association. For example, one could imagine the GFP10 and GFP11 fragments first establishing a hairpin loop prior to complementing with GFP1-9. Another mechanism would be that one of the estrogen receptor bound fragments first binds to GFP1-9 followed by an association of the second receptor bound fragment. Nevertheless, we choose to model the association as a direct bimolecular complementation as it allowed us to estimate the association and dissocation rate constants with ease due to the simple kinetic expression, see the parameter estimation section for more details. We thus have the GFP1-9 and active GFP10-ER-GFP11 associating to yield a restored GFP denoted as $\ce{GFP}$. As before, the protein decomposes in a first order reaction.

$$\ce{G + EE\text{s} <-->[k_\text{aGFP}][k_\text{dGFP}] GFP}\tag{32}$$ $$\ce{GFP ->[\lambda_\text{GFP}] \phi}\tag{33}$$

Table 1: List of species
Notation Description
$\ce{O_\text{R}}$ lacI gene
$\ce{M_\text{R}}$ LacI mRNA
$\ce{R}$ LacI monomer
$\ce{R_2}$ lac repressor
$\ce{O_\text{N}}$ lac operator regulating NahR
$\ce{O_\text{E}}$ lac operator regulating GFP10-ER-GFP11
$\ce{R_2O_\text{N}}$ Repressed lac operator regulating NahR
$\ce{R_2O_\text{E}}$ Repressed lac operator regulating GFP10-ER-GFP11
$\ce{I_\text{ex}}$ Extracellular IPTG
$\ce{I}$ Intracellular IPTG
$\ce{I_2R_2}$ Inhibited lac repressor
$\ce{M_\text{N}}$ NahR mRNA
$\ce{M_\text{E}}$ GFP10-ER-GFP11 mRNA
$\ce{N}$ NahR
$\ce{E}$ GFP10-ER-GFP11
$\ce{P\text{sal}}$ Psal promoter
$\ce{NP\text{sal}}$ NahR-Psal complex
$\ce{SNP\text{sal}}$ Salicylate-NahR-Psal complex
$\ce{S}$ Intracellular salicylate
$\ce{S_\text{ex}}$ Extracellular salicylate
$\ce{M_\text{G}}$ GFP1-9 mRNA
$\ce{G}$ Matured GFP1-9
$\ce{EEs}$ Salicylate bound GFP10-ER-GFP11
$\ce{Es}$ Intracellular estrogen
$\ce{Es_\text{ex}}$ Extracellular estrogen
$\ce{GFP}$ Complemented GFP

### Determinstic equations

For the sake of completeness, we present a determinstic representation of the network in terms of ordinary differential equations. It is assumed that all reactions follow mass action kinetics, that there is no cellular net growth and that each cell is chemically homogeneous and well stirred. In addition, we assume that the cells do not affect their environment so the concentrations of all extracellular species can be considered as constant boundary conditions. The reaction rates of (1)-(33) are then given by:

$$v_1 = k_{MR}O_R \tag{34}$$ $$v_2 = k_RM_R\tag{35}$$ $$v_{3,f} = k_{2R}R^2 \tag{36}$$ $$v_{3,r} = k_{-2R}R_2 \tag{37}$$ $$v_{4,f} = k_rR_2\cdot O_N \tag{38}$$ $$v_{4,r} = k_{-r}R_2O_N\tag{39}$$ $$v_{5,f} = k_rR_2\cdot O_E\tag{40}$$ $$v_{5,r} = k_{-r}R_2O_E\tag{41}$$ $$v_{6,f} = k_{dr1}I^2\cdot R_2\tag{42}$$ $$v_{6,r} = k_{-dr1}I_2R_2\tag{43}$$ $$v_{7,f} = k_{dr2}I^2\cdot R_2O_N\tag{44}$$ $$v_{7,r} = k_{-dr2}I_2R_2 \cdot O_N\tag{45}$$ $$v_{8,f} = k_{dr2}I^2\cdot R_2O_E\tag{46}$$ $$v_{8,r} = k_{-dr2}I_2R_2 \cdot O_E\tag{47}$$ $$v_{9,f} = k_{dI}I_{ex}\tag{48}$$ $$v_{9,r} = k_{dI}I\tag{49}$$ $$v_{10} = k_{MN}O_N\tag{50}$$ $$v_{11} = k_{ME}O_E\tag{51}$$ $$v_{12} = k_{N}M_N\tag{52}$$ $$v_{13} = k_{E}M_E\tag{53}$$ $$v_{14} = \lambda_{MR}M_R\tag{54}$$ $$v_{15} = \lambda_{MN}M_N\tag{55}$$ $$v_{16} = \lambda_{ME}M_E\tag{56}$$ $$v_{17} = \lambda_{R}R\tag{57}$$ $$v_{18} = \lambda_{R_2}R_2\tag{58}$$ $$v_{19} = \lambda_{N}N\tag{59}$$ $$v_{20} = \lambda_{E}E\tag{60}$$ $$v_{21} = \lambda_{I_2R_2}I_2R_2 \tag{61}$$ $$v_{22,f} = k_{aNP}N\cdot Psal\tag{62}$$ $$v_{22,r} = k_{dNP}NPsal\tag{63}$$ $$v_{23,f} = k_{aSNP}S\cdot NPsal\tag{64}$$ $$v_{23,r} = k_{dSNP}SNPsal\tag{65}$$ $$v_{24} = k_{MG}SNPsal\tag{66}$$ $$v_{25} = k_{G}M_G\tag{67}$$ $$v_{26,f} = k_{dS}S_{ex}\tag{68}$$ $$v_{26,r} = k_{dS}S\tag{69}$$ $$v_{27} = \lambda_{MG}M_G\tag{70}$$ $$v_{28} = \lambda_GG\tag{71}$$ $$v_{29,f} = k_{aEEs}E\cdot Es\tag{72}$$ $$v_{29,r} = k_{aEEs}EEs\tag{74}$$ $$v_{30,f} = k_{dEs}Es\tag{75}$$ $$v_{30,r} = k_{dEs}Es_{ex}\tag{75}$$ $$v_{31} = \lambda_EEEs\tag{76}$$ $$v_{32,f} = k_{aGFP}G\cdot EEs\tag{77}$$ $$v_{32,r} = k_{dGFP}GFP\tag{78}$$ $$v_{33} = \lambda_{GFP}GFP\tag{79}$$

By balancing the masses, we get the set of differential equations (80-105). Note that the rate equations for the extracellular inducers are set to zero, as they are considered as boundary conditions under the assumption that the cell does not affect it's environment.

$$\frac{dO_R}{dt} = 0 \tag{80}$$ $$\frac{dM_R}{dt} = v_1 - v_{14} \tag{81}$$ $$\frac{dR}{dt} = v_2 - 2v_{3,f} + 2v_{3,r} - v_{17} \tag{82}$$ $$\frac{dR_2}{dt} = v_{3,f} - v_{3,r} - v_{4,f} + v_{4,r} - v_{5,f} + v_{5,r} - v_{6,f} + v_{6,r} - v_{18} \tag{83}$$ $$\frac{dO_N}{dt} = -v_{4,f} + v_{4,r} + v_{7,f} - v_{7,r} \tag{84}$$ $$\frac{dO_E}{dt} = -v_{5,f} + v_{5,r} + v_{8,f} - v_{8,r} \tag{85}$$ $$\frac{dR_2O_N}{dt} = v_{4,f} - v_{4,r} - v_{7,f} + v_{7,r} \tag{86}$$ $$\frac{dR_2O_E}{dt} = v_{5,f} - v_{5,r} - v_{8,f} + v_{8,r} \tag{87}$$ $$\frac{dI_{ex}}{dt} = 0 \tag{88}$$ $$\frac{dI}{dt} = v_{9,f} - v_{9,r} + 2v_{21} \tag{89}$$ $$\frac{dI_2R_2}{dt} = v_{6,f} - v_{6,r} + v_{7,f} - v_{7,r} + v_{8,f} - v_{8,r} - v_{21} \tag{90}$$ $$\frac{dM_N}{dt} = v_{10} - v_{15} \tag{91}$$ $$\frac{dM_E}{dt} = v_{11} - v_{16} \tag{92}$$ $$\frac{dN}{dt} = v_{12} - v_{19} - v_{22,f} + v_{22,r} \tag{93}$$ $$\frac{dE}{dt} = v_{13} - v_{20} - v_{29,f} + v_{29,r} \tag{94}$$ $$\frac{dPsal}{dt} = -v_{22,f} + v_{22,r} \tag{95}$$ $$\frac{dNPsal}{dt} = v_{22,f} - v_{22,r} - v_{23,f} + v_{23,r} \tag{96}$$ $$\frac{dSNPsal}{dt} = v_{23,f} - v_{23,r} \tag{97}$$ $$\frac{dS}{dt} = -v_{23,f} + v_{23,r} + v_{26,f} - v_{26,r} \tag{98}$$ $$\frac{dS_{ex}}{dt} = 0 \tag{99}$$ $$\frac{dM_G}{dt} = v_{24} - v_{27} \tag{100}$$ $$\frac{dG}{dt} = v_{25} - v_{28} - v_{32,f} + v_{32,r} \tag{101}$$ $$\frac{dEEs}{dt} = v_{29,f} - v_{29,r} - v_{31} - v_{32,f} + v_{32,r} \tag{102}$$ $$\frac{dEs}{dt} = - v_{29,f} + v_{29,r} - v_{30,f} + v_{30,r} \tag{103}$$ $$\frac{dEs_ex}{dt} = 0 \tag{104}$$ $$\frac{dGFP}{dt} = v_{32,f} - v_{32,r} - v_{33} \tag{105}$$

### References

1. [1] Park, H., Lim, W. and Shin, H. (2005). In vitro binding of purified NahR regulatory protein with promoter Psal. Biochimica et Biophysica Acta (BBA) - General Subjects, 1725(2), pp.247-255.