Revision as of 08:36, 29 October 2017

The aim of modelling in synthetic biology is to simulate the behaviour of your project to gain insight into how to best improve it. For our project, we saw three levels at which modelling could aid in the pursuit of its central aims.

As has been explored on our integrated human practices and applied design pages, the problem of insulin accessibility is complex and multi-faceted. As such, we decided it was not enough to consider our project as a problem whose solution could be found solely in a test tube. Distilled down, our project can be viewed as three sequential aims which we believe together can be used to address insulin accessibility.

Our modelling efforts were split into three branches, which reflected these major aspects of our project:

Difficulties in optimising the production of recombinant insulin are a key issue limiting its accessibility.

In silico experiments simulating how best to optimise expression led to theoretical insights which informed the direction of our efforts.

It is imperative to test the feasibility of our recombinant insulin as a therapeutic for diabetics.

Modelling the effects of changes to insulin’s biochemical makeup on its therapeutic effects supplemented our wet-lab efforts to characterise our molecule.

In addition, the project would be moot without a consideration of the insulin market as a whole.

Modelling helped us to gain insight into the global insulin market, which informed our approach towards entrepreneurship.

Experimental

Physiological

Economic

We began our experimental modelling with a mechanistic model of an E. coli cell developed in [1]. We integrated models of our expression systems into this whole cell model so that our predictions would reflect reality more accurately. Recombinant protein expression occurs within a complex cellular environment with finite resources. A model which ignores the activities of the host cell would miss important host-circuit interactions, and ignoring the finite resources of the cell may skew our predictions of the yield of our expression systems. See below for details on the whole cell model we used.

Whole Cell Model

A model of the E. coli cell including nutrient import, its conversion to cellular energy, and the transcription and translation of four categories of proteins was developed in [1]. The model accounts for the finite levels of cellular energy, ribosomes and cell mass.

Tables 1 and 2 detail the list of reactions that were considered in the model.

Table 1: List of reactions relating to the expression and degradation of four protein species considered in whole cell model developed in [1]
Protein Species (symbol)	Dilution of protein	Transcription	Dilution/degradation of mRNA	Ribosome binding	Dilution of ribosome-bound mRNA	Translation
Ribosomes \(\color{#3e3f3f}{(r)}\)	\[ \color{#3e3f3f}{ r\xrightarrow{\lambda}\varnothing }\]	\[ \color{#3e3f3f}{ \varnothing\xrightarrow{\omega_r}m_r}\]	\[ \color{#3e3f3f}{ m_r\xrightarrow{\lambda+d_m}\varnothing}\]	\[ \color{#3e3f3f}{ r+m_r\xrightarrow{k_b, k_u}c_r}\]	\[ \color{#3e3f3f}{ c_r\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ n_ra+c_r\xrightarrow{\upsilon_r}r+m_r+r}\]
Transporter enzyme \(\color{#3e3f3f}{(e_t)}\)	\[ \color{#3e3f3f}{ e_t\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{\varnothing\xrightarrow{\omega_t}m_t}\]	\[ \color{#3e3f3f}{ m_t\xrightarrow{\lambda+d_m}\varnothing}\]	\[ \color{#3e3f3f}{ r+m_t\xrightarrow{k_b, k_u}c_t}\]	\[ \color{#3e3f3f}{ c_t\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ n_ta+c_t\xrightarrow{\upsilon_t}r+m_t+e_t}\]
Metabolic enzyme \(\color{#3e3f3f}{(e_m)}\)	\[ \color{#3e3f3f}{ e_m\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ \varnothing\xrightarrow{\omega_m}m_m}\]	\[ \color{#3e3f3f}{m_m\xrightarrow{\lambda+d_m}\varnothing}\]	\[ \color{#3e3f3f}{ r+m_m\xrightarrow{k_b, k_u}c_m}\]	\[ \color{#3e3f3f}{ c_m\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ n_ma+c_m\xrightarrow{\upsilon_m}r+m_m+e_m}\]
Growth-independent/ housekeeping proteins \(\color{#3e3f3f}{(q)}\)	\[ \color{#3e3f3f}{ q\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ \varnothing\xrightarrow{\omega_q}m_q}\]	\[ \color{#3e3f3f}{ m_q\xrightarrow{\lambda+d_m}\varnothing}\]	\[ \color{#3e3f3f}{ r+m_q\xrightarrow{k_b, k_u}c_q}\]	\[ \color{#3e3f3f}{ c_q\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ n_qa+c_q\xrightarrow{\upsilon_q}r+m_q+q}\]

Table 2: Nutrient metabolism and cellular energy reactions in whole cell model developed in [1]
Species (symbol)	Dilution	Nutrient Import	Metabolism
Internal Nutrient \(\color{#3e3f3f}{(s_i)}\)	\[ \color{#3e3f3f}{ s_i\xrightarrow{\lambda}\varnothing}\]	\[ \color{#3e3f3f}{ s\xrightarrow{\upsilon_{imp}}s_i}\]	\[ \color{#3e3f3f}{ s_i\xrightarrow{\upsilon_{cat}}n_sa}\]
ATP \(\color{#3e3f3f}{(a)}\)	\[ \color{#3e3f3f}{ a\xrightarrow{\lambda}\varnothing}\]	-	-

See table 3 for notation relating to rates and parameters in [1].

Table 3: Notation for rates in Weisse et al. model
Symbol	Meaning
\[ \color{#3e3f3f}{ \upsilon_{imp}}\]	Rate of nutrient import
\[\color{#3e3f3f}{\upsilon_{cat}}\]	Rate of nutrient metabolism
\[\color{#3e3f3f}{\lambda}\]	Growth Rate
\[\color{#3e3f3f}{n_x\textrm{ with } x\in\{r,t,m,q\}}\]	Lengths of the four species of proteins
\[\color{#3e3f3f}{\upsilon_x \textrm{ with } x\in\{r,t,m,q\}}\]	Translation rates of the four species of proteins
\[\color{#3e3f3f}{k_b}\]	mRNA ribosome binding rate
\[\color{#3e3f3f}{k_u}\]	mRNA ribosome unbinding rate
\[\color{#3e3f3f}{\omega_x \textrm{ with } x\in\{r,t,m,q\}}\]	Transcription rates of the four species of proteins
\[\color{#3e3f3f}{d_m}\]	mRNA degradation rate

A system of 14 differential equations was derived from these reactions.
\[ \color{#3e3f3f}{ \frac{d}{dt}s_i=\upsilon_{imp} (e_t,s)-\upsilon_{cat}(e_m,s_i)-\lambda s_i}\]
\[ \color{#3e3f3f}{ \frac{d}{dt}a=n_s\cdot\upsilon_{cat}(e_m,s_i)-\sum_{x\in\{r,t,m,q\}}n_x\upsilon_x(c_x,a)-\lambda a}\]
\[ \color{#3e3f3f}{ \frac{d}{dt}r=\upsilon_r(c_r,a)-\lambda r+\sum_{x\in\{r,t,m,q\}} (\upsilon_x(c_x,a)-k_brm_x+k_uc_x)}\]
\[ \color{#3e3f3f}{ \frac{d}{dt}e_t=\upsilon_t(c_t,a)-\lambda e_t}\]
\[ \color{#3e3f3f}{ \frac{d}{dt}e_m=\upsilon_m(c_m,a)-\lambda e_m}\]
\[ \color{#3e3f3f}{ \frac{d}{dt}q=\upsilon_q(c_q,a)-\lambda q}\]
\[ \color{#3e3f3f}{ \frac{d}{dt}m_x=\omega_x(a)-(\lambda+d_m)m_x+\upsilon_x(c_x,a)-k_brm_x+k_uc_x \qquad \textrm{for } x\in\{r,t,m,q\}}\]
\[ \color{#3e3f3f}{\frac{d}{dt} c_x=-\lambda c_x+k_brm_x-k_uc_x-\upsilon_x(c_x,a) \qquad \textrm{for } x\in\{r,t,m,q\}}\]

We then developed a model to reflect the production of our recombinant insulin. First we modelled the production of recombinant insulin in an E. coli cytoplasm. We included transcription, translation, folding and aggregation into inclusion bodies, as well as dilution and degradation. See below for details on our model of recombinant insulin expression in the cytoplasm.

Cytoplasmic Expression Model

We modelled the rate of change of five biochemical species in the cell (Table 4).

Table 4. Cytoplasmic Expression Model Variables
Symbol	Meaning
\[\color{#3e3f3f}{m_p}\]	free mRNA of recombinant protein
\[\color{#3e3f3f}{c_p}\]	ribosome-bound mRNA of recombinant protein
\[\color{#3e3f3f}{p_u}\]	Unfolded recombinant protein
\[\color{#3e3f3f}{p_f}\]	Folded recombinant protein
\[\color{#3e3f3f}{p_a}\]	recombinant protein aggregated in inclusion bodies

A diagram showing the species we modelled and notation used is shown below:

Figure 1. Schematic of the processes considered in the cytoplasmic protein expression model. See Tables 4 and 5 for the notation used for protein species and rates.

Table 5 details the reactions considered in the model.

Table 5. List of reactions considered in cytoplasmic protein expression model
Process	Reaction	Rate
Transcription	\[\color{#3e3f3f}{\varnothing\rightarrow m_p}\]	\[\color{#3e3f3f}{\omega_p(a)}\]
Dilution and degradation of mRNA	\[\color{#3e3f3f}{m_p\rightarrow\varnothing}\]	\[\color{#3e3f3f}{\lambda+d_m}\]
ribosome binding	\[\color{#3e3f3f}{r+m_p\rightleftharpoons c_p}\]	\[\color{#3e3f3f}{\textrm{forward: } k_b \textrm{, reverse: } k_u}\]
Dilution of ribosome-bound mRNA	\[\color{#3e3f3f}{c_p\rightarrow\varnothing}\]	\[\color{#3e3f3f}{\lambda}\]
Translation	\[\color{#3e3f3f}{n_pa+c_p\rightarrow m_p+p_u+r}\]	\[\color{#3e3f3f}{\upsilon_p(c_p,a)}\]
Aggregation	\[\color{#3e3f3f}{p_u\rightarrow p_a}\]	\[\color{#3e3f3f}{k_a}\]
Folding	\[\color{#3e3f3f}{p_u\rightarrow p_f}\]	\[\color{#3e3f3f}{k_f}\]
Dilution and degradation of folded protein	\[\color{#3e3f3f}{p_f\rightarrow \varnothing} \]	\[\color{#3e3f3f}{\lambda+k_d}\]

Here \(\omega_p(a)\), the rate of transcription, is energy dependent. We used the form of the transcription rate given in Weisse et al. [1]. That is,

\[\color{#3e3f3f}{\omega_p(a)=w_p \frac{a}{\theta_p+a} } \]

Where \(w_p\) is the maximal rate of transcription, which depends on the speed of transcriptional elongation as well as on gene length, induction and copy number; \(a\) is the energy available in the cell (e.g. ATP); and \(\theta_p\) is the transcriptional threshold of the recombinant protein.

In addition, we used the form in Weisse et al. for the translation rate term:

\[\color{#3e3f3f}{\upsilon_p(c_p,a)=c_p \frac{\gamma(a)}{n_p} } \]

Where \(n_p\) is the length of the recombinant protein, and \(\gamma(a)\) is an expression for the rate of translational elongation:

\[\color{#3e3f3f}{\gamma(a)=\frac{\gamma_{max} a}{K_{\gamma} + a} } \]

Where \(\gamma_{max}\) is the maximal rate of translation, \(K_{\gamma}\) is the translational elongation threshold, and \(a\) is the energy in the cell.
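The three rate expressions above can be written as small functions. This is a minimal sketch, not the team's MATLAB code: the default parameter values are copied from the parameter table for the cytoplasmic model (Table 6), the value chosen for `W_P` is an arbitrary illustration inside the range the text explores, and the function names are hypothetical.

```python
# Parameter values copied from Table 6. W_P is varied in the text; 100 is an
# arbitrary illustrative choice, not a value reported by the authors.
W_P = 100.0         # maximal transcription rate [mRNAs/min] (assumed)
THETA_P = 4.38      # transcriptional threshold [molecs/cell]
N_P = 312           # length of recombinant protein [aa]
GAMMA_MAX = 1260.0  # maximal rate of translation [aa/min/molec]
K_GAMMA = 7.0       # translational elongation threshold [molecs/cell]


def omega_p(a, w_p=W_P, theta_p=THETA_P):
    """Energy-dependent transcription rate: w_p * a / (theta_p + a)."""
    return w_p * a / (theta_p + a)


def gamma(a, gamma_max=GAMMA_MAX, k_gamma=K_GAMMA):
    """Translational elongation rate: gamma_max * a / (K_gamma + a)."""
    return gamma_max * a / (k_gamma + a)


def upsilon_p(c_p, a, n_p=N_P):
    """Translation rate of the recombinant protein: c_p * gamma(a) / n_p."""
    return c_p * gamma(a) / n_p


# Both saturating terms reach half of their maximum when a equals the threshold.
print(omega_p(THETA_P), gamma(K_GAMMA))
```

Both terms saturate in the cellular energy \(a\), which is how energy depletion in the host feeds back on expression of the recombinant protein.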

For the model of inclusion body aggregation, we assumed first-order deposition of monomers of unfolded protein, dependent on the concentration of unfolded protein, as in Hoffmann et al. (2001) [2].

Using the law of mass action kinetics we can derive a set of ordinary differential equations from these reactions.

Summary of Cytoplasmic Expression Model

\[ \color{#3e3f3f}{\frac{d}{dt}{m}_p=\omega_p(a)+\upsilon_p(c_p,a)+k_uc_p-(\lambda +d_m)m_p-k_brm_p} \] \[ \color{#3e3f3f}{\frac{d}{dt}{c}_p=k_brm_p-\lambda c_p-k_uc_p-\upsilon_p(c_p,a)}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_u=\upsilon_p(c_p,a)-(k_f+k_a+\lambda)p_u}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_a=k_ap_u-\lambda p_a}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_f=k_fp_u-(k_d+\lambda) p_f}\]

Parametrising the model

Table 6 shows the parameters we needed to find for our model, and the values we used.

Table 6. Cytoplasmic Expression Model Parameters. * Set to 0 as degradation is dominated by the rate of dilution due to cell division for stable proteins [3]
Symbol	Meaning	Default value	Units	Source
\[\color{#3e3f3f}{w_p}\]	Maximal rate of transcription	<10^3	mRNAs/min	Proportional to induction level. Varied around realistic values as recommended by [1]
\[\color{#3e3f3f}{\theta_p}\]	transcriptional threshold of the recombinant protein	4.38	[molecs/cell]	[1]
\[\color{#3e3f3f}{n_p}\]	Length of recombinant protein	312/255	[aa/molecs]	Length of cytoplasmic proinsulin/winsulin gBlock
\[\color{#3e3f3f}{\gamma_{max}}\]	Maximal rate of translation	1260	[aa/ min molecs]	[1]
\[\color{#3e3f3f}{K_{\gamma}}\]	Translational elongation threshold	7	[molecs/ cell]	[1]
\[\color{#3e3f3f}{k_u}\]	Rate of unbinding of mRNA and ribosomes	1	[/min]	[1]
\[\color{#3e3f3f}{k_b}\]	Rate of binding of mRNA and ribosomes	1	[cell/ min molecs]	[1]
\[\color{#3e3f3f}{d_m}\]	degradation rate of mRNA	0.1	[/min]	[1]
\[\color{#3e3f3f}{k_f}\]	Rate of protein folding	0.14	[/min]	adapted to fit units from [2]
\[\color{#3e3f3f}{k_a}\]	Rate of protein aggregation	0.21	[/min]	adapted to fit units from [2]
\[\color{#3e3f3f}{k_d}\]	Rate of protein degradation	0		*
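The five cytoplasmic equations can be integrated numerically as a quick sanity check. The sketch below is written under loud assumptions: the host quantities \(a\), \(r\) and \(\lambda\) are frozen at hypothetical constants (in the team's work they are dynamic variables of the whole cell model from [1]), \(w_p\) is set to an arbitrary value, and a hand-written fourth-order Runge-Kutta stepper in pure Python stands in for the MATLAB solver. It illustrates the structure of the model and does not reproduce the reported yields.

```python
# Table 6 parameters; w_p is an arbitrary choice inside the explored range.
PARAMS = dict(w_p=100.0, theta_p=4.38, n_p=312, gamma_max=1260.0, K_gamma=7.0,
              k_u=1.0, k_b=1.0, d_m=0.1, k_f=0.14, k_a=0.21, k_d=0.0)
# ASSUMPTION: hypothetical frozen host state (energy, free ribosomes, growth rate).
HOST = dict(a=1000.0, r=10.0, lam=0.02)


def rhs(y, p=PARAMS, h=HOST):
    """Right-hand side of the cytoplasmic expression model."""
    m_p, c_p, p_u, p_a, p_f = y
    a, r, lam = h["a"], h["r"], h["lam"]
    omega = p["w_p"] * a / (p["theta_p"] + a)            # transcription
    ups = c_p * p["gamma_max"] * a / (p["K_gamma"] + a) / p["n_p"]  # translation
    return [
        omega + ups + p["k_u"] * c_p - (lam + p["d_m"]) * m_p - p["k_b"] * r * m_p,
        p["k_b"] * r * m_p - lam * c_p - p["k_u"] * c_p - ups,
        ups - (p["k_f"] + p["k_a"] + lam) * p_u,
        p["k_a"] * p_u - lam * p_a,
        p["k_f"] * p_u - (p["k_d"] + lam) * p_f,
    ]


def rk4(f, y0, t_end, dt):
    """Classical fixed-step fourth-order Runge-Kutta integration."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        k1 = f(y)
        k2 = f([yi + 0.5 * dt * d for yi, d in zip(y, k1)])
        k3 = f([yi + 0.5 * dt * d for yi, d in zip(y, k2)])
        k4 = f([yi + dt * d for yi, d in zip(y, k3)])
        y = [yi + dt * (d1 + 2 * d2 + 2 * d3 + d4) / 6.0
             for yi, d1, d2, d3, d4 in zip(y, k1, k2, k3, k4)]
    return y


# Start with no recombinant species and integrate for 1000 minutes.
m_p, c_p, p_u, p_a, p_f = rk4(rhs, [0.0] * 5, t_end=1000.0, dt=0.05)
print(f"folded: {p_f:.1f}  aggregated: {p_a:.1f}  p_a/p_f: {p_a / p_f:.3f}")
```

With \(k_d = 0\), the aggregated and folded pools obey the same linear equation up to the factors \(k_a\) and \(k_f\), so in this frozen-host sketch their ratio stays at \(k_a/k_f = 1.5\): more protein ends up in inclusion bodies than folds.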

Next, we modelled the steps in our periplasmic expression system, including transcription, translation, translocation, and folding in the periplasm.

Periplasmic Expression Model

We looked at periplasmic expression of our recombinant protein in E. coli. We modelled the rate of change of six species (Table 7).

Table 7. Periplasmic Expression Model Variables
Symbol	Meaning
\[\color{#3e3f3f}{m_p}\]	free mRNA of recombinant protein
\[\color{#3e3f3f}{c_p}\]	ribosome-bound mRNA of recombinant protein
\[\color{#3e3f3f}{p_c}\]	Unfolded recombinant protein in the cytoplasm
\[\color{#3e3f3f}{p_t}\]	Unfolded recombinant protein bound to transporter
\[\color{#3e3f3f}{p_u}\]	Unfolded recombinant protein in the periplasm
\[\color{#3e3f3f}{p_f}\]	Folded recombinant protein in the periplasm

A diagram showing the species we modelled and the notation used is shown below (figure 2).

Figure 2. Schematic of the reactions considered in the periplasmic protein expression model. See Tables 7 and 8 for the notation used for protein species and rates.

The reactions in Table 8 were considered.

Table 8. List of reactions considered in periplasmic protein expression model
Process	Reaction	Rate
Transcription	\[\color{#3e3f3f}{\varnothing\rightarrow m_p}\]	\[\color{#3e3f3f}{\omega_p(a)}\]
Dilution and degradation of mRNA	\[\color{#3e3f3f}{m_p\rightarrow\varnothing}\]	\[\color{#3e3f3f}{\lambda+d_m}\]
ribosome binding	\[\color{#3e3f3f}{r+m_p\rightleftharpoons c_p}\]	\[\color{#3e3f3f}{\textrm{forward: } k_b \textrm{, reverse: } k_u}\]
Dilution of ribosome-bound mRNA	\[\color{#3e3f3f}{c_p\rightarrow\varnothing}\]	\[\color{#3e3f3f}{\lambda}\]
Translation	\[\color{#3e3f3f}{n_pa+c_p\rightarrow m_p+p_c+r}\]	\[\color{#3e3f3f}{\upsilon_p(c_p,a)}\]
Translocator binding	\[\color{#3e3f3f}{p_c+t\rightarrow p_t}\] where \(t\) refers to the amount of translocons	\[\color{#3e3f3f}{k_{bt}}\]
Translocation	\[\color{#3e3f3f}{p_t\rightarrow p_u}\]	\[\color{#3e3f3f}{\tau_p(p_t,a)}\]
Folding	\[\color{#3e3f3f}{p_u\rightarrow p_f}\]	\[\color{#3e3f3f}{k_f}\]
Dilution and degradation of folded protein	\[\color{#3e3f3f}{p_f\rightarrow \varnothing} \]	\[\color{#3e3f3f}{\lambda+k_d}\]

Here, \(\omega_p(a)\) and \(\upsilon_p(c_p,a)\) are as in the cytoplasmic reactions. The amount being transported is given by the term \(\tau_p(p_t,a)\). Protein translocation to the periplasm occurs via an ATP-dependent motor protein, SecA [4]. Post-translational translocation uses ATP as a stepwise source of energy to drive the protein through the membrane. It follows the mechanism illustrated in Figure 3 [4].

Figure 3. A simplified mechanism of post-translational translocation. The SecA-SecYEG-protein \((p_c)\) complex binds ATP in a reversible reaction. The SecA-bound ATP is hydrolysed, causing SecA to release itself from the protein-SecYEG complex. SecA re-binds the protein-SecYEG complex, displacing the polypeptide through the channel by ~25 amino acid residues. Binding of ATP to SecA then drives the peptide through another ~25 residues. The steps are repeated \(\frac{n_p}{50}\) times, where \(n_p\) is the length in amino acids of the protein. The polypeptide is then released into the periplasm.

Following the logic used to derive the translation rate in [1], we derive the net rate of translocating a protein \(p\) by defining \(K_p:=\frac{k_1k_2}{k_{-1}+k_2}\). This leads to

\[\color{#3e3f3f}{\tau_p(p_t,a)=p_t\Big(\frac{n_p}{50}\Big(\frac{1}{K_pa}+\frac{1}{k_2} \Big)+\frac{1}{k_t}\Big)^{-1}}\]

If we assume the final termination step is fast, so \(\frac{1}{k_t}\ll \frac{n_p}{50}\Big(\frac{1}{K_pa}+\frac{1}{k_2} \Big) \), this is approximately equal to

\[\color{#3e3f3f}{\tau_p(p_t,a)\approx 50p_t \frac{\epsilon(a)}{n_p}\qquad \epsilon(a):=\frac{\epsilon_{max}a}{K_{\epsilon}+a} }\]

Where \(\color{#3e3f3f}{\epsilon_{max}}\) is the maximal translocation rate, \(\color{#3e3f3f}{K_{\epsilon}}\) is the translocational threshold, and \(\color{#3e3f3f}{n_p}\) is the length of the protein in amino acids.
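The approximation above can be checked numerically. Dropping the \(1/k_t\) term and rearranging the full expression gives the saturating form with \(\epsilon_{max}=k_2\) and \(K_{\epsilon}=k_2/K_p=(k_{-1}+k_2)/k_1\); this identification follows algebraically from the two formulas and is not stated explicitly in the text. The rate constants in the sketch are hypothetical values chosen only to exercise the formulas.

```python
def tau_full(p_t, a, n_p, k1, k_minus1, k2, k_t):
    """Full translocation rate, including the final termination step."""
    K_p = k1 * k2 / (k_minus1 + k2)
    steps = n_p / 50.0  # number of ~50-residue translocation cycles
    return p_t / (steps * (1.0 / (K_p * a) + 1.0 / k2) + 1.0 / k_t)


def tau_approx(p_t, a, n_p, eps_max, K_eps):
    """Approximate rate for fast termination: 50 * p_t * eps(a) / n_p."""
    eps = eps_max * a / (K_eps + a)
    return 50.0 * p_t * eps / n_p


# Hypothetical rate constants; k_t is large so that termination is fast.
k1, k_minus1, k2, k_t = 1.0, 5.0, 20.0, 1e6
eps_max = k2                  # follows from the algebra with 1/k_t dropped
K_eps = (k_minus1 + k2) / k1  # equals k2 / K_p

full = tau_full(100.0, 100.0, 312, k1, k_minus1, k2, k_t)
approx = tau_approx(100.0, 100.0, 312, eps_max, K_eps)
print(full, approx)  # the two values agree closely when k_t is large
```

Lowering `k_t` towards the other rate constants makes the two values diverge, which shows when the fast-termination assumption stops being reasonable.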

Parametrising Translocation

To find the translocation parameters \(\color{#3e3f3f}{\epsilon_{max}}\) and \(\color{#3e3f3f}{K_{\epsilon}}\), we used kinetic parameters determined in [5]. The authors measured translocation of proOmpA, a 346 aa protein, and found that the apparent \(K_m\) of SecA was 50 nM and the maximal translocation rate was 2.7 proOmpA/site/min. A concentration of \(\color{#3e3f3f}{1nM}\) in E. coli is \(\color{#3e3f3f}{\approx}\) 1 molecule/cell [6], so \(\color{#3e3f3f}{K_{\epsilon}=K_m=50 \textrm{ molecs/cell}} \). Using the length of proOmpA, the maximal rate converts to 2.7 proOmpA/site/min \(\cdot\) 346 aa/proOmpA \(=\) 934.2 aa/molec/min.
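The unit conversion in this paragraph is short enough to spell out. The sketch below recomputes both translocation parameters from the figures quoted from [5] and [6] and plugs them into the approximate translocation rate; the constant and function names are hypothetical.

```python
PROOMPA_LENGTH_AA = 346       # length of proOmpA [aa]
RATE_PROOMPA = 2.7            # measured rate [proOmpA/site/min], from [5]
KM_SECA_NM = 50.0             # apparent Km of SecA [nM], from [5]
MOLECULES_PER_CELL_PER_NM = 1.0  # ~1 molecule/cell per nM in E. coli [6]

# proOmpA/site/min * aa/proOmpA -> aa/molec/min
EPS_MAX = RATE_PROOMPA * PROOMPA_LENGTH_AA
# nM -> molecs/cell
K_EPS = KM_SECA_NM * MOLECULES_PER_CELL_PER_NM


def tau_p(p_t, a, n_p):
    """Approximate translocation rate 50 * p_t * eps(a) / n_p."""
    eps = EPS_MAX * a / (K_EPS + a)
    return 50.0 * p_t * eps / n_p


print(round(EPS_MAX, 1), K_EPS)  # 934.2 50.0
```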

Using the law of mass action kinetics we can derive a set of ordinary differential equations from these reactions.

Summary of Periplasmic Expression Model

\[ \color{#3e3f3f}{\frac{d}{dt}{m}_p=\omega_p(a)+\upsilon_p(c_p,a)+k_uc_p-(\lambda +d_m)m_p-k_brm_p} \]
\[ \color{#3e3f3f}{\frac{d}{dt}{c}_p=k_brm_p-\lambda c_p-k_uc_p-\upsilon_p(c_p,a)}\]
\[ \color{#3e3f3f}{\frac{d}{dt}{p}_c=\upsilon_p(c_p,a)-(k_{bt}t+\lambda)p_c}\]
\[\color{#3e3f3f}{\frac{d}{dt}{p}_t=k_{bt}tp_c-\tau_p(p_t,a)-\lambda p_t}\]
\[\color{#3e3f3f}{\frac{d}{dt}{p}_u=\tau_p(p_t,a)-(k_f+\lambda) p_u}\]
\[\color{#3e3f3f}{\frac{d}{dt}{p}_f=k_fp_u-(k_d+\lambda)p_f}\]

Parametrising the model

Table 9 shows the parameters we needed to find for our model, and the values we used

Table 9. Periplasmic Expression Model Parameters. † Doubled relative to cytoplasmic folding rate to reflect the effect of an oxidising environment on disulfide bond formation. * Set to 0 as degradation is dominated by the rate of dilution due to cell division for stable proteins [3]
Symbol	Meaning	Default value	Units	Source
\[\color{#3e3f3f}{w_p}\]	Maximal rate of transcription	<10^3	mRNAs/min	Proportional to induction level. Varied around realistic values as recommended by [1]
\[\color{#3e3f3f}{\theta_p}\]	transcriptional threshold of the recombinant protein	4.38	[molecs/cell]	[1]
\[\color{#3e3f3f}{n_p}\]	Length of recombinant protein	312/255	[aa/molecs]	Length of cytoplasmic proinsulin/winsulin gBlock
\[\color{#3e3f3f}{\gamma_{max}}\]	Maximal rate of translation	1260	[aa/ min molecs]	[1]
\[\color{#3e3f3f}{K_{\gamma}}\]	Translational elongation threshold	7	[molecs/ cell]	[1]
\[\color{#3e3f3f}{k_u}\]	Rate of unbinding of mRNA and ribosomes	1	[/min]	[1]
\[\color{#3e3f3f}{k_b}\]	Rate of binding of mRNA and ribosomes	1	[cell/ min molecs]	[1]
\[\color{#3e3f3f}{d_m}\]	degradation rate of mRNA	0.1	[/min]	[1]
\[\color{#3e3f3f}{t}\]	Number of translocons in a cell	500	[/cell]	[5]
\[\color{#3e3f3f}{k_{bt}}\]	Rate of protein binding to translocon	1	[cell /min molecs]	[1]
\[\color{#3e3f3f}{\epsilon_{max}}\]	Maximal translocation rate	934.2	[aa /min molecs]	[5]
\[\color{#3e3f3f}{K_{\epsilon}}\]	Translocational threshold	50	[molecs/ cell]	[5]
\[\color{#3e3f3f}{k_f}\]	Rate of protein folding	0.28	[/min]	†
\[\color{#3e3f3f}{k_d}\]	Rate of protein degradation	0		*

We also modelled our third expression system in Bacillus, including transcription, translation, secretion, and extracellular folding.

Bacillus Secretory Expression Model

We also developed a model of our secretory protein expression system in Bacillus subtilis. The model included six species (Table 10).

Table 10. Bacillus Secretory Expression Model Variables
Symbol	Meaning
\[\color{#3e3f3f}{m_p}\]	free mRNA of recombinant protein
\[\color{#3e3f3f}{c_p}\]	ribosome-bound mRNA of recombinant protein
\[\color{#3e3f3f}{p_c}\]	Unfolded recombinant protein in the cytoplasm
\[\color{#3e3f3f}{p_t}\]	Unfolded recombinant protein bound to transporter
\[\color{#3e3f3f}{p_u}\]	Unfolded recombinant protein in the medium
\[\color{#3e3f3f}{p_f}\]	Folded recombinant protein in the medium

A diagram showing the species we modelled and the notation used is shown below (figure 4).

Figure 4. Schematic of the reactions considered in the Bacillus secretory protein expression model. See Table 10 for the notation used for protein species and rates.

Structurally, this is the same process as the periplasmic expression system, so the equations have the same structure. However, the parameters are different, reflecting the different environment of Bacillus and the medium and its effect on the expression of recombinant protein.

Summary of Bacillus Secretory Expression Model

\[ \color{#3e3f3f}{\frac{d}{dt}{m}_p=\omega_p(a)+\upsilon_p(c_p,a)+k_uc_p-(\lambda +d_m)m_p-k_brm_p} \] \[ \color{#3e3f3f}{\frac{d}{dt}{c}_p=k_brm_p-\lambda c_p-k_uc_p-\upsilon_p(c_p,a)}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_c=\upsilon_p(c_p,a)-(k_{bt}t+\lambda)p_c}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_t=k_{bt}tp_c-\tau_p(p_t,a)-\lambda p_t}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_u=\tau_p(p_t,a)-(k_f+\lambda) p_u}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_f=k_fp_u-(k_d+\lambda)p_f}\]

We were unfortunately unable to parametrise the Bacillus model, so for our in silico experiments we focused on comparing cytoplasmic and periplasmic E. coli expression.

In Silico Experiments

Once we had modelled our different expression systems for recombinant insulin, we integrated them into the whole cell model developed in [1].

We then interrogated these models in MATLAB for insights into how to optimise the expression of insulin.

Comparing Cytoplasmic and Periplasmic Expression

(A)

(B)

Figure 5. The dynamics of (A) cytoplasmic and (B) periplasmic expression models in the first 25 minutes of recombinant protein expression.

First, we looked at the dynamics of the two models in the first 25 minutes of recombinant protein expression.

Cytoplasmic and periplasmic expression showed very different behaviour. The cytoplasmic model predicted a quick peak in unfolded protein in the cytoplasm, which is then depleted, and a large amount of protein aggregating in inclusion bodies (figure 5).

The periplasmic model predicted that unfolded protein in the cytoplasm would be translocated very quickly, which corresponds well to the fact that translocation is a fast event in E. coli [5]. The higher folding rate for insulin in the periplasm depletes the unfolded protein quickly, so the periplasmic model predicts a much higher yield of folded protein than the cytoplasmic model.

(A)

(B)

Figure 6. The dynamics of (A) cytoplasmic and (B) periplasmic expression models in the first 250 hours of recombinant protein expression.

After the initial dynamics, the model reaches a steady state for both cytoplasmic and periplasmic expression (figure 6).

The cytoplasmic model predicts that unfolded proteins will continue to aggregate in the cytoplasm to a larger degree than they fold, while the periplasmic model predicts that the amount of unfolded protein will become negligible. In addition, the yield of folded protein in the cytoplasm plateaus at \(\color{#3e3f3f}{7.7014\times10^4}\) while the yield of folded periplasmic protein plateaus at \(\color{#3e3f3f}{19.128\times10^4}\). Therefore the model predicts that periplasmic expression will yield roughly 2.5-fold more folded recombinant insulin than cytoplasmic expression.
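The fold difference between the two plateaus quoted above can be computed directly. The variable names in this small check are hypothetical; the two numbers are the plateau values given in the text.

```python
cytoplasmic_plateau = 7.7014e4   # folded protein yield, cytoplasmic model
periplasmic_plateau = 19.128e4   # folded protein yield, periplasmic model

fold_change = periplasmic_plateau / cytoplasmic_plateau
print(f"{fold_change:.2f}")  # 2.48
```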

Parameter Scanning to Optimise Expression

We then wanted to scan parameter values to see how we could optimise folded protein yield.

\(\color{#3e3f3f}{w_p}\), the maximal rate of transcription, is proportional to induction level. Varying it is equivalent to varying the concentration of IPTG used to induce expression. We therefore varied the parameter to see how it affected predicted protein expression in the model. We explored \(\color{#3e3f3f}{w_p\in [1,10^4]}\) as these are around the bounds of realistic values [1].

(A)

(B)

Figure 7. The effect of \(\color{#3e3f3f}{w_p}\) on folded protein yield in (A) cytoplasmic and (B) periplasmic expression models within realistic values.

We found that the yield of folded protein followed a logarithmic increase in relation to \(\color{#3e3f3f}{w_p}\) (figure 7). The model predicts that at a low degree of induction (\(\color{#3e3f3f}{w_p<200}\)) the yields of folded protein in the two systems are comparable; at higher values, however, the cytoplasmic yield is much lower. This agrees with the well-known problem in synthetic biology that inclusion body formation increases with induction rate and therefore decreases the yield of recombinant protein [7].
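The mechanics of the scan over the maximal transcription rate (\(w_p\) in Table 6) can be sketched with a closed-form steady state of the cytoplasmic equations. This is a sketch under strong assumptions: the host quantities \(a\), \(r\) and \(\lambda\) are frozen at hypothetical constants, whereas the team's scan ran inside the whole cell model from [1]. With the host frozen the yield is simply proportional to \(w_p\), so any saturation like that reported for Figure 7 would have to come from the host coupling, which is not reproduced here.

```python
def folded_steady_state(w_p, a=1000.0, r=10.0, lam=0.02,
                        theta_p=4.38, n_p=312, gamma_max=1260.0, K_gamma=7.0,
                        k_u=1.0, k_b=1.0, d_m=0.1, k_f=0.14, k_a=0.21, k_d=0.0):
    """Steady-state folded protein for the cytoplasmic model.

    ASSUMPTION: a, r and lam are hypothetical constants, not outputs of the
    whole cell model. The remaining defaults are the Table 6 values.
    """
    omega = w_p * a / (theta_p + a)                 # transcription rate
    v = gamma_max * a / (K_gamma + a) / n_p         # translation per complex
    c_over_m = k_b * r / (lam + k_u + v)            # from d(c_p)/dt = 0
    m_p = omega / (lam + d_m + lam * c_over_m)      # from d(m_p + c_p)/dt = 0
    c_p = c_over_m * m_p
    p_u = v * c_p / (k_f + k_a + lam)               # from d(p_u)/dt = 0
    return k_f * p_u / (k_d + lam)                  # from d(p_f)/dt = 0


def log_grid(lo, hi, n):
    """n logarithmically spaced values from lo to hi inclusive."""
    return [lo * (hi / lo) ** (i / (n - 1)) for i in range(n)]


# Scan w_p over [1, 10^4], the range explored in the text.
scan = [(w, folded_steady_state(w)) for w in log_grid(1.0, 1e4, 9)]
for w, y in scan:
    print(f"w_p = {w:10.1f}   folded = {y:12.1f}")
```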

We next asked whether there was any parameter in the cytoplasmic expression model, corresponding to some experimental step we could take, that we could change so that expression levels in the cytoplasm matched those in the periplasm.

\(\color{#3e3f3f}{k_f}\), the rate of protein folding in the cytoplasm, greatly affects protein yield, as our model supposes that insoluble aggregates of recombinant protein are caused by the association of protein that has not yet folded properly (matching experimental knowledge [8]). Since aggregated protein cannot fold in our model, aggregation sequesters protein and decreases the folded protein yield. We wanted to know if we could increase \(\color{#3e3f3f}{k_f}\) in the cytoplasmic model to such a degree that cytoplasmic yield matched periplasmic yield.

Figure 8. The effect of \(\color{#3e3f3f}{k_f}\) on folded protein yield in the cytoplasmic expression model within realistic values.

We found that the periplasmic protein yield could not be matched within realistic values of \(\color{#3e3f3f}{k_f}\); however, the protein yield did increase with \(\color{#3e3f3f}{k_f}\) (figure 8). Thus, in order to improve protein yield in the cytoplasm, we used a SHuffle strain of E. coli, which promotes disulfide bond formation in the cytoplasm, as our modelling predicted that faster folding would improve yield.
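The competition between folding and aggregation described above can be made concrete. In the cytoplasmic equations unfolded protein leaves the pool by folding (\(k_f\)), aggregation (\(k_a\)) or dilution (\(\lambda\)), so the share that folds is \(k_f/(k_f+k_a+\lambda)\); the periplasmic equations have no aggregation term. The growth rate used below is a hypothetical constant and the function name is illustrative.

```python
def folded_fraction(k_f, k_a=0.21, lam=0.02):
    """Share of unfolded protein that folds rather than aggregating or diluting.

    ASSUMPTION: lam (growth rate, per min) is a hypothetical constant; in the
    team's simulations it is set by the whole cell model.
    """
    return k_f / (k_f + k_a + lam)


# Cytoplasm: increase k_f from the Table 6 default of 0.14 per min.
for k_f in (0.14, 0.28, 0.56, 1.12):
    print(f"k_f = {k_f:4.2f}  folded fraction = {folded_fraction(k_f):.3f}")

# Periplasm: no aggregation term, default k_f = 0.28 per min (Table 9).
print(f"periplasm: {folded_fraction(0.28, k_a=0.0):.3f}")
```

Raising \(k_f\) increases the folded share but with diminishing returns while aggregation still competes, which is consistent with the reported finding that a realistic \(k_f\) alone does not close the gap to the periplasmic yield.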

References

[1] Weisse, A.Y., Oyarzún, D.A., Danos, V., Swain, P.S. (2015). Mechanistic links between cellular trade-offs, gene expression, and growth. Proc Natl Acad Sci U S A. 112(9):E1038-47
[2] Hoffmann, F., Posten, C., Rinas, U. (2001). Kinetic model of in vivo folding and inclusion body formation in recombinant Escherichia coli. Biotechnol Bioeng. 72(3):315-22
[3] Taniguchi, Y., Choi, P.J., Li, G.W., Chen, H., Babu, M., Hearn, J., Emili, A., Xie, X.S. (2010). Quantifying E. coli Proteome and Transcriptome with Single-Molecule Sensitivity in Single Cells. Science. 329(5991):533-538
[4] Natale, P., Brüser, T., Driessen, A.J.M. (2008). Sec- and Tat-mediated protein secretion across the bacterial cytoplasmic membrane - Distinct translocases and mechanisms. Biochimica et Biophysica Acta - Biomembranes. 1778(9):1735-1756
[5] de Keyzer, J., van der Does, C., Driessen, A.J.M. (2002). Kinetic Analysis of the Translocation of Fluorescent Precursor Proteins into Escherichia coli Membrane Vesicles. The Journal of Biological Chemistry. 277:46059-46065
[6] BioNumbers. Key Numbers for Cell Biologists. [online] Available at: http://bionumbers.hms.harvard.edu/Includes/KeyNumbersLinks.pdf
[7] Thomas, J.G., Baneyx, F. (1996). Protein Misfolding and Inclusion Body Formation in Recombinant Escherichia coli Cells Overexpressing Heat-shock Proteins. The Journal of Biological Chemistry. 271:11141-11147
[8] Upadhyay, A.K., Murmu, A., Singh, A., Panda, A.K. (2012). Kinetics of Inclusion Body Formation and its Correlation with the Characteristics of Protein Aggregates in Escherichia coli. PLoS One. 7(3):e33951

For our physiological modelling, we took a model of subcutaneous insulin absorption developed in [1] and used it to relate the free energy of insulin hexamer formation to insulin dynamics. We then used thermodynamic modelling to estimate the relative time of peak action and the duration of action of our novel insulin analogue (winsulin).

The authors of [1] developed a system of partial differential equations to describe the insulin infusion process. They modelled the change in three species:

Table 1. Variables in model of insulin infusion
Symbol	Meaning
\[\color{#3e3f3f}{c_d}\]	Insulin in dimeric form
\[\color{#3e3f3f}{c_h}\]	Insulin in hexamer form
\[\color{#3e3f3f}{c_b}\]	Insulin in bound form

They modelled the conversion between hexameric and dimeric insulin as follows

Insulin_Hexamer \(\color{#3e3f3f}{\rightleftharpoons}\) Insulin_Dimer

Where the forward rate was called \(\color{#3e3f3f}{P}\) and the reverse rate was \(\color{#3e3f3f}{PQ}\) where we can interpret \(\color{#3e3f3f}{P}\) as the production rate and \(\color{#3e3f3f}{Q}\) as the equilibrium constant.

The final model was as follows

\[\color{#3e3f3f}{\eqalignno{{\partial c_{d}(t,r)\over\partial t}=&\,P\left(c_{h}(t,r)-Qc_{d}(t,r)^{3}\right)-B_{d}c_{d}(t,r)\cr&+D\nabla^{2}c_{d}(t,r),\cr{\partial c_{h}(t,r)\over\partial t}=&\,-P\left(c_{h}(t,r)-Qc_{d}(t,r)^{3}\right)\cr&+D\nabla^{2}c_{h}(t,r)}}\]

Where \(\color{#3e3f3f}{P, Q, B_d, D}\) are parameters, and exogenous insulin flow is obtained by integrating the expression denoting the amount of insulin dimer entering the bloodstream: \[\color{#3e3f3f}{I_{ex}(t)=B_{d}\int\limits_{V_{sc}}c_{d}(t,r)dV.}\]

The parameters found for the different insulin analogues and their resultant insulin dynamics predicted by the model are shown in Table 2

Table 2. Parameter Values and resultant Dynamics for different insulin analogues. Values of parameters from [1] Table IV. Insulin dynamics taken from [1] Fig. 6
Insulin Analogue	\(\color{#3e3f3f}{Q}\)	\(\color{#3e3f3f}{D}\)	\(\color{#3e3f3f}{B_d}\)	Time of peak Insulin action (hours)	Duration of Insulin action (hours)
Lispro, Humalog, NovoRapid	\[\color{#3e3f3f}{4.75\cdot 10^{-4}}\]	\[\color{#3e3f3f}{3.36\cdot 10^{-4}}\]	\[\color{#3e3f3f}{2.36\cdot 10^{-2}}\]	\[\color{#3e3f3f}{0.25}\]	\[\color{#3e3f3f}{4}\]
Actrapid	\[\color{#3e3f3f}{1.9\cdot 10^{-3}}\]	\[\color{#3e3f3f}{8.4\cdot 10^{-5}}\]	\[\color{#3e3f3f}{1.18\cdot 10^{-2}}\]	\[\color{#3e3f3f}{0.75}\]	\[\color{#3e3f3f}{8}\]
Semilente	\[\color{#3e3f3f}{7.6\cdot 10^{-2}}\]	\[\color{#3e3f3f}{8.4\cdot 10^{-5}}\]	\[\color{#3e3f3f}{1.18\cdot 10^{-2}}\]	\[\color{#3e3f3f}{1.3}\]	\[\color{#3e3f3f}{11}\]
NPH	\[\color{#3e3f3f}{3.04}\]	\[\color{#3e3f3f}{8.4\cdot 10^{-5}}\]	\[\color{#3e3f3f}{1.18\cdot 10^{-2}}\]	\[\color{#3e3f3f}{4.5}\]	\[\color{#3e3f3f}{16}\]

Since the parameter \(\color{#3e3f3f}{Q}\) seemed to have the most impact on insulin dynamics, we tested if there was a relationship between the two (figure 1).

Figure 1. A semilog plot of the Time of peak insulin action (hours) and Duration of Insulin action (hours) after injection of 8IU insulin as a function of the parameter Q. Nonlinear regression analysis was performed using GraphPad PRISM 7. \(R^2\) values are shown on graph.
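The relationship plotted in this figure can be approximated from the four rows of Table 2. The sketch below swaps in an ordinary least-squares straight line in \(\log_{10} Q\), written in pure Python; it is not the GraphPad PRISM nonlinear regression the authors ran, whose model form is not stated here, so the fitted numbers should not be expected to match the \(R^2\) values on the graph.

```python
import math

# Table 2 values: Q, time of peak action (h), duration of action (h).
Q = [4.75e-4, 1.9e-3, 7.6e-2, 3.04]
PEAK_H = [0.25, 0.75, 1.3, 4.5]
DURATION_H = [4.0, 8.0, 11.0, 16.0]


def linear_fit(x, y):
    """Ordinary least squares y = intercept + slope * x; returns slope, intercept, R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


log_q = [math.log10(q) for q in Q]
peak_fit = linear_fit(log_q, PEAK_H)
duration_fit = linear_fit(log_q, DURATION_H)
print("peak:     slope=%.3f intercept=%.3f R2=%.3f" % peak_fit)
print("duration: slope=%.3f intercept=%.3f R2=%.3f" % duration_fit)
```

Both slopes are positive, matching the qualitative reading of the table: a larger \(Q\) goes with a later peak and a longer duration of action.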

Now, \(\color{#3e3f3f}{Q}\) in the model formed in [1] is the equilibrium constant of the reaction Insulin_Hexamer \(\color{#3e3f3f}{\rightleftharpoons}\) Insulin_Dimer. At equilibrium the model gives \(\color{#3e3f3f}{c_h=Qc_d^3}\), so a larger \(\color{#3e3f3f}{Q}\) favours the hexamer. \(\color{#3e3f3f}{Q}\) is therefore related to the Gibbs free energy of hexamer formation by the expression \(\color{#3e3f3f}{\Delta G^{o}=-RT\ln{Q}}\), where \(\color{#3e3f3f}{R=8.314472 J K^{-1} mol^{-1}}\) is the gas constant and \(T\) is the temperature in kelvins.

Therefore if we know the Gibbs free energy of insulin hexamer formation, we can use this to find some qualitative information on the dynamics of insulin absorption using the model from [1], and thus estimate the time of peak insulin action and the duration of insulin action from thermodynamic information.
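The expression \(\Delta G^{o}=-RT\ln{Q}\) can be evaluated for the \(Q\) values in Table 2. This is a rough sketch: the temperature of 310.15 K (body temperature) is an assumption, and because \(Q\) in the model from [1] is not dimensionless the absolute values depend on the choice of standard state, so only the differences between analogues are meaningful.

```python
import math

R = 8.314472      # gas constant [J K^-1 mol^-1], as given in the text
T_BODY = 310.15   # ASSUMPTION: body temperature in kelvins


def delta_g_standard(q, temperature=T_BODY):
    """Standard free energy [J/mol] from the equilibrium constant: -R T ln Q."""
    return -R * temperature * math.log(q)


# Q values from Table 2.
Q_TABLE = {"Lispro": 4.75e-4, "Actrapid": 1.9e-3, "Semilente": 7.6e-2, "NPH": 3.04}

reference = delta_g_standard(Q_TABLE["Actrapid"])
for name, q in Q_TABLE.items():
    shift_kj = (delta_g_standard(q) - reference) / 1000.0
    print(f"{name:10s} shift relative to Actrapid: {shift_kj:+.2f} kJ/mol")
```

A positive shift means a less stable hexamer than Actrapid under this reading, and a negative shift a more stable one.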

The MutaBind tool [2] computationally predicts the \(\color{#3e3f3f}{\Delta\Delta G}\) (change in binding affinity of protein-protein interactions) of point mutations relative to a known structure. We used the server to predict the effects of our variations to proinsulin's sequence on protein-protein interactions within the insulin hexamer, and thus their effects on the \(\color{#3e3f3f}{\Delta G}\) of hexamer formation. We used PDB file 3AIY and inputted the sequence variants with which we had designed our winsulin. Since the B chains are buried at the center of the insulin hexamer (see 3AIY), we analysed the effects of the mutations in this chain on hexamer stability (figure 2).

Figure 2. Alignment of winsulin and human insulin. Residues highlighted yellow were sequence variants inputted into the Mutabind program.

MutaBind predicted that all of our sequence changes would decrease the stability of the insulin hexamer for our analogue. Results are shown in Table 3.

Table 3. Mutabind results.
Mutation	\(\color{#3e3f3f}{\Delta\Delta G_{bind} (kcal mol^{-1})}\)
\[\color{#3e3f3f}{H10D}\]	\[\color{#3e3f3f}{0.51}\]
\[\color{#3e3f3f}{T27S}\]	\[\color{#3e3f3f}{0.57}\]
\[\color{#3e3f3f}{K295}\]	\[\color{#3e3f3f}{0.62}\]

Where \(\color{#3e3f3f}{\Delta\Delta G_{bind} (kcal mol^{-1})}\) is the predicted change in binding affinity induced by a mutation. A positive result corresponds to destabilising mutations, so the hexamer formation of winsulin will be less stable than that of human insulin.

This corresponds to a decrease in \(\color{#3e3f3f}{Q}\), meaning we predict our winsulin analogue will be relatively fast acting, compared to regular human insulin. Human insulin activity peaks in 2-4 hours and lasts for 6-8 hours [3] so this would make winsulin a rapid-acting analogue.

Although this is a crude estimate, it does give us some qualitative information on the action profile of our novel winsulin.
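To see roughly what a destabilisation of this size means for \(Q\), the relation \(\Delta G^{o}=-RT\ln{Q}\) can be inverted. The sketch below makes assumptions the text does not: it treats each MutaBind \(\Delta\Delta G_{bind}\) value as a direct shift in the free energy of hexamer formation and uses body temperature (310.15 K). It is an order-of-magnitude illustration of the qualitative argument, not a prediction of winsulin's \(Q\).

```python
import math

R = 8.314472        # gas constant [J K^-1 mol^-1]
T_BODY = 310.15     # ASSUMPTION: body temperature in kelvins
KCAL_TO_J = 4184.0  # joules per kilocalorie


def q_ratio(ddg_kcal_per_mol, temperature=T_BODY):
    """Factor by which Q changes for a free-energy shift ddG: exp(-ddG / RT)."""
    return math.exp(-ddg_kcal_per_mol * KCAL_TO_J / (R * temperature))


# The three predicted values from Table 3 [kcal/mol].
DDG_VALUES = [0.51, 0.57, 0.62]
for ddg in DDG_VALUES:
    print(f"ddG = {ddg:.2f} kcal/mol  ->  Q multiplied by {q_ratio(ddg):.2f}")
```

Every positive \(\Delta\Delta G\) gives a factor below one, which is the direction of change in \(Q\) that the text associates with faster-acting analogues.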

References

[1] Tarin, C., Teufel, E., Pico, J., Bondia, J., Pfleiderer, H.J. (2005). Comprehensive pharmacokinetic model of insulin Glargine and other insulin formulations. IEEE Transactions on Biomedical Engineering. 52(12):1994-2005
[2] Li, M., Simonetti, F.L., Goncearenco, A., Panchenko, A.R. (2016). MutaBind estimates and interprets the effects of sequence variants on protein–protein interactions. Nucleic Acids Research. 44(Web Server issue):W494–W501. http://doi.org/10.1093/nar/gkw374
[3] Diabetes Education Online. (2017). Types of Insulin. [online] Available at: https://dtc.ucsf.edu/types-of-diabetes/type2/treatment-of-type-2-diabetes/medications-and-therapies/type-2-insulin-rx/types-of-insulin/



Difference between revisions of "Team:Sydney Australia/Model"
