Difference between revisions of "Team:Sydney Australia/Model"

<body>
<div class="container-fluid">
<div class="row" style="margin-top:68px;">
       <img src="https://static.igem.org/mediawiki/2017/a/a1/T--Sydney_Australia--modelling_banner.png" class="crispy" width = "80%" style="padding:40px;">
</div>

Revision as of 08:36, 29 October 2017



The aim of modelling in synthetic biology is to simulate the behaviour of your project to gain insight into how to best improve it. For our project, we saw three levels at which modelling could aid in the pursuit of its central aims.


As has been explored on our integrated human practices and applied design pages, the problem of insulin accessibility is complex and multi-faceted. As such, we decided it was not enough to consider our project as a problem whose solution could be found solely in a test tube. Distilled down, our project can be viewed as three sequential aims which we believe together can be used to address insulin accessibility.

Our modelling efforts were split into three branches, which reflected these major aspects of our project:


Difficulties in optimising the production of recombinant insulin are a key issue for its accessibility.


In silico experiments simulating how best to optimise expression led to theoretical insights which informed the direction of our efforts.

It is imperative to test the feasibility of our recombinant insulin as a therapeutic for diabetics.


Modelling the effects of changes to insulin's biochemical makeup on its therapeutic effects supplemented our wet-lab efforts to characterise our molecule.

In addition, the project would be moot without a consideration of the insulin market as a whole.


Modelling helped us to gain insight into the global insulin market, which informed our approach towards entrepreneurship.

Experimental

Physiological

Economic

We began our experimental modelling by using a mechanistic model of an E. coli cell developed in [1]. We integrated models of our expression system into this whole cell model so that our predictions would more accurately reflect reality. Recombinant protein expression occurs within a complex cellular environment with finite resources. A model which ignores the activities of the host cell would miss important host-circuit interactions, and ignoring the finite resources of the cell may skew our prediction of the yield of our expression systems. See below for details on the whole cell model we used.

A model of the E. coli cell including nutrient import, its conversion to cellular energy, and the transcription and translation of four categories of proteins was developed in [1]. The model accounts for the finite levels of cellular energy, ribosomes and cell mass.

Tables 1 and 2 detail the list of reactions that were considered in the model.


Table 1: List of reactions relating to the expression and degradation of four protein species considered in whole cell model developed in [1]
Protein Species (symbol) Dilution of protein Transcription Dilution/degradation of mRNA Ribosome binding Dilution of ribosome-bound mRNA Translation
Ribosomes \(\color{#3e3f3f}{(r)}\) \[ \color{#3e3f3f}{ r\xrightarrow{\lambda}\varnothing }\] \[ \color{#3e3f3f}{ \varnothing\xrightarrow{\omega_r}m_r}\] \[ \color{#3e3f3f}{ m_r\xrightarrow{\lambda+d_m}\varnothing}\] \[ \color{#3e3f3f}{ r+m_r\xrightarrow{k_b, k_u}c_r}\] \[ \color{#3e3f3f}{ c_r\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ n_ra+c_r\xrightarrow{\upsilon_r}r+m_r+r}\]
Transporter enzyme \(\color{#3e3f3f}{(e_t)}\) \[ \color{#3e3f3f}{ e_t\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{\varnothing\xrightarrow{\omega_t}m_t}\] \[ \color{#3e3f3f}{ m_t\xrightarrow{\lambda+d_m}\varnothing}\] \[ \color{#3e3f3f}{ r+m_t\xrightarrow{k_b, k_u}c_t}\] \[ \color{#3e3f3f}{ c_t\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ n_ta+c_t\xrightarrow{\upsilon_t}r+m_t+e_t}\]
Metabolic enzyme \(\color{#3e3f3f}{(e_m)}\) \[ \color{#3e3f3f}{ e_m\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ \varnothing\xrightarrow{\omega_m}m_m}\] \[ \color{#3e3f3f}{m_m\xrightarrow{\lambda+d_m}\varnothing}\] \[ \color{#3e3f3f}{ r+m_m\xrightarrow{k_b, k_u}c_m}\] \[ \color{#3e3f3f}{ c_m\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ n_ma+c_m\xrightarrow{\upsilon_m}r+m_m+e_m}\]
Growth-independent/ housekeeping proteins \(\color{#3e3f3f}{(q)}\) \[ \color{#3e3f3f}{ q\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ \varnothing\xrightarrow{\omega_q}m_q}\] \[ \color{#3e3f3f}{ m_q\xrightarrow{\lambda+d_m}\varnothing}\] \[ \color{#3e3f3f}{ r+m_q\xrightarrow{k_b, k_u}c_q}\] \[ \color{#3e3f3f}{ c_q\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ n_qa+c_q\xrightarrow{\upsilon_q}r+m_q+q}\]

Table 2: Nutrient metabolism and cellular energy reactions in whole cell model developed in [1]
Species (symbol) Dilution Nutrient Import Metabolism
Internal Nutrient \(\color{#3e3f3f}{(s_i)}\) \[ \color{#3e3f3f}{ s_i\xrightarrow{\lambda}\varnothing}\] \[ \color{#3e3f3f}{ s\xrightarrow{\upsilon_{imp}}s_i}\] \[ \color{#3e3f3f}{ s_i\xrightarrow{\upsilon_{cat}}n_sa}\]
ATP \(\color{#3e3f3f}{(a)}\) \[ \color{#3e3f3f}{ a\xrightarrow{\lambda}\varnothing}\] - -


See table 3 for notation relating to rates and parameters in [1].
Table 3: Notation for rates in Weisse et al. model
Symbol Meaning
\[ \color{#3e3f3f}{ \upsilon_{imp}}\] Rate of nutrient import
\[\color{#3e3f3f}{\upsilon_{cat}}\] Rate of nutrient metabolism
\[\color{#3e3f3f}{\lambda}\] Growth Rate
\[\color{#3e3f3f}{n_x\textrm{ with } x\in\{r,t,m,q\}}\] Length of proteins of the different species
\[\color{#3e3f3f}{\upsilon_x \textrm{ with } x\in\{r,t,m,q\}}\] Rate of translation of each protein species
\[\color{#3e3f3f}{k_b}\] mRNA ribosome binding rate
\[\color{#3e3f3f}{k_u}\] mRNA ribosome unbinding rate
\[\color{#3e3f3f}{\omega_x \textrm{ with } x\in\{r,t,m,q\}}\] Transcription rates of the four species of proteins
\[\color{#3e3f3f}{d_m}\] mRNA degradation rate

A system of 14 differential equations was derived from these reactions. \[ \color{#3e3f3f}{ \frac{d}{dt}s_i=\upsilon_{imp} (e_t,s)-\upsilon_{cat}(e_m,s_i)-\lambda s_i}\] \[ \color{#3e3f3f}{ \frac{d}{dt}a=n_s\cdot\upsilon_{cat}(e_m,s_i)-\sum_{x\in\{r,t,m,q\}}n_x\upsilon_x(c_x,a)-\lambda a}\] \[ \color{#3e3f3f}{ \frac{d}{dt}r=\upsilon_r(c_r,a)-\lambda r+\sum_{x\in\{r,t,m,q\}} (\upsilon_x(c_x,a)-k_brm_x+k_uc_x)}\] \[ \color{#3e3f3f}{ \frac{d}{dt}e_t=\upsilon_t(c_t,a)-\lambda e_t}\] \[ \color{#3e3f3f}{ \frac{d}{dt}e_m=\upsilon_m(c_m,a)-\lambda e_m}\] \[ \color{#3e3f3f}{ \frac{d}{dt}q=\upsilon_q(c_q,a)-\lambda q}\] \[ \color{#3e3f3f}{ \frac{d}{dt}m_x=\omega_x(a)-(\lambda+d_m)m_x+\upsilon_x(c_x,a)-k_brm_x+k_uc_x \qquad \textrm{for } x\in\{r,t,m,q\}}\] \[ \color{#3e3f3f}{\frac{d}{dt} c_x=-\lambda c_x+k_brm_x-k_uc_x-\upsilon_x(c_x,a) \qquad \textrm{for } x\in\{r,t,m,q\}}\]

We then developed a model to reflect the production of our recombinant insulin. First we modelled the production of recombinant insulin in an E. coli cytoplasm. We included transcription, translation, folding and aggregation into inclusion bodies, as well as dilution and degradation. See below for details on our model of recombinant insulin expression in the cytoplasm.

We modelled the rate of change of five biochemical species in the cell (Table 4).


Table 4. Cytoplasmic Expression Model Variables
Symbol Meaning
\[\color{#3e3f3f}{m_p}\] free mRNA of recombinant protein
\[\color{#3e3f3f}{c_p}\] ribosome-bound mRNA of recombinant protein
\[\color{#3e3f3f}{p_u}\] Unfolded recombinant protein
\[\color{#3e3f3f}{p_f}\] Folded recombinant protein
\[\color{#3e3f3f}{p_a}\] recombinant protein aggregated in inclusion bodies


A diagram showing the species we modelled and notation used is shown below:


Figure 1. Schematic of the processes considered in the cytoplasmic protein expression model. See table 4 and 5 for notation for protein species and rates

Table 5 details the reactions considered in the model


Table 5. List of reactions considered in cytoplasmic protein expression model
Process Reaction Rate
Transcription \[\color{#3e3f3f}{\varnothing\rightarrow m_p}\] \[\color{#3e3f3f}{\omega_p(a)}\]
Dilution and degradation of mRNA \[\color{#3e3f3f}{m_p\rightarrow\varnothing}\] \[\color{#3e3f3f}{\lambda+d_m}\]
ribosome binding \[\color{#3e3f3f}{r+m_p\rightleftharpoons c_p}\] \[\color{#3e3f3f}{\textrm{forward: } k_b \textrm{, reverse: } k_u}\]
Dilution of ribosome-bound mRNA \[\color{#3e3f3f}{c_p\rightarrow\varnothing}\] \[\color{#3e3f3f}{\lambda}\]
Translation \[\color{#3e3f3f}{n_pa+c_p\rightarrow m_p+p_u+r}\] \[\color{#3e3f3f}{\upsilon_p(c_p,a)}\]
Aggregation \[\color{#3e3f3f}{p_u\rightarrow p_a}\] \[\color{#3e3f3f}{k_a}\]
Folding \[\color{#3e3f3f}{p_u\rightarrow p_f}\] \[\color{#3e3f3f}{k_f}\]
Dilution and degradation of folded protein \[\color{#3e3f3f}{p_f\rightarrow \varnothing} \] \[\color{#3e3f3f}{\lambda+k_d}\]

Here \(\omega_p(a)\), the rate of transcription, is energy dependent.

We used the form of the transcription rate from Weisse et al. [1]. That is,

\[\color{#3e3f3f}{\omega_p(a)=w_p \frac{a}{\theta_p+a} } \]

Where \(w_p\) is the maximal rate of transcription, dependent on the speed of transcriptional elongation, as well as the gene length, induction and copy number. \(a\) is the energy in the cell such as ATP (transcription is an energy dependent process), and \(\theta_p\) is the transcriptional threshold of the recombinant protein.



In addition, we used the form in Weisse et al. for the translation rate term

\[\color{#3e3f3f}{\upsilon_p(c_p,a)=c_p \frac{\gamma(a)}{n_p} } \]

Where \(n_p\) is the length of the recombinant protein, and \(\gamma(a)\) is an expression for the rate of translational elongation:

\[\color{#3e3f3f}{\gamma(a)=\frac{\gamma_{max} a}{K_{\gamma} + a} } \]

Where \(\gamma_{max}\) is the maximal rate of translation, \(K_{\gamma}\) is the translational elongation threshold, and \(a\) is the energy in the cell.
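The rate laws above can be sketched as plain functions. This is only an illustrative Python sketch, not the MATLAB code used for our results; the function names are hypothetical, the default values are those listed in Table 6, and the value of \(w_p\) in the usage loop is an arbitrary assumption.

```python
# Energy-dependent rate laws from the text, written as plain functions.
# Function names are hypothetical; defaults follow Table 6.

def transcription_rate(a, w_p, theta_p=4.38):
    """omega_p(a) = w_p * a / (theta_p + a)."""
    return w_p * a / (theta_p + a)


def elongation_rate(a, gamma_max=1260.0, K_gamma=7.0):
    """gamma(a) = gamma_max * a / (K_gamma + a)."""
    return gamma_max * a / (K_gamma + a)


def translation_rate(c_p, a, n_p=312.0, gamma_max=1260.0, K_gamma=7.0):
    """upsilon_p(c_p, a) = c_p * gamma(a) / n_p."""
    return c_p * elongation_rate(a, gamma_max, K_gamma) / n_p


if __name__ == "__main__":
    # w_p = 100 is an assumed induction level, used only for illustration.
    for a in (1.0, 7.0, 100.0, 1e4):
        print(a, transcription_rate(a, w_p=100.0), translation_rate(1.0, a))
```

Both rates saturate as the energy level \(a\) grows: each is at half of its maximum when \(a\) equals the corresponding threshold.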

For the model of inclusion body aggregation, we assumed first-order deposition of unfolded protein monomers, with a rate dependent on the concentration of unfolded protein, as in Hoffmann et al. (2001) [2].


Using the law of mass action kinetics we can derive a set of ordinary differential equations from these reactions.

Summary of Cytoplasmic Expression Model

\[ \color{#3e3f3f}{\frac{d}{dt}{m}_p=\omega_p(a)+\upsilon_p(c_p,a)+k_uc_p-(\lambda +d_m)m_p-k_brm_p} \] \[ \color{#3e3f3f}{\frac{d}{dt}{c}_p=k_brm_p-\lambda c_p-k_uc_p-\upsilon_p(c_p,a)}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_u=\upsilon_p(c_p,a)-(k_f+k_a+\lambda)p_u}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_a=k_ap_u-\lambda p_a}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_f=k_fp_u-(k_d+\lambda) p_f}\]

Parametrising the model

Table 6 shows the parameters we needed to find for our model, and the values we used


Table 6. Cytoplasmic Expression Model Parameters. * Set to 0 as degradation is dominated by the rate of dilution due to cell division for stable proteins [3]
Symbol Meaning Default value Units Source
\[\color{#3e3f3f}{w_p}\] Maximal rate of transcription <10^3 mRNAs/min Proportional to induction level. Varied around realistic values as recommended by [1]
\[\color{#3e3f3f}{\theta_p}\] transcriptional threshold of the recombinant protein 4.38 [molecs/cell] [1]
\[\color{#3e3f3f}{n_p}\] Length of recombinant protein 312/255 [aa/molecs] Length of cytoplasmic proinsulin/winsulin gBlock
\[\color{#3e3f3f}{\gamma_{max}}\] Maximal rate of translation 1260 [aa/ min molecs] [1]
\[\color{#3e3f3f}{K_{\gamma}}\] Translational elongation threshold 7 [molecs/ cell] [1]
\[\color{#3e3f3f}{k_u}\] Rate of unbinding of mRNA and ribosomes 1 [/min] [1]
\[\color{#3e3f3f}{k_b}\] Rate of binding of mRNA and ribosomes 1 [cell/ min molecs] [1]
\[\color{#3e3f3f}{d_m}\] degradation rate of mRNA 0.1 [/min] [1]
\[\color{#3e3f3f}{k_f}\] Rate of protein folding 0.14 [/min] adapted to fit units from [2]
\[\color{#3e3f3f}{k_a}\] Rate of protein aggregation 0.21 [/min] adapted to fit units from [2]
\[\color{#3e3f3f}{k_d}\] Rate of protein degradation 0 *
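To illustrate how the cytoplasmic equations behave with the Table 6 values, the following is a minimal Python sketch rather than the MATLAB implementation used for our results. It makes a strong simplifying assumption: the host variables \(a\), \(r\) and \(\lambda\), which are dynamic in the whole cell model, are frozen at hypothetical constants, and \(w_p\) is set to an arbitrary value inside its allowed range. The numbers it prints are therefore not comparable to our figures.

```python
# Cytoplasmic expression model with the host state frozen (an assumption).
PARAMS = {
    "w_p": 100.0,  # hypothetical induction level (Table 6 only bounds it)
    "theta_p": 4.38, "n_p": 312.0, "gamma_max": 1260.0, "K_gamma": 7.0,
    "k_u": 1.0, "k_b": 1.0, "d_m": 0.1, "k_f": 0.14, "k_a": 0.21, "k_d": 0.0,
    # Hypothetical host state; in the whole cell model these are variables.
    "a": 1000.0,   # energy, molecs/cell
    "r": 10.0,     # free ribosomes, molecs/cell
    "lam": 0.02,   # growth rate, /min
}


def cytoplasmic_rhs(y, p):
    """Right-hand side for the state [m_p, c_p, p_u, p_a, p_f]."""
    m_p, c_p, p_u, p_a, p_f = y
    a, lam = p["a"], p["lam"]
    omega = p["w_p"] * a / (p["theta_p"] + a)               # transcription
    ups = c_p * p["gamma_max"] * a / (p["K_gamma"] + a) / p["n_p"]  # translation
    bind = p["k_b"] * p["r"] * m_p
    unbind = p["k_u"] * c_p
    return [
        omega + ups + unbind - (lam + p["d_m"]) * m_p - bind,
        bind - lam * c_p - unbind - ups,
        ups - (p["k_f"] + p["k_a"] + lam) * p_u,
        p["k_a"] * p_u - lam * p_a,
        p["k_f"] * p_u - (p["k_d"] + lam) * p_f,
    ]


def rk4_integrate(rhs, y0, p, dt, n_steps):
    """Classical fixed-step fourth-order Runge-Kutta integration."""
    y = list(y0)
    for _ in range(n_steps):
        k1 = rhs(y, p)
        k2 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], p)
        k3 = rhs([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], p)
        k4 = rhs([yi + dt * ki for yi, ki in zip(y, k3)], p)
        y = [yi + dt / 6.0 * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
             for yi, s1, s2, s3, s4 in zip(y, k1, k2, k3, k4)]
    return y


def simulate_cytoplasmic(p=PARAMS, t_end=2000.0, dt=0.05):
    """Integrate from an empty cell (all species zero) up to t_end minutes."""
    return rk4_integrate(cytoplasmic_rhs, [0.0] * 5, p, dt, int(round(t_end / dt)))


if __name__ == "__main__":
    names = ["m_p", "c_p", "p_u", "p_a", "p_f"]
    for name, value in zip(names, simulate_cytoplasmic()):
        print(name, value)
```

With \(k_d=0\), folded and aggregated protein are fed from the same pool of unfolded protein and diluted at the same rate, so in this sketch their ratio stays at \(k_f/k_a\) regardless of the assumed host state.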


Next, we modelled the steps in our periplasmic expression system, including transcription, translation, translocation, and folding in the periplasm.

We looked at periplasmic expression of our recombinant protein in E. coli. We modelled the rate of change of 6 species (Table 7)


Table 7. Periplasmic Expression Model Variables
Symbol Meaning
\[\color{#3e3f3f}{m_p}\] free mRNA of recombinant protein
\[\color{#3e3f3f}{c_p}\] ribosome-bound mRNA of recombinant protein
\[\color{#3e3f3f}{p_c}\] Unfolded recombinant protein in the cytoplasm
\[\color{#3e3f3f}{p_t}\] Unfolded recombinant protein bound to transporter
\[\color{#3e3f3f}{p_u}\] Unfolded recombinant protein in the periplasm
\[\color{#3e3f3f}{p_f}\] Folded recombinant protein in the periplasm


A diagram showing the species we modelled and notation used is shown below (figure 2)


Figure 2. Schematic of the reactions considered in the periplasmic protein expression model. See table 7 and 8 for notation for protein species and rates

The reactions in table 8 were considered


Table 8. List of reactions considered in periplasmic protein expression model
Process Reaction Rate
Transcription \[\color{#3e3f3f}{\varnothing\rightarrow m_p}\] \[\color{#3e3f3f}{\omega_p(a)}\]
Dilution and degradation of mRNA \[\color{#3e3f3f}{m_p\rightarrow\varnothing}\] \[\color{#3e3f3f}{\lambda+d_m}\]
ribosome binding \[\color{#3e3f3f}{r+m_p\rightleftharpoons c_p}\] \[\color{#3e3f3f}{\textrm{forward: } k_b \textrm{, reverse: } k_u}\]
Dilution of ribosome-bound mRNA \[\color{#3e3f3f}{c_p\rightarrow\varnothing}\] \[\color{#3e3f3f}{\lambda}\]
Translation \[\color{#3e3f3f}{n_pa+c_p\rightarrow m_p+p_c+r}\] \[\color{#3e3f3f}{\upsilon_p(c_p,a)}\]
Translocator binding \[\color{#3e3f3f}{p_c+t\rightarrow p_t}\] where \(t\) refers to the number of translocons \[\color{#3e3f3f}{k_{bt}}\]
Translocation \[\color{#3e3f3f}{p_t\rightarrow p_u}\] \[\color{#3e3f3f}{\tau_p(p_t,a)}\]
Folding \[\color{#3e3f3f}{p_u\rightarrow p_f}\] \[\color{#3e3f3f}{k_f}\]
Dilution and degradation of folded protein \[\color{#3e3f3f}{p_f\rightarrow \varnothing} \] \[\color{#3e3f3f}{\lambda+k_d}\]


Here, \(\omega_p(a)\) and \(\upsilon_p(c_p,a)\) are as in the cytoplasmic model. The rate of translocation is given by the term \(\tau_p(p_t,a)\). Protein translocation to the periplasm occurs via an ATP-dependent motor protein, SecA [4]. Post-translational translocation uses ATP as a stepwise source of energy to drive the protein through the membrane. It follows the mechanism illustrated in Figure 3 [4].


Figure 3. A simplified mechanism of post-translational translocation. The SecA-SecYEG-protein \((p_c)\) complex binds ATP in a reversible reaction. The SecA-bound ATP is hydrolysed, causing SecA to release itself from the protein-SecYEG complex. SecA re-binds the protein-SecYEG complex, displacing the polypeptide through the channel by ~25 amino acid residues. Binding of ATP to SecA then drives the peptide through another ~25 residues. The steps are repeated \(\frac{n_p}{50}\) times, where \(n_p\) is the length in amino acids of the protein. The polypeptide is then released into the periplasm.


Following the logic used to derive the translation rate in [1], we derive the net rate of translocating a protein \(p\) by defining \(K_p:=\frac{k_1k_2}{k_{-1}+k_2}\). This leads to

\[\color{#3e3f3f}{\tau_p(p_t,a)=p_t\Big(\frac{n_p}{50}\Big(\frac{1}{K_pa}+\frac{1}{k_2} \Big)+\frac{1}{k_t}\Big)^{-1}}\]

If we assume the final termination step is fast, so that \(\frac{1}{k_t}\ll \frac{n_p}{50}\Big(\frac{1}{K_pa}+\frac{1}{k_2} \Big) \), this is approximately equal to

\[\color{#3e3f3f}{\tau_p(p_t,a)\approx 50p_t \frac{\epsilon(a)}{n_p}\qquad \epsilon(a):=\frac{\epsilon_{max}a}{K_{\epsilon}+a} }\]

Where \(\color{#3e3f3f}{\epsilon_{max}}\) is the maximal translocation rate, \(\color{#3e3f3f}{K_{\epsilon}}\) is the threshold, and \(\color{#3e3f3f}{n_p}\) is the length of the protein in amino acids
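The approximation can be checked numerically. In the sketch below, dropping the \(1/k_t\) term and rearranging the full expression gives the saturating form with \(\epsilon_{max}=k_2\) and \(K_{\epsilon}=k_2/K_p\); this identification follows from the algebra and is not stated explicitly above. The function names and the rate constants in the usage example are hypothetical values chosen only to exercise the formulas.

```python
# Full translocation rate versus its fast-termination approximation.
# All numerical values used below are hypothetical.

def tau_full(p_t, a, n_p, K_p, k_2, k_t):
    """p_t * ( (n_p/50) * (1/(K_p*a) + 1/k_2) + 1/k_t )^-1."""
    cycles = n_p / 50.0  # number of ~50-residue translocation cycles
    return p_t / (cycles * (1.0 / (K_p * a) + 1.0 / k_2) + 1.0 / k_t)


def tau_approx(p_t, a, n_p, K_p, k_2):
    """50 * p_t * epsilon(a) / n_p with epsilon_max = k_2, K_eps = k_2 / K_p."""
    eps_max = k_2
    K_eps = k_2 / K_p
    epsilon = eps_max * a / (K_eps + a)
    return 50.0 * p_t * epsilon / n_p


if __name__ == "__main__":
    args = dict(p_t=10.0, a=50.0, n_p=312.0, K_p=1.0, k_2=100.0)
    for k_t in (1.0, 1e2, 1e4, 1e6):
        print(k_t, tau_full(k_t=k_t, **args), tau_approx(**args))
```

As \(k_t\) grows the full rate approaches the approximate one from below, since a slow termination step can only lengthen the time taken per protein.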


Parametrising Translocation

To find the parameters for translocation, \(\color{#3e3f3f}{\epsilon_{max}}\) and \(\color{#3e3f3f}{K_{\epsilon}}\), we used kinetic parameters determined in [5]. They measured translocation of the 346 aa protein proOmpA and found that the apparent \(K_m\) of SecA was 50 nM and the maximal translocation rate was 2.7 proOmpA/site/min. A concentration of \(\color{#3e3f3f}{1nM}\) in E. coli is \(\color{#3e3f3f}{\approx}\) 1 molecule/cell [6], so \(\color{#3e3f3f}{K_{\epsilon}=50}\) molecs/cell. Using the length of proOmpA, the maximal rate converts to 2.7 proOmpA/site/min \(\cdot\) 346 aa/proOmpA = 934.2 aa/molec/min.
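The unit conversion above is simple arithmetic, reproduced here as a short Python sketch. The constant names are hypothetical; the numerical values are the ones quoted from [5] and [6].

```python
# Converting the measured proOmpA kinetics into model units.
PROOMPA_LENGTH_AA = 346        # length of proOmpA, amino acids
RATE_PER_SITE = 2.7            # proOmpA translocated per site per minute [5]
KM_NANOMOLAR = 50.0            # apparent Km of SecA, nM [5]
MOLECULES_PER_CELL_PER_NM = 1.0  # ~1 molecule/cell at 1 nM in E. coli [6]

# aa/molec/min: proteins per minute times amino acids per protein.
eps_max = RATE_PER_SITE * PROOMPA_LENGTH_AA
# molecs/cell: nM times molecules per cell per nM.
K_eps = KM_NANOMOLAR * MOLECULES_PER_CELL_PER_NM

if __name__ == "__main__":
    print(f"eps_max = {eps_max:.1f} aa/molec/min")
    print(f"K_eps   = {K_eps:.0f} molecs/cell")
```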

Using the law of mass action kinetics we can derive a set of ordinary differential equations from these reactions.

Summary of Periplasmic Expression Model

\[ \color{#3e3f3f}{\frac{d}{dt}{m}_p=\omega_p(a)+\upsilon_p(c_p,a)+k_uc_p-(\lambda +d_m)m_p-k_brm_p} \] \[ \color{#3e3f3f}{\frac{d}{dt}{c}_p=k_brm_p-\lambda c_p-k_uc_p-\upsilon_p(c_p,a)}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_c=\upsilon_p(c_p,a)-(k_{bt}t+\lambda)p_c}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_t=k_{bt}tp_c-\tau_p(p_t,a)-\lambda p_t}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_u=\tau_p(p_t,a)-(k_f+\lambda) p_u}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_f=k_fp_u-(k_d+\lambda)p_f}\]

Parametrising the model

Table 9 shows the parameters we needed to find for our model, and the values we used


Table 9. Periplasmic Expression Model Parameters. † Doubled relative to cytoplasmic folding rate to reflect the effect of an oxidising environment on disulfide bond formation. * Set to 0 as degradation is dominated by the rate of dilution due to cell division for stable proteins [3]
Symbol Meaning Default value Units Source
\[\color{#3e3f3f}{w_p}\] Maximal rate of transcription <10^3 mRNAs/min Proportional to induction level. Varied around realistic values as recommended by [1]
\[\color{#3e3f3f}{\theta_p}\] transcriptional threshold of the recombinant protein 4.38 [molecs/cell] [1]
\[\color{#3e3f3f}{n_p}\] Length of recombinant protein 312/255 [aa/molecs] Length of cytoplasmic proinsulin/winsulin gBlock
\[\color{#3e3f3f}{\gamma_{max}}\] Maximal rate of translation 1260 [aa/ min molecs] [1]
\[\color{#3e3f3f}{K_{\gamma}}\] Translational elongation threshold 7 [molecs/ cell] [1]
\[\color{#3e3f3f}{k_u}\] Rate of unbinding of mRNA and ribosomes 1 [/min] [1]
\[\color{#3e3f3f}{k_b}\] Rate of binding of mRNA and ribosomes 1 [cell/ min molecs] [1]
\[\color{#3e3f3f}{d_m}\] degradation rate of mRNA 0.1 [/min] [1]
\[\color{#3e3f3f}{t}\] Number of translocons in a cell 500 [/cell] [5]
\[\color{#3e3f3f}{k_{bt}}\] Rate of protein binding to translocon 1 [cell /min molecs] [1]
\[\color{#3e3f3f}{\epsilon_{max}}\] Maximal translocation rate 934.2 [aa /min molecs] [5]
\[\color{#3e3f3f}{K_{\epsilon}}\] Translocational threshold 50 [molecs/ cell] [5]
\[\color{#3e3f3f}{k_f}\] Rate of protein folding 0.28 [/min] †
\[\color{#3e3f3f}{k_d}\] Rate of protein degradation 0 *
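If the host variables are held constant, the periplasmic equations are linear in the six species, so their steady state can be written in closed form. The Python sketch below does this with the Table 9 values. It is a simplification and not the MATLAB implementation used for our results: \(a\), \(r\) and \(\lambda\) are frozen at hypothetical values, \(w_p\) is an arbitrary choice within its range, and the function names are illustrative.

```python
# Closed-form steady state of the periplasmic model with frozen host state.
PARAMS = {
    "w_p": 100.0,  # hypothetical induction level (Table 9 only bounds it)
    "theta_p": 4.38, "n_p": 312.0, "gamma_max": 1260.0, "K_gamma": 7.0,
    "k_u": 1.0, "k_b": 1.0, "d_m": 0.1,
    "t": 500.0, "k_bt": 1.0, "eps_max": 934.2, "K_eps": 50.0,
    "k_f": 0.28, "k_d": 0.0,
    # Hypothetical host state; in the whole cell model these are variables.
    "a": 1000.0, "r": 10.0, "lam": 0.02,
}


def rates(p):
    """Return omega_p(a) and the per-molecule translation/translocation rates."""
    a = p["a"]
    omega = p["w_p"] * a / (p["theta_p"] + a)
    ups = p["gamma_max"] * a / (p["K_gamma"] + a) / p["n_p"]      # per c_p
    tau = 50.0 * p["eps_max"] * a / (p["K_eps"] + a) / p["n_p"]   # per p_t
    return omega, ups, tau


def periplasmic_rhs(y, p):
    """Right-hand side for the state [m_p, c_p, p_c, p_t, p_u, p_f]."""
    m_p, c_p, p_c, p_t, p_u, p_f = y
    omega, ups, tau = rates(p)
    lam = p["lam"]
    bind = p["k_b"] * p["r"] * m_p
    unbind = p["k_u"] * c_p
    load = p["k_bt"] * p["t"] * p_c  # binding of cytoplasmic protein to translocons
    return [
        omega + ups * c_p + unbind - (lam + p["d_m"]) * m_p - bind,
        bind - lam * c_p - unbind - ups * c_p,
        ups * c_p - load - lam * p_c,
        load - tau * p_t - lam * p_t,
        tau * p_t - (p["k_f"] + lam) * p_u,
        p["k_f"] * p_u - (p["k_d"] + lam) * p_f,
    ]


def periplasmic_steady_state(p):
    """Solve rhs = 0 by working down the reaction chain."""
    omega, ups, tau = rates(p)
    lam = p["lam"]
    kbr = p["k_b"] * p["r"]
    c_loss = lam + p["k_u"] + ups  # total loss rate of ribosome-bound mRNA
    m_p = omega / (lam + p["d_m"] + kbr * lam / c_loss)
    c_p = kbr * m_p / c_loss
    p_c = ups * c_p / (p["k_bt"] * p["t"] + lam)
    p_t = p["k_bt"] * p["t"] * p_c / (tau + lam)
    p_u = tau * p_t / (p["k_f"] + lam)
    p_f = p["k_f"] * p_u / (p["k_d"] + lam)
    return [m_p, c_p, p_c, p_t, p_u, p_f]


if __name__ == "__main__":
    names = ["m_p", "c_p", "p_c", "p_t", "p_u", "p_f"]
    state = periplasmic_steady_state(PARAMS)
    for name, value in zip(names, state):
        print(name, value)
    print("largest residual:", max(abs(v) for v in periplasmic_rhs(state, PARAMS)))
```

Substituting the closed-form state back into the right-hand side gives residuals at the level of floating-point rounding, which is a quick consistency check on the equations as written.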


We also modelled our third expression system in Bacillus, including transcription, translation, secretion, and extracellular folding.

We also developed a model of our secretory protein expression system in Bacillus subtilis. The model included 6 species (Table 10).


Table 10. Bacillus Secretory Expression Model Variables
Symbol Meaning
\[\color{#3e3f3f}{m_p}\] free mRNA of recombinant protein
\[\color{#3e3f3f}{c_p}\] ribosome-bound mRNA of recombinant protein
\[\color{#3e3f3f}{p_c}\] Unfolded recombinant protein in the cytoplasm
\[\color{#3e3f3f}{p_t}\] Unfolded recombinant protein bound to transporter
\[\color{#3e3f3f}{p_u}\] Unfolded recombinant protein in the medium
\[\color{#3e3f3f}{p_f}\] Folded recombinant protein in the medium


A diagram showing the species we modelled and notation used is shown below (figure 4)


Figure 4. Schematic of the reactions considered in the Bacillus secretory protein expression model. See table 10 for notation for protein species and rates


Structurally, this is the same process as the periplasmic expression system, so the equations have the same structure. However, the parameters differ, reflecting the different environments of the Bacillus cell and the medium and their effects on expression of the recombinant protein.

Summary of Bacillus Secretory Expression Model

\[ \color{#3e3f3f}{\frac{d}{dt}{m}_p=\omega_p(a)+\upsilon_p(c_p,a)+k_uc_p-(\lambda +d_m)m_p-k_brm_p} \] \[ \color{#3e3f3f}{\frac{d}{dt}{c}_p=k_brm_p-\lambda c_p-k_uc_p-\upsilon_p(c_p,a)}\] \[ \color{#3e3f3f}{\frac{d}{dt}{p}_c=\upsilon_p(c_p,a)-(k_{bt}t+\lambda)p_c}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_t=k_{bt}tp_c-\tau_p(p_t,a)-\lambda p_t}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_u=\tau_p(p_t,a)-(k_f+\lambda) p_u}\] \[\color{#3e3f3f}{\frac{d}{dt}{p}_f=k_fp_u-(k_d+\lambda)p_f}\]

We were unfortunately unable to parametrise the Bacillus model, so for our in silico experiments we focused on comparing cytoplasmic and periplasmic E. coli expression.


Once we had modelled our different expression systems for recombinant insulin, we integrated them into the whole cell model developed in [1].

We then interrogated these models in MATLAB for insights into how to optimise the expression of insulin.

Comparing Cytoplasmic and Periplasmic Expression

(A)
(B)
Figure 5. The dynamics of (A) cytoplasmic and (B) periplasmic expression models in the first 25 minutes of recombinant protein expression.


First, we looked at the dynamics of the two models in the first 25 minutes of recombinant protein expression.

Cytoplasmic and periplasmic expression showed very different behaviour. The cytoplasmic model predicted a quick peak in unfolded protein in the cytoplasm, which is then depleted, and a large amount of protein aggregating in inclusion bodies (figure 5).

The periplasmic model predicted that unfolded protein in the cytoplasm would be translocated very quickly, which corresponds well to the fact that translocation is a fast event in E. coli [5]. The higher folding rate for insulin in the periplasm depletes the unfolded protein quickly, so the periplasmic model predicts a much higher yield of folded protein than the cytoplasmic model.

(A)
(B)
Figure 6. The dynamics of (A) cytoplasmic and (B) periplasmic expression models in the first 250 hours of recombinant protein expression.


After the initial dynamics, the model reaches a steady state for both cytoplasmic and periplasmic expression (figure 6).

The cytoplasmic model predicts that unfolded proteins will continue to aggregate in the cytoplasm to a larger degree than they fold, while the periplasmic model predicts that the amount of unfolded protein will become negligible. In addition, the yield of folded protein in the cytoplasm plateaus at \(\color{#3e3f3f}{7.7014\times10^4}\) while the yield of folded periplasmic protein plateaus at \(\color{#3e3f3f}{19.128\times10^4}\). Therefore the model predicts that periplasmic expression will yield about 2.5-fold more recombinant insulin than cytoplasmic expression.
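The fold difference between the two systems follows directly from the two plateau values quoted above, as this short Python sketch shows (the variable names are hypothetical):

```python
# Ratio of the steady-state folded protein yields reported in the text.
cytoplasmic_yield = 7.7014e4
periplasmic_yield = 19.128e4

fold_difference = periplasmic_yield / cytoplasmic_yield

if __name__ == "__main__":
    print(f"periplasmic / cytoplasmic = {fold_difference:.2f}")
```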

Parameter Scanning to Optimise Expression

We then wanted to scan parameter values to see how we could optimise folded protein yield.

\(\color{#3e3f3f}{w_p}\), the maximal rate of transcription, is proportional to induction level. Varying it is equivalent to varying the concentration of IPTG used to induce expression. We therefore varied this parameter to see how it affected predicted protein expression in the model. We explored \(\color{#3e3f3f}{w_p\in [1,10^4]}\) as these are around the bounds of realistic values [1].

(A)
(B)
Figure 7. The effect of \(\color{#3e3f3f}{w_p}\) on folded protein yield in (A) cytoplasmic and (B) periplasmic expression models within realistic values.

We found that the yield of folded protein followed a logarithmic increase in relation to \(\color{#3e3f3f}{w_p}\) (figure 7). The model predicts that at a low degree of induction (\(\color{#3e3f3f}{w_p<200}\)) the yield of folded protein is comparable between the two systems, but at higher values the cytoplasmic yield is much lower. This agrees with the well-known observation that inclusion body formation increases with induction level, decreasing the yield of recombinant protein [7].


We next asked whether there was any parameter in the cytoplasmic expression model, corresponding to some experimental step we could take, that we could change so that expression levels in the cytoplasm matched those in the periplasm.

\(\color{#3e3f3f}{k_f}\), the rate of protein folding in the cytoplasm, greatly affects protein yield, as our model supposes that insoluble aggregates of recombinant protein are caused by the association of protein that has not yet folded properly (matching experimental knowledge [8]). Since aggregated protein cannot fold in our model, aggregation sequesters protein and decreases the folded protein yield. We wanted to know if we could increase \(\color{#3e3f3f}{k_f}\) in the cytoplasmic model to such a degree that cytoplasmic yield matched periplasmic yield.

Figure 8. The effect of \(\color{#3e3f3f}{k_f}\) on folded protein yield in the cytoplasmic expression model within realistic values.

We found that the periplasmic protein yield could not be matched within realistic values of \(\color{#3e3f3f}{k_f}\); however, the protein yield did increase with \(\color{#3e3f3f}{k_f}\) (figure 8). Thus, in order to improve protein yield in the cytoplasm, we used a SHuffle strain of E. coli, which promotes disulfide bond formation in the cytoplasm, as our modelling predicted that faster folding would improve yield.


References

  1. Weisse, A.Y., Oyarzún, D.A., Danos, V., Swain, P.S. (2015). Mechanistic links between cellular trade-offs, gene expression, and growth. Proc Natl Acad Sci U S A. 112(9):E1038-47
  2. Hoffmann, F., Posten, C., Rinas, U. (2001). Kinetic model of in vivo folding and inclusion body formation in recombinant Escherichia coli. Biotechnol Bioeng. 72(3):315-22
  3. Taniguchi, Y., Choi, P.J., Li, G.W., Chen, H., Babu, M., Hearn, J., Emili, A., Xie, X.S. (2010). Quantifying E. coli Proteome and Transcriptome with Single-Molecule Sensitivity in Single Cells. Science. 329(5991):533-538
  4. Natale, P., Brüser, T., Driessen, A.J.M. (2008). Sec- and Tat-mediated protein secretion across the bacterial cytoplasmic membrane - Distinct translocases and mechanisms. Biochimica et Biophysica Acta - Biomembranes. 1778(9):1735-1756
  5. de Keyzer, J., van der Does, C., Driessen, A.J.M. (2002). Kinetic Analysis of the Translocation of Fluorescent Precursor Proteins into Escherichia coli Membrane Vesicles. The Journal of Biological Chemistry. 277:46059-46065
  6. BioNumbers. Key Numbers for Cell Biologists. [online] Available at: http://bionumbers.hms.harvard.edu/Includes/KeyNumbersLinks.pdf
  7. Thomas, J.G., Baneyx, F. (1996). Protein Misfolding and Inclusion Body Formation in Recombinant Escherichia coli Cells Overexpressing Heat-shock Proteins. The Journal of Biological Chemistry. 271:11141-11147
  8. Upadhyay, A.K., Murmu, A., Singh, A., Panda, A.K. (2012). Kinetics of Inclusion Body Formation and its Correlation with the Characteristics of Protein Aggregates in Escherichia coli. PLoS One. 7(3):e33951

For our physiological modelling, we used a model of subcutaneous insulin absorption developed in [1] and used it to relate the free energy of insulin hexamer formation and insulin dynamics. We then used thermodynamic modelling to make an estimate of the relative time of peak of action, and the duration of action of our novel insulin analogue (winsulin).

The authors of [1] developed a system of partial differential equations to describe the insulin infusion process. They modelled the change in three species:


Table 1. Variables in model of insulin infusion
Symbol Meaning
\[\color{#3e3f3f}{c_d}\] Insulin in dimeric form
\[\color{#3e3f3f}{c_h}\] Insulin in hexamer form
\[\color{#3e3f3f}{c_b}\] Insulin in bound form

They modelled the conversion between hexameric and dimeric insulin as follows

InsulinHexamer \(\color{#3e3f3f}{\rightleftharpoons}\) InsulinDimer

Here the forward rate is \(\color{#3e3f3f}{P}\) and the reverse rate is \(\color{#3e3f3f}{PQ}\). We can interpret \(\color{#3e3f3f}{P}\) as the production rate and \(\color{#3e3f3f}{Q}\) as the equilibrium constant.

The final model was as follows

\[\color{#3e3f3f}{\eqalignno{{\partial c_{d}(t,r)\over\partial t}=&\,P\left(c_{h}(t,r)-Qc_{d}(t,r)^{3}\right)-B_{d}c_{d}(t,r)\cr&+D\nabla^{2}c_{d}(t,r),\cr{\partial c_{h}(t,r)\over\partial t}=&\,-P\left(c_{h}(t,r)-Qc_{d}(t,r)^{3}\right)\cr&+D\nabla^{2}c_{h}(t,r)}}\]

Where \(\color{#3e3f3f}{P, Q, B_d, D}\) are parameters, and exogenous insulin flow is obtained by integrating the expression denoting the amount of insulin dimer entering the bloodstream: \[\color{#3e3f3f}{I_{ex}(t)=B_{d}\int\limits_{V_{sc}}c_{d}(t,r)dV.}\]

The parameters found for the different insulin analogues and their resultant insulin dynamics predicted by the model are shown in Table 2


Table 2. Parameter Values and resultant Dynamics for different insulin analogues. Values of parameters from [1] Table IV. Insulin dynamics taken from [1] Fig. 6
Insulin Analogue \(\color{#3e3f3f}{Q}\) \(\color{#3e3f3f}{D}\) \(\color{#3e3f3f}{B_d}\) Time of peak Insulin action (hours) Duration of Insulin action (hours)
Lispro, Humalog, NovoRapid \[\color{#3e3f3f}{4.75\cdot 10^{-4}}\] \[\color{#3e3f3f}{3.36\cdot 10^{-4}}\] \[\color{#3e3f3f}{2.36\cdot 10^{-2}}\] \[\color{#3e3f3f}{0.25}\] \[\color{#3e3f3f}{4}\]
Actrapid \[\color{#3e3f3f}{1.9\cdot 10^{-3}}\] \[\color{#3e3f3f}{8.4\cdot 10^{-5}}\] \[\color{#3e3f3f}{1.18\cdot 10^{-2}}\] \[\color{#3e3f3f}{0.75}\] \[\color{#3e3f3f}{8}\]
Semilente \[\color{#3e3f3f}{7.6\cdot 10^{-2}}\] \[\color{#3e3f3f}{8.4\cdot 10^{-5}}\] \[\color{#3e3f3f}{1.18\cdot 10^{-2}}\] \[\color{#3e3f3f}{1.3}\] \[\color{#3e3f3f}{11}\]
NPH \[\color{#3e3f3f}{3.04}\] \[\color{#3e3f3f}{8.4\cdot 10^{-5}}\] \[\color{#3e3f3f}{1.18\cdot 10^{-2}}\] \[\color{#3e3f3f}{4.5}\] \[\color{#3e3f3f}{16}\]

Since the parameter \(\color{#3e3f3f}{Q}\) seemed to have the most impact on insulin dynamics, we tested if there was a relationship between the two (figure 1).


Figure 1. A semilog plot of the Time of peak insulin action (hours) and Duration of Insulin action (hours) after injection of 8IU insulin as a function of the parameter Q. Nonlinear regression analysis was performed using GraphPad PRISM 7. \(R^2\) values are shown on graph.

Now, since \(\color{#3e3f3f}{Q}\) in the model formed in [1] is the equilibrium constant of the reaction InsulinHexamer \(\color{#3e3f3f}{\rightleftharpoons}\) InsulinDimer, it is related to the Gibbs free energy of the reaction by the expression \(\color{#3e3f3f}{\Delta G^{o}=-RT\ln{Q}}\), where \(\color{#3e3f3f}{R=8.314472 J K^{-1} mol^{-1}}\) is the gas constant and \(T\) is the temperature in kelvin.

Therefore if we know the Gibbs free energy of insulin hexamer formation, we can use this to find some qualitative information on the dynamics of insulin absorption using the model from [1], and thus estimate the time of peak insulin action and the duration of insulin action from thermodynamic information.
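As a worked illustration of this relationship, the Python sketch below evaluates \(\Delta G^{o}=-RT\ln{Q}\) for the \(Q\) values in Table 2. Two assumptions are made that are not in the text above: the temperature is taken as body temperature (310.15 K), and \(Q\) is treated as a dimensionless number. The resulting values are only meant to show the direction of the trend, not to serve as measured free energies.

```python
import math

R = 8.314472      # gas constant, J K^-1 mol^-1 (value quoted in the text)
T_BODY = 310.15   # assumed temperature in kelvin (37 C); not from the text

# Q values from Table 2 (parameters taken from [1], Table IV).
Q_VALUES = {
    "Lispro, Humalog, NovoRapid": 4.75e-4,
    "Actrapid": 1.9e-3,
    "Semilente": 7.6e-2,
    "NPH": 3.04,
}


def delta_g_standard(q, temperature=T_BODY):
    """Delta G^o = -R * T * ln(Q) in J/mol, treating Q as dimensionless."""
    return -R * temperature * math.log(q)


if __name__ == "__main__":
    for name, q in Q_VALUES.items():
        print(f"{name}: {delta_g_standard(q) / 1000.0:.1f} kJ/mol")
```

Because the logarithm is monotonic, a smaller \(Q\) always maps to a larger \(\Delta G^{o}\), so the ordering of the analogues by \(Q\) in Table 2 is preserved when they are ranked by free energy.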

The Mutabind tool [2] computationally predicts the \(\color{#3e3f3f}{\Delta\Delta G}\) (change in binding affinity of protein-protein interactions) of point mutations relative to a known structure. We used the server to predict the effects of our variations to proinsulin's sequence on protein-protein interactions within the insulin hexamer, and thus their effects on the \(\color{#3e3f3f}{\Delta G}\) of hexamer formation. We used PDB file 3AIY and entered the sequence variants with which we had designed our winsulin. Since the B chains are buried at the center of the insulin hexamer (see 3AIY), we analysed the effects of the mutations in this chain on hexamer stability (figure 2).


Figure 2. Alignment of winsulin and human insulin. Residues highlighted yellow were sequence variants inputted into the Mutabind program.

Mutabind predicted that all of our sequence changes would decrease the stability of the insulin hexamer for our analogue. Results are shown in table 3


Table 3. Mutabind results.
Mutation \(\color{#3e3f3f}{\Delta\Delta G_{bind} (kcal mol^{-1})}\)
\[\color{#3e3f3f}{H10D}\] \[\color{#3e3f3f}{0.51}\]
\[\color{#3e3f3f}{T27S}\] \[\color{#3e3f3f}{0.57}\]
\[\color{#3e3f3f}{K295}\] \[\color{#3e3f3f}{0.62}\]

Where \(\color{#3e3f3f}{\Delta\Delta G_{bind} (kcal mol^{-1})}\) is the predicted change in binding affinity induced by a mutation. A positive result corresponds to destabilising mutations, so the hexamer formation of winsulin will be less stable than that of human insulin.

This corresponds to a decrease in \(\color{#3e3f3f}{Q}\), meaning we predict our winsulin analogue will be relatively fast acting, compared to regular human insulin. Human insulin activity peaks in 2-4 hours and lasts for 6-8 hours [3] so this would make winsulin a rapid-acting analogue.

Although this is a crude estimate, it does give us some qualitative information on the action profile of our novel winsulin

References

  1. Tarin, C., Teufel, E., Pico, J., Bondia, J., Pfleiderer, H.J. (2005). Comprehensive pharmacokinetic model of insulin Glargine and other insulin formulations. IEEE Transactions on Biomedical Engineering, vol. 52, no. 12, pp. 1994-2005
  2. Li, M., Simonetti, F. L., Goncearenco, A., & Panchenko, A. R. (2016). MutaBind estimates and interprets the effects of sequence variants on protein–protein interactions. Nucleic Acids Research, 44(Web Server issue), W494–W501. http://doi.org/10.1093/nar/gkw374
  3. Diabetes Education Online. 2017. Types of Insulin. [ONLINE] Available at: https://dtc.ucsf.edu/types-of-diabetes/type2/treatment-of-type-2-diabetes/medications-and-therapies/type-2-insulin-rx/types-of-insulin/.