Difference between revisions of "Team:Harvard/Model"


Revision as of 13:01, 13 August 2017

Model

1 Overview

As part of our overarching project goal to optimize the production of curli fibers, we took a closer look at the cellular pathway involved in this process. As protein expression level can be tuned relatively easily with ribosome binding sites, we chose to focus on optimizing the stoichiometric ratios of protein expression for each of the csg proteins involved in the production pathway. To this end, we developed a mechanistic model of the general curli pathway that can be applied to any system involving the curli proteins.

To constrain our model, we defined the following objectives:

  1. Maximize the concentration of extracellular fibrils
  2. Minimize the concentration of intracellular fibrils
  3. Maximize cellular resource usage efficiency
The first objective refers to our desired product: the secreted and polymerized csgA.

The pathway has been broken down into the following four modules:
Transcription
Translation
Periplasmic Export
Extracellular Secretion

Spatial parameters, heat and diffusion gradients, etc. will not be considered in this model to avoid partial differential equations.

2 Gene Expression

2.1 Transcription

The naturally occurring curli genes are organized into the csgBAC and csgDEFG operons. csgD is a regulatory protein and is therefore not needed when the genes are expressed from foreign plasmids. Because of the additional protein domains fused to the csgA monomer, csgA is placed on a plasmid of its own (\(g_{csgA}\)). The remaining genes are placed on a second plasmid (\(g_{csgBCEFG}\)). The rate of transcription is primarily governed by the plasmid copy number and promoter strength. Although RNA polymerase and a number of other transcription factors are also involved in the transcription of mRNA, transcription factor binding reaches equilibrium much faster than transcription, translation, and protein accumulation, so it can be considered to be at steady state on the time scale of proteins. Thus only the concentration of the plasmids is assumed to have a large impact on the rate of transcription.
$$g_{csgA} + RNA_{pol} \overset{\alpha_{1}}{\rightarrow} g_{csgA} + RNA_{pol} + mRNA_{csgA}$$
$$g_{csgBCEFG} + RNA_{pol} \overset{\alpha_{2}}{\rightarrow} g_{csgBCEFG} + RNA_{pol} + mRNA_{csgBCEFG}$$
The rate of transcript degradation depends on transcript stability, which is in turn affected by a number of factors. Although we have introduced these genes on foreign plasmids, the mRNA half-life here is assumed to be that of the transcripts of the corresponding genes naturally present in E. coli.
$$mRNA_{csgA} \overset{\zeta_{1}}{\rightarrow} \varnothing$$
$$mRNA_{csgBCEFG} \overset{\zeta_{2}}{\rightarrow} \varnothing$$
The degradation rate constant, \(\zeta_{n}\), follows from the mRNA half-life, \(h_n\). Because degradation is modeled as a first-order process, the transcript concentration decays as \(e^{-\zeta_n t}\), and requiring that it fall to half its initial value at \(t = h_n\) gives:
$$\zeta_{n} = \frac{\ln 2}{h_n}$$
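The transcript balance can be sketched numerically as follows. This is a minimal illustration, assuming first-order degradation (so the rate constant equals ln 2 divided by the half-life) and the transcript equation used later in the differential equations section. The half-life and the normalized plasmid and RNA polymerase levels are placeholder assumptions, not values from this model; only \(\alpha_1\) is taken from the rate-constant table.

```python
import math


def degradation_rate(half_life_s):
    """Return the first-order decay constant (1/s) for an mRNA half-life in seconds."""
    return math.log(2) / half_life_s


def mrna_steady_state(alpha, gene, rnap, zeta):
    """Steady state of d[mRNA]/dt = alpha*[g]*[RNApol] - zeta*[mRNA]."""
    return alpha * gene * rnap / zeta


def simulate_mrna(alpha, gene, rnap, zeta, t_end, dt=0.1):
    """Integrate the transcript balance with forward Euler, starting from [mRNA] = 0."""
    mrna = 0.0
    for _ in range(int(round(t_end / dt))):
        mrna += dt * (alpha * gene * rnap - zeta * mrna)
    return mrna


if __name__ == "__main__":
    alpha_1 = 0.0921       # transcription rate of csgA from the rate-constant table (1/s)
    half_life = 300.0      # ASSUMPTION: placeholder mRNA half-life in seconds
    gene, rnap = 1.0, 1.0  # ASSUMPTION: normalized plasmid and RNA polymerase levels
    zeta_1 = degradation_rate(half_life)
    print("zeta_1 =", zeta_1)
    print("steady state =", mrna_steady_state(alpha_1, gene, rnap, zeta_1))
    print("after 10 half-lives =", simulate_mrna(alpha_1, gene, rnap, zeta_1, 10 * half_life))
```

Under these assumptions the simulated transcript level approaches the analytical steady state \(\alpha_1 [g_{csgA}][RNA_{pol}] / \zeta_1\) from below within a few half-lives.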

2.2 Translation

Each of the coding sequences in the two transcripts described above is preceded by an RBS that determines the rate of translation of its protein. The relative RBS strengths determine the stoichiometry between the proteins involved in the curli pathway. Translation also involves a number of players, including the ribosome and tRNAs, but as the kinetic parameters regarding the rate of translation available in literature are based
$$mRNA_{csgA} \overset{\beta_{1}}{\rightarrow} mRNA_{csgA} + csgA_{cyt}$$
$$mRNA_{csgBCEFG} \overset{\beta_{2}}{\rightarrow} mRNA_{csgBCEFG} + csgB_{cyt}$$
$$mRNA_{csgBCEFG} \overset{\beta_{3}}{\rightarrow} mRNA_{csgBCEFG} + csgC_{cyt}$$
$$mRNA_{csgBCEFG} \overset{\beta_{4}}{\rightarrow} mRNA_{csgBCEFG} + csgE_{cyt}$$
$$mRNA_{csgBCEFG} \overset{\beta_{5}}{\rightarrow} mRNA_{csgBCEFG} + csgF_{cyt}$$
$$mRNA_{csgBCEFG} \overset{\beta_{6}}{\rightarrow} mRNA_{csgBCEFG} + csgG_{cyt}$$
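As a worked illustration of how RBS strength sets stoichiometry, consider only the synthesis terms implied by the reactions above, ignoring downstream consumption (an assumption made for this sketch alone). Because csgB, csgC, csgE, csgF, and csgG are translated from the same transcript, the transcript concentration cancels in any ratio of their synthesis rates, for example:

$$\frac{\beta_{4}[mRNA_{csgBCEFG}]}{\beta_{6}[mRNA_{csgBCEFG}]} = \frac{\beta_{4}}{\beta_{6}}$$

For csgA, which sits on its own transcript, the corresponding ratio also depends on the relative transcript levels:

$$\frac{\beta_{1}[mRNA_{csgA}]}{\beta_{6}[mRNA_{csgBCEFG}]}$$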

3 Translocation

Although the ultimate destination of the csgA monomer is the extracellular space, not all of the other proteins involved in the pathway have the same fate. csgB and csgF do operate in the extracellular space, but csgC and csgE are chaperone proteins that remain in the periplasm, whereas csgG forms a channel in the outer membrane.

3.1 Periplasmic export

Since none of the curli proteins remain in the cytoplasm, all must translocate into the cell's periplasm. This occurs through the Sec secretion pathway. The main actors in this pathway are SecYEG, the protein-conducting channel (PCC); SecA, an ATPase that drives translocation; and SecB, a chaperone that keeps proteins in an unfolded state (Driessen et al., 2007). As a protein emerges from the ribosome, SecB, a homotetramer, binds and stabilizes it in its unfolded conformation. SecB binds to SecA, a homodimer which also recruits SecYEG to assemble a dimeric PCC. Here, we simplify the interactions into three steps: the binding of the SecB homotetramer to the csg protein to form a \(SecB:csgX\) complex, the subsequent binding of the \(SecB:csgX\) complex to SecA and SecYEG, and finally the translocation of the protein from the cytoplasm to the periplasm.
$$csgX_{cyt} + 4 SecB \underset{\gamma_{-1}}{\overset{\gamma_{1}}{\rightleftharpoons}} SecB:csgX$$
$$SecB:csgX + 2 SecA + 2 SecYEG \overset{\gamma_{2}}{\rightarrow} Sec:csgX$$
$$Sec:csgX \overset{\gamma_{3}}{\rightarrow} 4 SecB + 2 SecA + 2 SecYEG + csgX_{per}$$
The rates of binding and secretion are assumed to be conserved across the different proteins because of the similarity in mechanism and the homology of the Sec signal sequences. Thus, instead of listing the same interactions for every protein, the generic \(csgX_{cyt}\) and \(csgX_{per}\) are used to denote the csg proteins in the cytoplasm and periplasm, respectively.
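The three-step export scheme can be sketched for a single generic csgX as below. This is only an illustration: it uses the \(\gamma\) values from the rate-constant table, but it assumes that there is no ongoing synthesis and that the free SecB, SecA, and SecYEG concentrations stay constant (the full model tracks them as variables). The function name and the initial concentrations are hypothetical.

```python
def simulate_sec_export(cyt0, sec_b, sec_a, sec_yeg, t_end, dt=0.01):
    """Forward-Euler integration of the simplified Sec export steps for one csgX.

    ASSUMPTION: free SecB, SecA and SecYEG are held constant, and no new csgX
    is synthesized. Returns (csgX_cyt, SecB:csgX, Sec:csgX, csgX_per).
    """
    g1, gm1, g2, g3 = 0.25, 0.025, 0.0085, 1.00  # gamma values from the table
    cyt, bound, channel, per = cyt0, 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        bind = g1 * cyt * sec_b ** 4                      # csgX_cyt + 4 SecB -> SecB:csgX
        unbind = gm1 * bound                              # SecB:csgX -> csgX_cyt + 4 SecB
        load = g2 * bound * sec_a ** 2 * sec_yeg ** 2     # SecB:csgX -> Sec:csgX
        export = g3 * channel                             # Sec:csgX -> csgX_per
        cyt += dt * (unbind - bind)
        bound += dt * (bind - unbind - load)
        channel += dt * (load - export)
        per += dt * export
    return cyt, bound, channel, per


if __name__ == "__main__":
    # ASSUMPTION: all concentrations start at 1 (in the table's uM units).
    print(simulate_sec_export(1.0, 1.0, 1.0, 1.0, t_end=1000.0))
```

Because every term that leaves one pool enters the next, the total amount of csgX is conserved, which gives a quick sanity check on the rate expressions.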

3.2 Extracellular secretion

Analysis of the crystal structure of csgG has revealed that it assembles into a double-nonameric form in D9 symmetry (Taylor and Matthews 2015). CsgE has also been shown to form a nonamer at the base of the csgG structure in the periplasm, providing selectivity for the substrates that are secreted (Goyal et al., 2014). CsgG and csgE participate in the translocation of csgF into the extracellular matrix; csgF then folds and binds csgG to the membrane. Meanwhile, csgC interacts with csgA and csgB monomers to prevent the formation of oligomers (Taylor and Matthews 2015). When the monomers interact with csgE, they become trapped in the periplasmic cavity and are transported across the outer membrane. CsgB then interacts with csgF to initiate the nucleation of csgA fibers.
$$9 csgE_{per} \underset{\delta_{-1}}{\overset{\delta_{1}}{\rightleftharpoons}} csgE_{9}$$
$$csgF_{per} \overset{\delta_{2}}{\rightarrow} csgF_{ECM}$$
$$9 csgG_{per} \underset{\delta_{-3}}{\overset{\delta_{3}}{\rightleftharpoons}} csgG_{9}$$
$$csgG_{9} + 2 csgF_{ECM} + csgE_{9} \underset{\delta_{-4}}{\overset{\delta_{4}}{\rightleftharpoons}} csgGEF$$
$$csgA_{per} + csgC_{per} \overset{\delta_{5}}{\rightarrow} csgC:csgA$$
$$csgB_{per} + csgC_{per} \overset{\delta_{5}}{\rightarrow} csgC:csgB$$
$$csgC:csgA + csgGEF \overset{\delta_{6}}{\rightarrow} csgGEF + csgC_{per} + csgA_{ECM}$$
$$csgC:csgB + csgGEF \overset{\delta_{7}}{\rightarrow} csgGEF + csgC_{per} + csgB_{ECM}$$
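To illustrate how steeply the ninth-order assembly step depends on monomer concentration, consider the csgE nonamer reaction in isolation at equilibrium (an assumption for this sketch; in the full model the nonamer is also consumed by csgGEF formation). With the tabulated \(\delta_{1} = 0.76\ \mu M^{-8} sec^{-1}\) and \(\delta_{-1} = 28 \times 10^{-3}\ sec^{-1}\):

$$[csgE_{9}] = \frac{\delta_{1}}{\delta_{-1}}[csgE_{per}]^{9} \approx 27.1\ \mu M^{-8} \times [csgE_{per}]^{9}$$

At \([csgE_{per}] = 1\ \mu M\) this gives about \(27.1\ \mu M\) of nonamer, whereas at \(0.5\ \mu M\) it gives about \(27.1 / 2^{9} \approx 0.053\ \mu M\). Halving the monomer concentration therefore lowers the equilibrium nonamer concentration 512-fold.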

4 Diffusion

The rate of translocation of proteins from the periplasm to the extracellular matrix is assumed to follow Fick's first law of diffusion:
$$ J = -D \nabla \phi $$
Here, \(J\) is the diffusion flux (the amount of substance crossing a unit area per unit time), \(D\) is the diffusion coefficient, which we take from the literature, and \(\nabla \phi \) is the concentration gradient. To avoid partial derivatives, we approximate the gradient across the outer membrane by the finite difference \(([csgX_{ECM}] - [csgX_{per}])/\omega\), which simplifies the equation above to:
$$ J = D \frac{[csgX_{per}] - [csgX_{ECM}]}{\omega}$$
where \([csgX_{per}] - [csgX_{ECM}]\) is the difference in concentration of a protein between the periplasm and the ECM, and \( \omega \) is the width of the outer membrane. With this sign convention, \(J\) is positive for net transport from the periplasm to the ECM, matching the diffusion terms in the differential equations. The equation now gives the moles per unit area per unit time. To obtain the number of molecules per unit time, we multiply by the surface area of the cell, \( S \), and by Avogadro's number, \( N_A \):
$$ J = D \frac{[csgX_{per}] - [csgX_{ECM}]}{\omega} S N_A$$
The equations above are used to represent the rate of translocation of csgF, csgA, and csgB from the periplasm to the extracellular matrix, as their mode of transport is known to be selective diffusion.
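As a rough numerical check of the simplified flux expression, the sketch below evaluates it with the values of \(D\), \(\omega\), and the surface area listed in the rate-constant table. It counts transport from the periplasm to the ECM as positive, as the differential equations do. The function names are hypothetical, and the conversion of concentrations from \(\mu M\) to \(mol\,m^{-3}\) is an assumption about the intended units.

```python
AVOGADRO = 6.02214076e23  # molecules per mole
D = 10e-9                 # diffusion coefficient from the table (m^2/s)
OMEGA = 10e-8             # outer membrane thickness from the table (m)
SURFACE = 4.42e-12        # outer membrane surface area from the table (m^2)


def micromolar_to_mol_per_m3(conc_um):
    """Convert uM to mol/m^3 (1 uM = 1e-6 mol/L = 1e-3 mol/m^3)."""
    return conc_um * 1e-3


def molecules_per_second(c_per_um, c_ecm_um, d=D, omega=OMEGA, surface=SURFACE):
    """Net periplasm-to-ECM transport, D * ([per] - [ECM]) / omega * S * N_A.

    Positive values mean net export to the ECM.
    """
    delta = micromolar_to_mol_per_m3(c_per_um - c_ecm_um)
    return d * delta / omega * surface * AVOGADRO


if __name__ == "__main__":
    # ASSUMPTION: a 1 uM concentration difference, chosen only for illustration.
    print(molecules_per_second(2.0, 1.0))
```

With these table values, a \(1\ \mu M\) concentration difference corresponds to roughly \(2.7 \times 10^{8}\) molecules per second.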

5 Aggregation and Polymerization

One of the most useful properties of curli fibers is their ability to self-assemble and aggregate. However, this feature also poses a problem when curli is used as a biomanufacturing platform: if csgA production outpaces the capacity of the E. coli secretion machinery, csgA can accumulate and aggregate within the cytoplasm, causing the cell to lyse.

Fibril formation can be broken down into two main reactions: nucleation, in which two csgA monomers interact to form a seed for further fiber growth, and elongation, in which csgA monomers are added to an existing fiber. These two steps proceed until the species involved reach an equilibrium.
$$2 csgA_m \underset{\epsilon_{-1}}{\overset{\epsilon_{1}}{\rightleftharpoons}} F$$
$$F + csgA_m \underset{\epsilon_{-2}}{\overset{\epsilon_{2}}{\rightleftharpoons}} F$$
\(csgA_m\) and \(F\) represent a free-floating csgA monomer and a fiber of arbitrary length, respectively. Nucleation is considered the rate-limiting step in polymerization (Chapman 2008), whereas subsequent elongation occurs at a relatively constant rate regardless of fibril length. Thus, all fibers are grouped as a single species. As polymerization can occur wherever the concentration of csgA is high enough, the reactions involved must be taken into consideration in all three compartments of the cell.
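The nucleation and elongation terms can be explored with a small simulation of a single closed compartment. This is a sketch only: the rate expressions follow the monomer and fiber equations in the differential equations section, but the \(\epsilon\) values used here are hypothetical placeholders, since none are listed in the rate-constant table.

```python
def simulate_fibrils(monomer0, fibril0, e1, em1, e2, em2, t_end, dt=0.5):
    """Forward-Euler integration of the monomer/fiber equations in one compartment.

    d[A]/dt = -e1*A^2 + em1*F - e2*F*A + em2*F
    d[F]/dt =  e1*A^2 - em1*F
    Returns the final (monomer, fiber) concentrations.
    """
    a, f = monomer0, fibril0
    for _ in range(int(round(t_end / dt))):
        nucleation = e1 * a * a - em1 * f   # net rate of seed formation
        elongation = e2 * f * a - em2 * f   # net monomer uptake by existing fibers
        a, f = a + dt * (-nucleation - elongation), f + dt * nucleation
    return a, f


if __name__ == "__main__":
    # ASSUMPTION: placeholder rate constants chosen only to illustrate the dynamics.
    print(simulate_fibrils(1.0, 0.0, e1=0.01, em1=0.001, e2=0.1, em2=0.01, t_end=20000.0))
```

For these equations, setting both derivatives to zero with fibers present gives a steady-state monomer concentration of \(\epsilon_{-2}/\epsilon_{2}\) and a fiber concentration of \(\epsilon_{1}[csgA_m]^2/\epsilon_{-1}\), which the simulation approaches at long times.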

6 Differential Equations

Transcripts

$$\frac{d[mRNA_{csgA}]}{dt} = \alpha_{1}[g_{csgA}][RNA_{pol}] - \zeta_{1}[mRNA_{csgA}]$$
$$\frac{d[mRNA_{csgBCEFG}]}{dt} = \alpha_{2}[g_{csgBCEFG}][RNA_{pol}] - \zeta_{2}[mRNA_{csgBCEFG}]$$

Cytoplasmic proteins

$$\frac{d[csgA_{cyt}]}{dt} = \beta_{1}[mRNA_{csgA}] - \gamma_{1}[csgA_{cyt}][SecB]^4 + \gamma_{-1}[SecB:csgA] - \epsilon_{1}[csgA_{cyt}]^2 + \epsilon_{-1}[F_{cyt}] - \epsilon_{2}[F_{cyt}][csgA_{cyt}] + \epsilon_{-2}[F_{cyt}]$$
$$\frac{d[SecB:csgA]}{dt} = \gamma_{1}[csgA_{cyt}][SecB]^4 - \gamma_{-1}[SecB:csgA] - \gamma_{2}[SecB:csgA][SecA]^2[SecYEG]^2$$
$$\frac{d[Sec:csgA]}{dt} = \gamma_{2}[SecB:csgA][SecA]^2[SecYEG]^2 - \gamma_{3}[Sec:csgA]$$
$$\frac{d[F_{cyt}]}{dt} = \epsilon_{1}[csgA_{cyt}]^2 - \epsilon_{-1}[F_{cyt}]$$
$$\frac{d[csgB_{cyt}]}{dt} = \beta_{2}[mRNA_{csgBCEFG}] - \gamma_{1}[csgB_{cyt}][SecB]^4 + \gamma_{-1}[SecB:csgB]$$
$$\frac{d[SecB:csgB]}{dt} = \gamma_{1}[csgB_{cyt}][SecB]^4 - \gamma_{-1}[SecB:csgB] - \gamma_{2}[SecB:csgB][SecA]^2[SecYEG]^2$$
$$\frac{d[Sec:csgB]}{dt} = \gamma_{2}[SecB:csgB][SecA]^2[SecYEG]^2 - \gamma_{3}[Sec:csgB]$$
$$\frac{d[csgC_{cyt}]}{dt} = \beta_{3}[mRNA_{csgBCEFG}] - \gamma_{1}[csgC_{cyt}][SecB]^4 + \gamma_{-1}[SecB:csgC]$$
$$\frac{d[SecB:csgC]}{dt} = \gamma_{1}[csgC_{cyt}][SecB]^4 - \gamma_{-1}[SecB:csgC] - \gamma_{2}[SecB:csgC][SecA]^2[SecYEG]^2$$
$$\frac{d[Sec:csgC]}{dt} = \gamma_{2}[SecB:csgC][SecA]^2[SecYEG]^2 - \gamma_{3}[Sec:csgC]$$
$$\frac{d[csgE_{cyt}]}{dt} = \beta_{4}[mRNA_{csgBCEFG}] - \gamma_{1}[csgE_{cyt}][SecB]^4 + \gamma_{-1}[SecB:csgE]$$
$$\frac{d[SecB:csgE]}{dt} = \gamma_{1}[csgE_{cyt}][SecB]^4 - \gamma_{-1}[SecB:csgE] - \gamma_{2}[SecB:csgE][SecA]^2[SecYEG]^2$$
$$\frac{d[Sec:csgE]}{dt} = \gamma_{2}[SecB:csgE][SecA]^2[SecYEG]^2 - \gamma_{3}[Sec:csgE]$$
$$\frac{d[csgF_{cyt}]}{dt} = \beta_{5}[mRNA_{csgBCEFG}] - \gamma_{1}[csgF_{cyt}][SecB]^4 + \gamma_{-1}[SecB:csgF]$$
$$\frac{d[SecB:csgF]}{dt} = \gamma_{1}[csgF_{cyt}][SecB]^4 - \gamma_{-1}[SecB:csgF] - \gamma_{2}[SecB:csgF][SecA]^2[SecYEG]^2$$
$$\frac{d[Sec:csgF]}{dt} = \gamma_{2}[SecB:csgF][SecA]^2[SecYEG]^2 - \gamma_{3}[Sec:csgF]$$
$$\frac{d[csgG_{cyt}]}{dt} = \beta_{6}[mRNA_{csgBCEFG}] - \gamma_{1}[csgG_{cyt}][SecB]^4 + \gamma_{-1}[SecB:csgG]$$
$$\frac{d[SecB:csgG]}{dt} = \gamma_{1}[csgG_{cyt}][SecB]^4 - \gamma_{-1}[SecB:csgG] - \gamma_{2}[SecB:csgG][SecA]^2[SecYEG]^2$$
$$\frac{d[Sec:csgG]}{dt} = \gamma_{2}[SecB:csgG][SecA]^2[SecYEG]^2 - \gamma_{3}[Sec:csgG]$$
$$\begin{align} \frac{d[SecB]}{dt} & = \gamma_{3} \{[Sec:csgA] + [Sec:csgB] + [Sec:csgC] + [Sec:csgE] + [Sec:csgF] + [Sec:csgG]\} \\ & - \gamma_{1}[SecB]^4\{[csgA_{cyt}] + [csgB_{cyt}] + [csgC_{cyt}] + [csgE_{cyt}] + [csgF_{cyt}] + [csgG_{cyt}]\} \\ & + \gamma_{-1} \{[SecB:csgA] + [SecB:csgB] + [SecB:csgC] + [SecB:csgE] + [SecB:csgF] + [SecB:csgG]\} \end{align}$$
$$\begin{align} \frac{d[SecA]}{dt} & = \gamma_{3} \{[Sec:csgA] + [Sec:csgB] + [Sec:csgC] + [Sec:csgE] + [Sec:csgF] + [Sec:csgG]\} \\ & - \gamma_{2}[SecA]^2[SecYEG]^2\{[SecB:csgA] + [SecB:csgB] + [SecB:csgC] + [SecB:csgE] + [SecB:csgF] + [SecB:csgG]\} \end{align}$$
$$\begin{align} \frac{d[SecYEG]}{dt} & = \gamma_{3} \{[Sec:csgA] + [Sec:csgB] + [Sec:csgC] + [Sec:csgE] + [Sec:csgF] + [Sec:csgG]\} \\ & - \gamma_{2}[SecA]^2[SecYEG]^2\{[SecB:csgA] + [SecB:csgB] + [SecB:csgC] + [SecB:csgE] + [SecB:csgF] + [SecB:csgG]\} \end{align}$$

Periplasmic Proteins

$$\frac{d[csgE_{9}]}{dt} = \delta_{1}[csgE_{per}]^9 - \delta_{-1}[csgE_{9}]$$
$$\frac{d[csgG_{9}]}{dt} = \delta_{3}[csgG_{per}]^9 - \delta_{-3}[csgG_{9}] - \delta_{4}[csgE_{9}][csgF_{ECM}]^2[csgG_{9}] + \delta_{-4}[csgGEF]$$
$$\frac{d[csgGEF]}{dt} = \delta_{4}[csgE_{9}][csgF_{ECM}]^2[csgG_{9}] - \delta_{-4}[csgGEF]$$
$$\frac{d[csgA_{per}]}{dt} = \gamma_{3}[Sec:csgA] - \delta_{5}[csgA_{per}][csgC_{per}] - \epsilon_{1}[csgA_{per}]^2 + \epsilon_{-1}[F_{per}] - \epsilon_{2}[F_{per}][csgA_{per}] + \epsilon_{-2}[F_{per}]$$
$$\frac{d[F_{per}]}{dt} = \epsilon_{1}[csgA_{per}]^2 - \epsilon_{-1}[F_{per}]$$
$$\frac{d[csgB_{per}]}{dt} = \gamma_{3}[Sec:csgB] - \delta_{5}[csgB_{per}][csgC_{per}]$$
$$\frac{d[csgC_{per}]}{dt} = \gamma_{3}[Sec:csgC] - \delta_{5}[csgA_{per}][csgC_{per}] - \delta_{5}[csgB_{per}][csgC_{per}] + D \frac{[csgC:csgA] - [csgA_{ECM}]}{\omega} S N_A + D \frac{[csgC:csgB] - [csgB_{ECM}]}{\omega} S N_A$$
$$\frac{d[csgC:csgA]}{dt} = \delta_{5}[csgA_{per}][csgC_{per}] - D \frac{[csgC:csgA] - [csgA_{ECM}]}{\omega} S N_A$$
$$\frac{d[csgC:csgB]}{dt} = \delta_{5}[csgB_{per}][csgC_{per}] - D \frac{[csgC:csgB] - [csgB_{ECM}]}{\omega} S N_A$$
$$\frac{d[csgE_{per}]}{dt} = \gamma_{3}[Sec:csgE]$$
$$\frac{d[csgF_{per}]}{dt} = \gamma_{3}[Sec:csgF] - \delta_{2}[csgF_{per}]$$
$$\frac{d[csgG_{per}]}{dt} = \gamma_{3}[Sec:csgG] - \delta_{3}[csgG_{per}]^9$$

Extracellular Secretion

$$\frac{d[csgF_{ECM}]}{dt} = D \frac{[csgF_{per}] - [csgF_{ECM}]}{\omega} S N_A$$
$$\frac{d[csgA_{ECM}]}{dt} = D \frac{[csgC:csgA] - [csgA_{ECM}]}{\omega} S N_A - \epsilon_{1}[csgA_{ECM}]^2 + \epsilon_{-1}[F_{ECM}] - \epsilon_{2}[F_{ECM}][csgA_{ECM}] + \epsilon_{-2}[F_{ECM}]$$
$$\frac{d[F_{ECM}]}{dt} = \epsilon_{1}[csgA_{ECM}]^2 - \epsilon_{-1}[F_{ECM}]$$
$$\frac{d[csgB_{ECM}]}{dt} = D \frac{[csgC:csgB] - [csgB_{ECM}]}{\omega} S N_A$$

7 Rate constants

Symbol | Definition | Value | Units | Reference
\(\alpha_1\) | Rate of transcription of csgA | 0.0921 | \(sec^{-1}\) | Proshkin et al. 2010
\(\alpha_2\) | Rate of transcription of csgBCEFG | 0.0214 | \(sec^{-1}\) | Proshkin et al. 2010
\(\beta_0\) | Basal protein translation rate | 0.1 | \(sec^{-1}\) | Proshkin et al. 2010
\(\gamma_1\) | Rate of binding of SecB to csgX | 0.25 | \(\mu M^{-1}sec^{-1}\) | Agarwal 2010
\(\gamma_{-1}\) | Rate of dissociation of SecB:csgX | 0.025 | \(sec^{-1}\) | Agarwal 2010
\(\gamma_2\) | Rate of formation of SecABYEG:csgX complex | 0.0085 | \(\mu M^{-1}sec^{-1}\) | Agarwal 2010
\(\gamma_3\) | Rate of translocation of csgX from cytoplasm to periplasm | 1.00 | \(sec^{-1}\) | Agarwal 2010
\(\delta_1\) | Rate of formation of \(csgE_9\) nonamer | 0.76 | \(\mu M^{-8}sec^{-1}\) | Lee et al. 2016
\(\delta_{-1}\) | Rate of dissociation of \(csgE_9\) nonamer | \(28 \times 10^{-3}\) | \(sec^{-1}\) | Lee et al. 2016
\(\delta_{3}\) | Rate of formation of \(csgG_9\) nonamer | 0.76 | \(\mu M^{-8}sec^{-1}\) | Lee et al. 2016
\(\delta_{-3}\) | Rate of dissociation of \(csgG_9\) nonamer | \(28 \times 10^{-3}\) | \(sec^{-1}\) | Lee et al. 2016
\(\delta_{4}\) | Rate of formation of csgGEF complex | 0.0384 | \(\mu M^{-1}sec^{-1}\) | Agarwal 2010
\(\delta_{-4}\) | Rate of dissociation of csgGEF complex | TBD | \(sec^{-1}\) | Agarwal 2010
\(\delta_{5}\) | Rate of formation of csgC:csgX complex | 0.25 | \(\mu M^{-1}sec^{-1}\) | Agarwal 2010
\(D\) | Diffusion coefficient | \(10 \times 10^{-9}\) | \(m^2s^{-1}\) | TBD
\(\omega\) | E. coli outer membrane thickness | \(10 \times 10^{-8}\) | \(m\) | TBD
\(S\) | E. coli outer membrane surface area | \(4.42 \times 10^{-12}\) | \(m^2\) | TBD

8 References

  1. Agarwal, Anup. Computational and experimental approaches to enhance extracellular secretion of recombinant proteins in Escherichia coli. Diss. University of Delaware, 2010.
  2. Lee, A. A., et al. "Dissecting the self-assembly kinetics of multimeric pore-forming toxins." Journal of The Royal Society Interface 13.114 (2016): 20150762.
  3. Proshkin, Sergey, et al. "Cooperation between translating ribosomes and RNA polymerase in transcription elongation." Science 328.5977 (2010): 504-508.
  4. Stewart, Philip S. "Diffusion in biofilms." Journal of bacteriology 185.5 (2003): 1485-1491.
  5. Taylor, Jonathan D., and Steve J. Matthews. "New insight into the molecular control of bacterial functional amyloids." Frontiers in cellular and infection microbiology 5 (2015).