Difference between revisions of "Team:Cologne-Duesseldorf/Model"

Line 6: Line 6:
 
<h1>Model</h1>
 
<h1>Model</h1>
 
<div id="ToC"></div>
 
<div id="ToC"></div>
 +
<h2>Summary</h2>
 +
<p>To enable full control over the peroxisomal protein import we predicted the structure for the yeast peroxisomal matrix import protein. In this structure we exchanged amino acids in the TPR regions using PyMol to create an orthogonal binding pocket. Using AMBER and VMD we predicted a binding pocket which binds a non wild type but not the wild type targeting signal.
 +
<br>
 +
To optimize our proof-of-concept project we modeled the <a href="#MetabolicModel">Nootkatone pathway</a> using python and showed the advantages of  cellular compartmentalization. Finally we were able to create a yeast strain optimized for the production of the pathway precursor by using <a href="#OptKnock">OptFlux analysis</a>. </p>
 +
 
<h2>Structural model</h2>
 
<h2>Structural model</h2>
 
<h3>Infrastructure and modelling software</h3>
 
<h3>Infrastructure and modelling software</h3>
Line 59: Line 64:
  
  
<h2>Nootkatone Metabolic Model</h2>
+
<h2 id="MetabolicModel">Nootkatone Metabolic Model</h2>
 
<h3>Overview</h3>
 
<h3>Overview</h3>
 
<p>In the following we present our model of the nootkatone biosynthesis pathway. We modeled the system in order to understand how to maximize nootkatone production.  We start with an oversimplified model to give you a sense for the behavior of the enzymes in the pathway. Then we continue with a model introducing a function penalizing high concentrations of toxic products. As the toxicity is one of two main culprits of nootkatone production, we further show that the production inside a peroxisome can greatly increase nootkatone yield. Afterwards we discuss possible ways to overcome the shortage of the cellular FPP pool.</p>
 
<p>In the following we present our model of the nootkatone biosynthesis pathway. We modeled the system in order to understand how to maximize nootkatone production.  We start with an oversimplified model to give you a sense for the behavior of the enzymes in the pathway. Then we continue with a model introducing a function penalizing high concentrations of toxic products. As the toxicity is one of two main culprits of nootkatone production, we further show that the production inside a peroxisome can greatly increase nootkatone yield. Afterwards we discuss possible ways to overcome the shortage of the cellular FPP pool.</p>
Line 500: Line 505:
 
<p>$$\text{and lower bound } \leq v \leq \text{upper bound}$$</p>
 
<p>$$\text{and lower bound } \leq v \leq \text{upper bound}$$</p>
 
<p>This is a linear optimization which due to the convex solution space can be calculated efficiently with linear programming.</p>
 
<p>This is a linear optimization which due to the convex solution space can be calculated efficiently with linear programming.</p>
<h4>OptKnock</h4>
+
<h4 id="OptKnock">OptKnock</h4>
 
<p>Metabolic engineering faces two big problems. Firstly the precursor molecule for the wanted reaction can only be found in trace amounts in the cell, leading to minor yields of the wanted product. Secondly adding pathways to the metabolism of an organism increases the metabolic stress on that organism. Thus if the cell loses the ability to produce the desired compound it can overgrow the other cells in a large-scale bioreactor setting. OptKnock (<a href="http://onlinelibrary.wiley.com/doi/10.1002/bit.10803/"><abbr title="Anthony P. Burgard 2003 - Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization">Anthony P. Burgard 2003</abbr></a>) is a two step optimization in which biomass production is coupled to the production of a desired product by calculating a set of gene knockouts, solving both problems above.In principle the algorithm works like this:</p>
 
<p>Metabolic engineering faces two big problems. Firstly the precursor molecule for the wanted reaction can only be found in trace amounts in the cell, leading to minor yields of the wanted product. Secondly adding pathways to the metabolism of an organism increases the metabolic stress on that organism. Thus if the cell loses the ability to produce the desired compound it can overgrow the other cells in a large-scale bioreactor setting. OptKnock (<a href="http://onlinelibrary.wiley.com/doi/10.1002/bit.10803/"><abbr title="Anthony P. Burgard 2003 - Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization">Anthony P. Burgard 2003</abbr></a>) is a two step optimization in which biomass production is coupled to the production of a desired product by calculating a set of gene knockouts, solving both problems above.In principle the algorithm works like this:</p>
 
<p>
 
<p>
Line 535: Line 540:
 
</div>
 
</div>
 
<h3>Results</h3>
 
<h3>Results</h3>
<p>...</p>
+
<p>We analyzed the FPP pool by using Geranyltranstransferase (GRTT) as the reaction to be optimized. GRTT takes geranyl diphosphate and isopentenyl diphosphate and converts them to farnesyl pyrophosphate and diphosphate. The two results below show the maximal GRTT production with a minimal biomass yield of 0.1 and 0.2 respectively. </p>
 
<img class="max-width"  src="https://static.igem.org/mediawiki/2017/c/cd/T--Cologne-Duesseldorf--OptKnock.svg">
 
<img class="max-width"  src="https://static.igem.org/mediawiki/2017/c/cd/T--Cologne-Duesseldorf--OptKnock.svg">
 
<img class="max-width"  src="https://static.igem.org/mediawiki/2017/0/02/T--Cologne-Duesseldorf--OptKnock-Biomass-Up.svg">
 
<img class="max-width"  src="https://static.igem.org/mediawiki/2017/0/02/T--Cologne-Duesseldorf--OptKnock-Biomass-Up.svg">
<p>...</p>
+
<p>Compared to the wild type FPP flux (0.0013) the maximal fluxes that could be achieved with four knockouts were 0.61
 +
and 0.29, respective to the minimal biomass yield of 0.1 and 0.2. This roughly corresponds to a 480- and 230-fold increase in FPP flux. While we do not think that these results are close to the actual biological truth, as a 230- or 480-fold increase is rather extreme, we did notice an accumulation of knockouts in genes related to the pentose phosphate pathway (TKT2 = Transketolase,  G6PDH2r = Glucose-6-phosphate dehydrogenase) or citric acid cycle (ALATA_L =  L-alanine transaminase). Further reactions like ADK3 (Adentylate kinase (GTP)), which is indirectly involved in the TCA by supplying GDP or IPPSm (2 isopropylmalate synthase mitochondrial) that takes acetyl-CoA as a substrate lead us to the conclusion that the cellular FPP pool can be increased by shifting to a fatty acid metabolism. This is further solidified by the fact that the FPP precursor isopentenyl diphosphate ultimately is build from acetyl-CoA <abbr title="2017, Sebastián Rubat - Increasing the intracellular isoprenoid pool in Saccharomyces cerevisiae by structural fine-tuning of a bifunctional farnesyl diphosphate synthase">Sebastián Rubat 2017</abbr>.</p>
 +
 
 +
 
 +
<!-- reactions:
 +
Transporter:
 +
ERGSTt: Ergosterol reversible transport (knockout more ergosterol in er?)
 +
SUCFUMtm: Succinate fumarate transport mitochondrial (knockout leads to more fumarat in mitochondria and more succinat in cytosol)
 +
ACt2r: Acetate reversible transport via proton symport
 +
ATPtm_H: ADPATP transporter mitochondrial
 +
 
 +
 
 +
SLFAT: Sulfate adenylyltransferase ADP (ADP+sulfat <> phosphat + adenylyl sulfate); Sulfur uptake?
 +
RDNR4: Ribonucleoside-diphosphate reductase (UDP): formation of deoxyribonucleotides for DNA synthesis???
 +
[Thioredoxin_red + UDP <>Thioredoxin_ox + H2O + DUDP
 +
 
 +
 
 +
CTPS1: CTP synthase NH3: nucleotide biosynthesis
 +
[ATP + NH4 + UTP <> ADP + CTP + H+ +Pi]
 +
 
 +
IPPSm: 2 isopropylmalate synthase mitochondrial, pyruvate metabolism, needs acetyl-coa; Leucin biosynthesis
 +
[3-Methyl-2-oxobutanoate + Acetyl-CoA + H2O <> 3-Carboxy-3-hydroxy-4-methylpentanoate + CoA + H+]
 +
 
 +
 
 +
 
 +
 
 +
Sugar stuff:
 +
TKT2: Transketolase (pentose phosphate pathway / calvin cycle)
 +
G6PDH2r: Glucose-6-phosphate dehydrogenase (pentose phosphate pathway)
 +
ALATA_L:  L-alanine transaminase: 2-oxoglutarate (TCA) + L-Alanine <> L-Glutamate + Pyruvate
 +
 
 +
ADK3: amp+gtp <> adp+gdp
 +
 
 +
 
 +
-Nkl run-
 +
PYRDC
 +
ALCD2x_copy2
 +
MDH
 +
 
 +
-->
 +
 
 +
 
 
</article>
 
</article>
 
</body>
 
</body>

Revision as of 14:17, 27 October 2017

Model

Summary

To enable full control over the peroxisomal protein import we predicted the structure for the yeast peroxisomal matrix import protein. In this structure we exchanged amino acids in the TPR regions using PyMol to create an orthogonal binding pocket. Using AMBER and VMD we predicted a binding pocket which binds a non wild type but not the wild type targeting signal.
To optimize our proof-of-concept project we modeled the Nootkatone pathway using python and showed the advantages of cellular compartmentalization. Finally we were able to create a yeast strain optimized for the production of the pathway precursor by using OptFlux analysis.

Structural model

Infrastructure and modelling software

For the targeted mutagenesis of PEX5 we used molecular dynamic techniques. Professor Gohlke of the University of Duesseldorf kindly provided us with the necessary infrastructure and advisors to perform our calculations.
Molecular dynamics are a tool simulate the cells environment, regarding its temperature and ion contents, to calculate protein-protein interactions. We simulated the movement of our peroxisomal targeting signal (PTS) within the binding pocket of the PEX5 receptor.
The molecular dynamics software AMBER (Assisted Model Building with Energy Refinement) uses so called force fields to calculate the movement of each atom. For that we start with an equilibrated state which means that the PTS and every other molecule sit at the most favorable position. After equilibration we proceed with the actual simulation. The simulations only differ in the temperature of the environment which is more or less 300 K in each simulation. The whole workflow is depicted in the following graphic.

Workflow for the molecular dynamics simulations

Three dimensional structure of the yeast PEX5 receptor

Currently there is no crystal structure of the yeasts PEX5 and due to that we had to predict its structure. To achieve that, we used a program provided by Daniel Mulnaes that uses existing files of the pdb database to create the structural prediction of the input sequence. One of those is the crystal structure of the TPR domains of the human PEX5 receptor interacting with the PTS1 (PDB-entry: 1FCH).

PyMol's mutagenesis wizard'

PyMol provides a built-in function called 'mutagenesis wizard' making it possible to introduce site-directed mutations in the 3D structure. With this tool we exchanged amino acids in the TPR regions to create beneficial interactions between receptor and targeting signal. Our new receptor should only recognize our new peptide and not the wild-type PTS1 and the wild-type receptor should not recognize the new designed PTS. In order to achieve this we calculated vacuum electrostatics of the receptor and the peptide, which can be seen in the following figure.

TPR domains of the yeasts PEX5 receptor.
Calculated vacuum electrostatics of the PEX5 receptor with the PTS.

This visualization shows the amino acid properties, so one can detect for example polar or hydrophobic regions and furthermore Pymol provides a feature to identify interacting amino acids. With this knowledge one can set the mutations in an educated way to increase the possibility that the mutagenesis leads to the desired outcome. When we designed a possible new protein-protein interaction complex, we proceeded with molecular dynamics.

Equilibration and molecular dynamics simulation

Once the mutations were set and the peptide was build, we equilibrated and simulated our experiment. After a couple of days the simulations were finished and we could start with the data evaluation.

Graphical visualization

VMD (Visual Molecular Dynamics) uses the coordinate files generated by AMBER to generate a graphical illustration of the simulation. This already gives a hint how promising the results are but for scientific evaluation we proceeded with numerical evaluation.

Diffusion

To get numerical data out of the coordinate files, we then calculated the diffusion of the PTS from its starting point in the equilibrated state. A low value indicates stable binding while a high value indicates unfavorable interactions. After calculation we used R to generate plots that the experiments in relation with its controls. Our whole setup contains the following simulations.

  • our receptor − our PTS
  • our receptor − wt PTS
  • wt receptor − our PTS
  • wt receptor − wt PTS

Nootkatone Metabolic Model

Overview

In the following we present our model of the nootkatone biosynthesis pathway. We modeled the system in order to understand how to maximize nootkatone production. We start with an oversimplified model to give you a sense for the behavior of the enzymes in the pathway. Then we continue with a model introducing a function penalizing high concentrations of toxic products. As the toxicity is one of two main culprits of nootkatone production, we further show that the production inside a peroxisome can greatly increase nootkatone yield. Afterwards we discuss possible ways to overcome the shortage of the cellular FPP pool.

Basic System

The basic reactions of the nootkatone pathway are the following:

This already implies using an alcohol dehydrogenase with similar behaviour to ADH-21 used by Sebastian Schulz (2015). This ADH on the one hand is not stereospecific, in this case meaning that it can convert both cis and trans-nootkatol. On the other hand it favours the forward reaction, thus increasing the $\frac{\text{nootkatone}}{\text{nootkatol}}$ ratio. However due to the lack of stereospecificity we modeled the pathway using only one "nootkatol" species.

We assumed Michaelis-Menten kinetics for each reaction, with the last step being reversible.

Michaelis-Menten kinetics

$$\frac{dP}{dt} = \frac{V_{Max} \cdot c_{S}}{K_{M} + c_{S}}$$

Reversible Michaelis-Menten kinetics

$$\frac{dP}{dt} = \frac{\frac{V_{M+} \cdot c_{S}}{K_{M+}} - \frac{V_{M-} \cdot c_{P}}{K_{M-}}}{1 + \frac{c_{S}}{K_{M+}} + \frac{c_{P}}{K_{M-}}}$$

We further assumed a permanent FPP production, which we fitted in the Penalty model.

This gives us the following system of differential equations.

$$\frac{dFPP}{dt} = v_{\text{FPP in}} - \frac{V_{Max,ValS} \cdot c_{FPP}}{K_{M, ValS} + c_{FPP}} $$

$$\frac{dValencene}{dt} = \frac{V_{Max,ValS} \cdot c_{FPP}}{K_{M, ValS} + c_{FPP}} -\frac{V_{Max,\text{P450 BM3}} \cdot c_{Valencene}}{K_{M, \text{P450 BM3}} + c_{Valencene}}$$

$$\frac{dNootkatol}{dt} = \frac{V_{Max,\text{P450 BM3}} \cdot c_{Valencene}}{K_{M, \text{P450 BM3}} + c_{Valencene}} - \frac{\frac{V_{M,ADH+} \cdot c_{Nootkatol}}{K_{M,ADH+}} - \frac{V_{M,ADH-} \cdot c_{Nootkatone}}{K_{M,ADH-}}}{1 + \frac{c_{Nootkatol}}{K_{M,ADH+}} + \frac{c_{Nootkatone}}{K_{M,ADH-}}}$$

$$\frac{dNootkatone}{dt} = \frac{\frac{V_{M,ADH+} \cdot c_{Nootkatol}}{K_{M,ADH+}} - \frac{V_{M,ADH-} \cdot c_{Nootkatone}}{K_{M,ADH-}}}{1 + \frac{c_{Nootkatol}}{K_{M,ADH+}} + \frac{c_{Nootkatone}}{K_{M,ADH-}}}$$

Since the main cofactor is regenerated inside the pathway we chose to not model the cofactors.

Parameters

Our crude first assumption was a constant enzyme concentration of $1 \ \text{µM}$. We mostly used kinetic parameters from the Brenda database. If S. cerevisiae was not available as the host we used the nearest relative, if the reaction was not available we used the kinetics for the respective co-substrate as an upper boundary. Another assumption we made is a five-fold reduction in the speed of the reversible reaction of the ADH-21, based on the work by Sebastian Schulz (2015), that the forward reaction is favored.

Parameter Value Source
$v_{\text{FPP in}}$ $4.18E-9 \frac{1}{\text{s}}$ Penalty model
$\text{kM}_{\text{Valencene Synthase}}$ $1.04 \text{ µM}$ Brenda
$\text{kM}_{\text{P450 BM3}}$ $126 \text{ µM}$ Brenda
$\text{kM}_{\text{ADH-21}}$ $161 \text{ µM}$ Schulz 2015
$\text{kcat}_{\text{Valencene Synthase}}$ $0.0032 \frac{1}{\text{s}}$ Brenda
$\text{kcat}_{\text{P450 BM3}}$ $2.619 \frac{1}{\text{s}}$ Brenda
$\text{kcat}_{\text{ADH-21 forward}}$ $6 \frac{1}{\text{s}}$ Sebastian Schulz (2015)
$\text{kcat}_{\text{ADH-21 backward}}$ $1.2 \frac{1}{\text{s}}$ $\frac{\text{kcat}_{\text{ADH-21 forward}}}{5}$
$c_{\text{Enzyme}}$ $1 \text{ µM}$ Assumption

Simple model

In order to simulate the system of differential equations we set up a few python files. One file containing a model class which consists of the parameters, the models and the integration function. One file containing all the kinetics used and finally one which we used to actually run the simulations. We wanted to show some example code on how to set up such a project, while the code size was still manageable, in order to help new iGEM teams that might need help with their modeling.

Kinetics File

		  
def ConstantInflux(substrate,speed):
    return(speed)

def MichaelisMenten(
        substrate,enzymeConcentration,kcat,km):
    return(
            (enzymeConcentration*kcat*substrate)
            /(km+substrate))

def MichaelisMentenReversible(
        substrate,product,enzymeConcentration,
        kcats,kcatp,kms,kmp):
    return (
            (
                    ((enzymeConcentration*kcats*substrate)/kms)
                    -((enzymeConcentration*kcatp*product)/kmp)
                    )
                    /(1+(substrate/kms)+(product/kmp)
                    ))
		  
		

Model File

		  
import kinetics as k
import numpy as np
import scipy.integrate

class model(object):
    def __init__(self):
        self.c_ValS = 1E-6 #[M]
        self.c_ADH = 1E-6 #[M]
        self.c_P450 = 1E-6 #[M]

        self.km_ValS = 1.04E-6  #[M]
        self.km_P450 = 126E-6 #[M]
        self.km_ADH = 161E-6 #[M]

        self.kcat_ValS = 0.0032*3600 #[1/s]
        self.kcat_P450 = 6*3600 #[1/s]
        self.kcat_ADH = 2.619*3600 #[1/s]
        self.k_influxSpeed = 1.03562761862e-05 #[1/s]


    def FPPInflux(self,FPP):
        return(
                k.ConstantInflux(
                        FPP,
                        self.k_influxSpeed)
                )

    def ValenceneSynthase(self,FPP):
        return(
                k.MichaelisMenten(
                        FPP,
                        self.c_ValS,
                        self.kcat_ValS,
                        self.km_ValS)
                )
    def P450(self,Valencene):
        return(
                k.MichaelisMenten(
                        Valencene,
                        self.c_P450,
                        self.kcat_P450,
                        self.km_P450)
                )
    def ADH(self,nootkatol,nootkatone):
        return (
                k.MichaelisMentenReversible(
                        nootkatol,
                        nootkatone,
                        self.c_ADH,
                        self.kcat_ADH,
                        self.kcat_ADHrev,
                        self.km_ADH,
                        self.km_ADH)
                )

    def P450_Model(self,time,initvalues):
        FPP,Val,Nkl,Nkn = initvalues
        dFPPdt = self.FPPInflux(FPP) - self.ValenceneSynthase(FPP)
        dValdt = self.ValenceneSynthase(FPP) - self.P450(Val)
        dNkldt = self.P450(Val) - self.ADH(Nkl,Nkn)
        dNtkdt = self.ADH(Nkl,Nkn)
        return [dFPPdt,dValdt,dNkldt,dNtkdt]


    def Integration(self,time,initvalues,model):
        Y = np.zeros([len(time),len(initvalues)])
        Y[0] = np.transpose(initvalues)
        r = scipy.integrate.ode(model)
        r.set_integrator('lsoda')
        r.set_initial_value(initvalues)
        cnt=1
        while r.successful() and cnt < len(time):
            Y[cnt,:] = r.integrate(time[cnt])
            cnt+=1
        return Y
		  
		

Simulation file

		  
from model import model
import numpy as np
import matplotlib.pyplot as plt

m = model()
time = np.linspace(0,1,1000)
data = m.Integration(time,[0,0,0,0],m.P450_Model)
plt.plot(time,data[:,0],label=(
        r"FPP, $k_{FPP}$ = "
        +"{:.2e}".format(m.k_influxSpeed)
        +r" $\frac{1}{h}$"
        ))
plt.plot(time,data[:,1],label=(
        r"Valencene, $c_{ValS}$ = "
        +"{:.0e}".format(m.c_ValS)
        +r" $\frac{mol}{L}$"
        ))
plt.plot(time,data[:,2],label=(
        r"nootkatol, $c_{P450}$ = "
        +"{:.0e}".format(m.c_P450)
        +r" $\frac{mol}{L}$"
        ))
plt.plot(time,data[:,3],label=(
        r"nootkatone, $c_{ADH-21}$ = "
        +"{:.0e}".format(m.c_ADH)
        +r" $\frac{mol}{L}$"
        ))
plt.yscale("log")
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=2)
plt.title("First Simulation")
plt.xlabel(r"Time in $[h]$")
plt.ylabel(r"Yield in $\frac{mol}{cell}$")
plt.show()
		  
		

The results of our first simulation are the following:

What is visible in this first simulation is that the FPP import is faster than Valencene Synthase since FPP is agglomerating. However Valencene synthase, P450 BM3 and ADH are all saturated around 20 minutes, since the valencene concentration becomes stationary and nootkatol and nootkatone levels keep rising proportionally, due to the back-reaction of ADH. This is why choosing an ADH favouring the forward reaction increases the yield while keeping the metabolic stress on the cells low.

Bioreactor model

In order to check the validity of our model we took the results of Tamara Wriessnegger 2014 as a point for reference. Wriessnegger achieved $208 \ \frac{\text{mg}}{\text{L}}$ nootkatone production after $108 \text{ 0}h$ in an in-vitro production in Pichia pastoris. We therefore had to scale up our single cell model to a bioreactor simulation.

In order to do this we first calculated the number of product molecules per cell.

$$\frac{\text{N}_{\text{Product}}}{\text{cell}} \ [\text{mol}]= c_{Product} [\frac{\text{mol}}{\text{L}}]\cdot V_{\text{Yeast}} \ [\text{L}]$$

Then we calculated the number of cells per Liter bioreactor.

$$ \text{cells per Liter} \ [\frac{\text{1}}{\text{L}}] = \frac{\rho_{\text{Cells in Bioreactor}} \ [\frac{\text{g d.w.}}{\text{L}}] }{m_{\text{Yeast dry weight}} \ [\text{g d.w.}] } $$

With these two equations we could scale up the production of a single cell for each product.

$$\text{Yield}_{\text{Bioreactor}} \ [\frac{\text{mg}}{\text{L}}] = \frac{\text{N}_{\text{Product}}}{\text{cell}} \ [\text{mol}] \cdot \text{cells per Liter} \ [\frac{\text{1}}{\text{L}}] \cdot \text{M}_{\text{Product}} \ [\frac{\text{g}}{\text{mol}}] \cdot 1000$$

We used the following constants for the calculations above.

Parameter Value Source
$\rho_{Bioreactor}$ $200 \ \frac{\text{g d.w.}}{\text{L}}$ Biomedcentral
$m_{\text{Yeast dry weight}}$ $17.5E-12 \text{ g}$ Kirschner Lab
$V_{\text{Yeast}}$ $6E-14 \text{ L}$ Philips 2012
$M_{\text{FPP}}$ $382.33 \ \frac{\text{g}}{\text{mol}}$ Pubchem
$M_{\text{Valencene}}$ $204.36 \ \frac{\text{g}}{\text{mol}}$ Pubchem
$M_{\text{Nootkatol}}$ $220.35 \ \frac{\text{g}}{\text{mol}}$ Pubchem
$M_{\text{Nootkatone}}$ $218.34 \ \frac{\text{g}}{\text{mol}}$ Pubchem

Bioreactor simulation

Our model predicts a nootkatone Yield of $154.24 \ \frac{\text{mg}}{\text{L}}$ and a nootkatol Yield of $31.06 \ \frac{\text{mg}}{\text{L}}$. Wriessnegger produced $208 \ \frac{\text{mg}}{\text{L}}$ nootkatone and $44 \ \frac{\text{mg}}{\text{L}}$ nootkatol. This showed us that our assumption of a five-fold reduction in the backwards reaction of alcohol dehydrogenase is valid, as our $\frac{\text{nootkatone}}{\text{nootkatol}}$ ratio was 4.96, which is quite close to the 4.72 by Wriessnegger. We then took a look at how differently expressing the enzymes would have an influence on nootkatone production.

Expression analysis

The first figure shows the nootkatone yield based on a varied enzyme concentration of all enzymes, while the other enzyme concentrations were kept constant at $1 \text{ µM}$. The second plot shows varied enzyme concentrations for all enzymes at the same time. The smallest concentration with almost maximal nootkatone production is at about $2 \text{ µM}$ Valencene Synthase, when not over increasing the other enzymes. Therefore we chose to double the expression of Valencene Synthase to ensure that our pathway is saturated.

This increased our yield above the results by Wriessnegger. But our model still lacks the important factor of product toxicity at high concentration, which is why we did not fit the FPP import to this, but to the next model.

Toxicity

According to Carole Gavira 2013 both nootkatone and nootkatol exhibit toxicity at high levels. At $100 \ \frac{\text{mg}}{\text{L}}$ nootkatol the cell viability was decreased to 55%, at $100 \ \frac{\text{mg}}{\text{L}}$ nootkatone to 80%. For both substances all cells were dead around $1000 \ \frac{\text{mg}}{\text{L}}$. We used hill kinetics to fit this behaviour.

$$\frac{K_{Tox}^n}{K_{Tox}^n + [S]^n}$$

We fitted the data of Gavira using scipy.optimize.brent and multiple factors n. This is due to the fact that n is only valid in integer values and scipy does not support mixed-integer optimizations. We further chose to only implement nootkatone toxicity, even if it is the less toxic compound, as it is concentrated approximately five times higher than nootkatol in the cell. This also helped to keep our model simple.

$\text{n}$ $K_{\text{Toxic nootkatol}}$ Error
2 200 0.136
3 158.74 0.104
4 141.42 0.100
5 131.95 0.100
6 125.99 0.100

Even though the error produced by functions with a higher n was decreasing we chose to use n = 3 to prevent overfitting, since n = 3 already fit both points quite well.

We used this penalty function only on our FPP import. The idea behind that was that since the whole pathway depends linearly on the FPP influx it can act representative for the cell viability. This turned out to show some problems later in the model but we chose to keep this assumption, as the problems can be addressed by proper interpretation of the data and because we wanted to keep our model simple.

This changes our system in the following way.

$$\frac{dFPP}{dt} = v_{\text{FPP in}} \cdot \frac{K_{\text{Tox Nkn}}^3}{K_{\text{Tox Nkn}}^3 + [\text{Nkn}]^3} - \frac{V_{Max,ValS} \cdot c_{FPP}}{K_{M, ValS} + c_{FPP}} $$

Penalty model

Dynamics

This is the model where we actually fitted our import parameter to the data supplied by Wriessnegger, as we assumed it to be sufficiently close to the actual biological reality. We did this by using scipy.optimize.brent to minimize the error function $\text{abs}(208 - [S]_{\text{nootkatol}})$. This gave us an import rate of $6.55E-9 \frac{1}{\text{s}}$, which we then also used in all other models, including the ones shown previously.

The nootkatone yield is approximately at the results of Wriessnegger and we can see that the FPP import declines with increasing nootkatone production. The penalty function can be interpreted as the fraction of alive cells, which at the final point of the simulation is $52.16 \ \%$. This huge number already gives us an understanding that nootkatol and nootkatone production limit themselves in cytosolic production, thus showing us the first culprit of nootkatone production.

Expression

We wanted to check whether over- or underexpression of any of the enzymes could improve our nootkatone yield under penalty conditions.

Interestingly the peak nootkatone yield is now under submaximal Valencene Synthase expression. This is possibly due to the increased toxicity. In order to check this we further increased the FPP import tenfold over the fitted value.

Increased FPP import dynamics

The maximal yield under increased FPP import conditions is increased, but at the end of the simulation only around $11 \ \%$ of the cell population is alive. This is the problem of the model that we talked earlier about. In this case the early FPP import so fast compared to the enzymatic reactions so that FPP accumulates. Since only the import is penalized, but not the other reactions, all that follows is an simulation of dead cells still transforming an large FPP pool. This is obviously biologically wrong and an artifact of the way the penalty is set up. The effects of this error can be seen in the variation of single enzyme concentrations in the plots below.

Increased FPP import expression

Increased FPP import and enzyme concentration dynamics

Even with this submaximal production only $4 \ \%$ of cells are alive at the end of the simulation. So even if the results of the production with a tenfold increased FPP influx have to be interpreted carefully, the general takeaway of the simulation is that given a sufficiently high FPP influx, nootkatone toxicity is the main remaining culprit of nootkatone production. This is where our nootkatone subproject comes in. By transferring the enzymes needed for nootkatone production into the peroxisome we assume that the products cannot diffuse back into the cell and therefore do not impose any toxicity on the cells. This can be seen by the great increase in nootkatone production in the models below.

Peroxisome model

Since peroxisomes vary greatly in size we first assumed them to be as large as the yeast cells. While this assumption is clearly wrong, we later show the minimal size needed under specific conditions to outperform cytosolic production. Since we assume no nootkatone and nootkatol toxicity inside the peroxisome, the general reaction dynamics are equal to the simple bioreactor model, but with overexpressed Valencene Synthase. We first modeled the maximal nootkatone yield with varying enzyme concentrations.

Expression

What can be seen here as before is that the maximal nootkatone yield is determined by the FPP import and that Valencene Synthase is the velocity dictating enzyme. Even at a high expression of Valencene Synthase increasing either P450 BM3 or ADH over $1E-8 \ \frac{\text{mol}}{\text{L}}$ does not further increase the nootkatone yield, as the two enzymes are faster than Valencene Synthase. To further increase nootkatone yield the FPP import has to be increased. In this case the nootkatone yield can be increased linearly, as shown below.

Increased FPP import dynamics

Increased FPP import expression

Increased FPP increased enzyme concentration dynamics

Conclusion

Model Nootkatone Nootkatol
Bioreactor 154.24* 31.06*
Bioreactor ValS Up 302.87* 61.36*
Penalty 206.10 41.40
Penalty FPP Up 309.62** 62.73**
Penalty FPP + Enzymes Up 456.93** 91.43**
Peroxisome Enzymes Up 317.83 64.65
Peroxisome FPP Up 309.63 62.73
Peroxisome FPP + Enzymes Up 3082.86 624.14

As can be seen from our simulations, the two main culprits of nootkatone production are the toxicity of the products at high concentrations and the FPP influx. The toxicity can be avoided by transferring the pathway into the peroxisome, which is one of the projects of our team. To address the problem of FPP influx our team did an opt-knock FBA analysis to show how S. cerevisiae cells can be engineered to stably produce more FPP. Note that the values marked with an asterisk were produced by an incomplete model and values marked with two asterisks are assumed to be false due to the model setup as explained in the sections above.

Further thoughts

A false assumption in the peroxisomal models above is that the size of the peroxisome is equal to that of the yeast cell. This is obviously not correct. We wanted to get a feeling how large the peroxisome actually has to be in order for production to be superior to cytosolic production. We therefore took the the results of the cytosolic model with increased FPP intake and enzyme concentration and then calculated the yield of the corresponding peroxisomal model with varying peroxisome size.

We calculated the volume for equal production

$$V_{\text{Perox. Equal}} = \frac{\text{Yield}_{\text{Cyt.}} \cdot V_{\text{Yeast}}}{\text{Yield}_{\text{Perox.}}}$$

to be $1.24E-14 \ \text{L}$, so approximately a fifth of the cell volume. Since in methylotrophic yeasts such as Pichia pastoris peroxisomes can take up to 80% of the total cell volume (Gleeson ), a fifth of the cell volume is very achievable. To further increase the possible yield our team is working on a Pex11-knockout that aims to increase peroxisome size

OptKnock Analysis

Background

Constraint based modeling

Constraint based modeling is a mechanistically method to simulate complex metabolic networks under the critical assumption that the network is in a steady state. It requires a reconstructed metabolic network, in which all reactions have been characterized stoichiometrically. The network is then represented by the stoichiometric matrix $S$, in which each column corresponds to a reaction and each row to a metabolite. The reactions are further constrained only allowing positive fluxes $v_i \geq 0$ (elementary modes) and by putting upper and lower bounds on the flux (turnover rate). This reduces the solution space into a convex polyhedral cone, in which every point is a non-negative combination of elementary modes and every edge a unique elementary mode. As metabolic networks typically have more reactions than metabolites, the systems are underdetermined and therefore the flux distribution is not definitive. To find the actual biological solution additional constraints have to be added, for example an objective function that is to be optimized.

Flux Balance Analysis

A sensible objective function should in our case describe the main driver behind evolution - natural selection. This can be accomplished by defining a biomass production term $Z = c^Tv$, which consists of several biomass precursor reactions merged into one unit. Therefore the optimization finds the flux distribution with which maximal biomass production is accomplished, which is in accordance to natural selection.

In matrix formalism this can be written as $Sv = 0$, where $S$ is the stoichiometric matrix and $v$ the vector of fluxes that is to be solved.$c$ is the vector that defines the weights of each flux in the objective function. The canonical form of a Flux Balance Analysis thus can be written as

$$\text{maximize } c^Tv$$

$$\text{subject to } Sv = 0$$

$$\text{and lower bound } \leq v \leq \text{upper bound}$$

This is a linear optimization which due to the convex solution space can be calculated efficiently with linear programming.

OptKnock

Metabolic engineering faces two big problems. Firstly the precursor molecule for the wanted reaction can only be found in trace amounts in the cell, leading to minor yields of the wanted product. Secondly adding pathways to the metabolism of an organism increases the metabolic stress on that organism. Thus if the cell loses the ability to produce the desired compound it can overgrow the other cells in a large-scale bioreactor setting. OptKnock (Anthony P. Burgard 2003) is a two step optimization in which biomass production is coupled to the production of a desired product by calculating a set of gene knockouts, solving both problems above.In principle the algorithm works like this:

$$ \left[ \begin{array}{ll} \text{Maximize Product} \\ \text{(over knockouts)} \\ \text{Subject to} & \left[ \begin{array}{ll} \text{Maximize biomass} \\ \text{(over fluxes)} \\ \text{Subject to} & \left[ \begin{array}{l} \text{Fixed substrate uptake rate} \\ \text{Network Stoichiometry} \\ \text{Knockouts identified by outer problem} \\ \end{array} \right] \\ \end{array} \right] \\ \text{Minimum biomass yield} \\ \text{N}_{\text{Knockouts}} \leq \text{ limit} \end{array} \right] $$

We visualized the concept in the following two pictures. The important point here is to notice that the maximal biomass flux in the OptKnock model at the same time corresponds to a high product flux.

Results

We analyzed the FPP pool by using Geranyltranstransferase (GRTT) as the reaction to be optimized. GRTT takes geranyl diphosphate and isopentenyl diphosphate and converts them to farnesyl pyrophosphate and diphosphate. The two results below show the maximal GRTT production with a minimal biomass yield of 0.1 and 0.2 respectively.

Compared to the wild type FPP flux (0.0013) the maximal fluxes that could be achieved with four knockouts were 0.61 and 0.29, respective to the minimal biomass yield of 0.1 and 0.2. This roughly corresponds to a 480- and 230-fold increase in FPP flux. While we do not think that these results are close to the actual biological truth, as a 230- or 480-fold increase is rather extreme, we did notice an accumulation of knockouts in genes related to the pentose phosphate pathway (TKT2 = Transketolase, G6PDH2r = Glucose-6-phosphate dehydrogenase) or citric acid cycle (ALATA_L = L-alanine transaminase). Further reactions like ADK3 (Adentylate kinase (GTP)), which is indirectly involved in the TCA by supplying GDP or IPPSm (2 isopropylmalate synthase mitochondrial) that takes acetyl-CoA as a substrate lead us to the conclusion that the cellular FPP pool can be increased by shifting to a fatty acid metabolism. This is further solidified by the fact that the FPP precursor isopentenyl diphosphate ultimately is build from acetyl-CoA Sebastián Rubat 2017.