Team:Cologne-Duesseldorf/Model

Model

Summary

To enable full control over the peroxisomal protein import we predicted the structure for the yeast peroxisomal matrix import protein. In this structure we exchanged amino acids in the TPR regions using PyMol to create an orthogonal binding pocket. Using AMBER and VMD we predicted a binding pocket which binds a non wild type but not the wild type targeting signal.
To optimize our proof-of-concept project we modeled the Nootkatone pathway using python and showed the advantages of cellular compartmentalization. Finally we were able to create a yeast strain optimized for the production of the pathway precursor by using OptFlux analysis.

Infrastructure and modelling software

For the targeted mutagenesis of PEX5 we used molecular dynamic techniques. Professor Gohlke of the University of Duesseldorf kindly provided us with the necessary infrastructure (e.g. server and software access) and advisors to perform our calculations.
Molecular dynamics are a tool simulate the cells environment, regarding its temperature and ion contents, to calculate protein−protein interactions. We simulated the movement of our peroxisomal targeting signal (PTS) within the binding pocket of the PEX5 receptor.
The molecular dynamics software AMBER (Assisted Model Building with Energy Refinement) uses so called force fields to calculate the movement of each atom. For that we start with an equilibrated state which means that the PTS and every other molecule sit at the most favorable position. After equilibration we proceed with the actual simulation. The simulations only differ in the temperature of the environment which is more or less 300 K in each simulation. The whole workflow is depicted in the following graphic.

Workflow for the molecular dynamics simulations

As one can see, we start with a targeted mutagenesis for our PEX5 receptor, then proceed with the actual simulation to get finally to data evaluation. This is the point where we either decide to reject this receptor variant or to let it synthesize for experiments in the lab.

Three dimensional structure of the yeast PEX5 receptor

Currently there is no crystal structure of the yeasts PEX5 and due to that we had to predict its structure. To achieve that, we used a program provided by Daniel Mulnaes that uses existing files of the pdb database to create the structural prediction of the input sequence. One of those is the crystal structure of the TPR domains of the human PEX5 receptor interacting with the PTS1 ( PDB-entry: 1FCH ). Once we got this pdb file, we could proceed with our targeted mutagenesis in PyMOL.
Unfortunately, the program could not predict the whole structure, but only about 51% of it. It contains the important TPR domains and is very similar to the human crystal structure as you see in the figure below.

Alignment of the pdb file 1FCH (gray) to the predicted structure of the yeasts PEX5 (red).

PyMOLs 'mutagenesis wizard'

PyMOL provides a built-in function called 'mutagenesis wizard' making it possible to introduce site-directed mutations in the 3D structure. With this tool we exchanged amino acids in the TPR regions to create beneficial interactions between receptor and targeting signal. Our new receptor should only recognize our new peptide and not the wild type PTS1 and in exchange the wild type receptor should not recognize the new designed PTS.

TPR domains of the yeasts PEX5 receptor with the PTS1. The yellow lines indicate interacting amino acids.

Calculated vacuum electrostatics of the PEX5 receptor with the PTS. Red indicates positive, blue negative polarity.


The two figures above show the visualization possibilities in PyMOL − the upper one shows interacting amino acids while the lower one depicts the overall polarity. Both features helped us to set mutations in an educated way. As soon as we designed our receptor-PTS combination, we proceeded with the preparation for the molecular dynamics simulation.

Equilibration and molecular dynamics simulation

Once the mutations were set and the peptide was build, we equilibrated and simulated our experiment. After a couple of days the simulations were finished and we could start with the data evaluation.

Graphical visualization

VMD (Visual Molecular Dynamics) uses the coordinate files generated by AMBER to illustrate the simulation in an animated way as it can be seen below.

This already looked neat and gabe a rough hint on how promising the results are but for scientific evaluation we proceeded with numerical analysis.

Diffusion

To get numerical data out of the coordinate files, we then calculated the diffusion of the PTS from its starting point in the equilibrated state. A low value indicates stable binding while a high value indicates unfavorable interactions. Hence we had five replicates for each 'experiment', we calculated an average value and the standard deviation of all five replicates for point in time. Due to fluctuating values, we decided to implement a triangular smoothing algorithm to get more representative values. After the calculation we used R's ggplot2 to generate plots that set the experiments in relation with its controls. Our whole setup contains the following simulations.


  • our receptor − our PT
  • our receptor − wild type PTS
  • wt receptor − our PTS
  • wt receptor − wild type PTS

The following plots show examplaric how our results look like.

PEX5 variant R3 with the PTS1 variant -YDSYD. This shows an example for a unfavorable result − the receptor prefers the PTS1 above the PTS1*.

The figure above shows the one of our plots generated from the simulations data. We calculated the diffusion of the PTS1 from its starting point in angstrom and plotted it against the time in nanoseconds. The lines show the average diffusion value for each experiment while the shaded area depicts the standard deviation of the different replicates. Furthermore we calculated a value which indicates if the PEX5 variant has a higher preference for the PTS1* or the PTS1. A positive value implies a higher affinity while a negative value implies the opposite − one has to bear in mind that negative values reach higher numbers than positive ones. This is caused by the simulations conditions because if the binding is unstable, the targeting signal moves a lot.
To return to the figure, the red line represents the PEX5 variant with the PTS1 and the blue with the PTS1*. The figure above is an example for an unfavorable result − one sees that wild type signal is prefered above the designed one and thereby this would not lead to the achievement of our goal.

PEX5 variant R8 with the PTS1 variant -YTNQE. This shows an example for a quite favorable result. The preference coefficient is positive which means that the receptor prefers the PTS1* above the PTS1.

This figure shows a more favorable result. On average the PEX5 variant has a slightly higher preference for the PTS1* and due to that we figured this one would be worth a test in our laboratory.

PEX5 variant R15 with the PTS1 variant -YTNWD. The results of this simulation were also quite positive hence the receptor prefers the PTS1* above the natural PTS1. Unfortunately, the standard deviation is nearly as high as the preference coefficient, which means that this result should be considered with skepticism. Nevertheless we decided for us for laboratory tests.

Experimental results

PEX5 variant R19

This PEX5 variant did not directly develop from molecular dynamics in the first place. Baker et al. worked on a similar project and obtained results that we took in consideration for the design of the PEX5 variant R19 (see here for more detailed information).

PEX5 variant R19 with to PTS1* variants − the left figure shows the result of the full 15 amino acid long PTS1* while the right one was only done with the last five amino acids.

The figure above shows the result of the molecular dynamics simulation with our PEX5 variant R19. As one can see, the simulation with the 15 amino acid long peptide shows very unstable behaviour. Since our assumption for this was the long peptide and missing structures in the PEX5 model, we tried again with only five amino acids and obtained better results.
The very interesting part is, that this receptor was tested positively in our experiment − it successfully imported mTurquoise with the corresponding PTS1* (visit our results section for more information). Since the value of the preference coefficient is quite low, this means that our molecular dynamics approach was condemned to failure.

Review: molecular dynamics

Molecular dynamics are definitely an interesting and promising instrument regarding protein design and synthetic biology but our experiments showed the obstacles that come up once you work with it. As described before we achieved a positive result with one of our PEX5 variants but this very combination of PEX5 variant and PTS1* showed undesirable results in the simulation.
We think that the predicted PEX5 of yeast is the cause of this behaviour. The structure could only be predicted to about 51% which means that there could be more structures involved in PTS1 binding than presently assumed. In the beginning we assumed that the TPR regions would be enough, but now we think that we would need the whole crystal structure to do successful protein engineering from scratch. Unfortunately there are currently no crystal structures of the whole protein − neither of the yeast nor of any other organism − and due to its size it is unlikely to be crystallised in the near future.

Overview

In the following we present our model of the nootkatone biosynthesis pathway. We modeled the system in order to understand how to maximize nootkatone production. We start with an oversimplified model to give you a sense for the behavior of the enzymes in the pathway. Then we continue with a model introducing a function penalizing high concentrations of toxic products. As the toxicity is one of two main culprits of nootkatone production, we further show that the production inside a peroxisome can greatly increase nootkatone yield. Afterwards we discuss possible ways to overcome the shortage of the cellular FPP pool.

Basic System

The basic reactions of the nootkatone pathway are the following:

This already implies using an alcohol dehydrogenase with similar behaviour to ADH-21 used by Sebastian Schulz (2015). This ADH on the one hand is not stereospecific, in this case meaning that it can convert both cis and trans-nootkatol. On the other hand it favours the forward reaction, thus increasing the $\frac{\text{nootkatone}}{\text{nootkatol}}$ ratio. However due to the lack of stereospecificity we modeled the pathway using only one "nootkatol" species.

We assumed Michaelis-Menten kinetics for each reaction, with the last step being reversible.

Michaelis-Menten kinetics

$$\frac{dP}{dt} = \frac{V_{Max} \cdot c_{S}}{K_{M} + c_{S}}$$

Reversible Michaelis-Menten kinetics

$$\frac{dP}{dt} = \frac{\frac{V_{M+} \cdot c_{S}}{K_{M+}} - \frac{V_{M-} \cdot c_{P}}{K_{M-}}}{1 + \frac{c_{S}}{K_{M+}} + \frac{c_{P}}{K_{M-}}}$$

We further assumed a permanent FPP production, which we fitted in the Penalty model.

This gives us the following system of differential equations.

$$\frac{dFPP}{dt} = v_{\text{FPP in}} - \frac{V_{Max,ValS} \cdot c_{FPP}}{K_{M, ValS} + c_{FPP}} $$

$$\frac{dValencene}{dt} = \frac{V_{Max,ValS} \cdot c_{FPP}}{K_{M, ValS} + c_{FPP}} -\frac{V_{Max,\text{P450 BM3}} \cdot c_{Valencene}}{K_{M, \text{P450 BM3}} + c_{Valencene}}$$

$$\frac{dNootkatol}{dt} = \frac{V_{Max,\text{P450 BM3}} \cdot c_{Valencene}}{K_{M, \text{P450 BM3}} + c_{Valencene}} - \frac{\frac{V_{M,ADH+} \cdot c_{Nootkatol}}{K_{M,ADH+}} - \frac{V_{M,ADH-} \cdot c_{Nootkatone}}{K_{M,ADH-}}}{1 + \frac{c_{Nootkatol}}{K_{M,ADH+}} + \frac{c_{Nootkatone}}{K_{M,ADH-}}}$$

$$\frac{dNootkatone}{dt} = \frac{\frac{V_{M,ADH+} \cdot c_{Nootkatol}}{K_{M,ADH+}} - \frac{V_{M,ADH-} \cdot c_{Nootkatone}}{K_{M,ADH-}}}{1 + \frac{c_{Nootkatol}}{K_{M,ADH+}} + \frac{c_{Nootkatone}}{K_{M,ADH-}}}$$

Since the main cofactor is regenerated inside the pathway we chose to not model the cofactors.

Parameters

Our crude first assumption was a constant enzyme concentration of $1 \ \text{µM}$. We mostly used kinetic parameters from the Brenda database. If S. cerevisiae was not available as the host we used the nearest relative, if the reaction was not available we used the kinetics for the respective co-substrate as an upper boundary. Another assumption we made is a five-fold reduction in the speed of the reversible reaction of the ADH-21, based on the work by Sebastian Schulz (2015), that the forward reaction is favored.

Parameter Value Source
$v_{\text{FPP in}}$ $4.18E-9 \frac{1}{\text{s}}$ Penalty model
$\text{kM}_{\text{Valencene Synthase}}$ $1.04 \text{ µM}$ Brenda
$\text{kM}_{\text{P450 BM3}}$ $126 \text{ µM}$ Brenda
$\text{kM}_{\text{ADH-21}}$ $161 \text{ µM}$ Schulz 2015
$\text{kcat}_{\text{Valencene Synthase}}$ $0.0032 \frac{1}{\text{s}}$ Brenda
$\text{kcat}_{\text{P450 BM3}}$ $2.619 \frac{1}{\text{s}}$ Brenda
$\text{kcat}_{\text{ADH-21 forward}}$ $6 \frac{1}{\text{s}}$ Sebastian Schulz (2015)
$\text{kcat}_{\text{ADH-21 backward}}$ $1.2 \frac{1}{\text{s}}$ $\frac{\text{kcat}_{\text{ADH-21 forward}}}{5}$
$c_{\text{Enzyme}}$ $1 \text{ µM}$ Assumption

Simple model

In order to simulate the system of differential equations we set up a few python files. One file containing a model class which consists of the parameters, the models and the integration function. One file containing all the kinetics used and finally one which we used to actually run the simulations. We wanted to show some example code on how to set up such a project, while the code size was still manageable, in order to help new iGEM teams that might need help with their modeling.

Kinetics File

				
def ConstantInflux(substrate,speed):
return(speed)

def MichaelisMenten(
	substrate,enzymeConcentration,kcat,km):
return(
			(enzymeConcentration*kcat*substrate)
			/(km+substrate))

def MichaelisMentenReversible(
	substrate,product,enzymeConcentration,
	kcats,kcatp,kms,kmp):
return (
			(
							((enzymeConcentration*kcats*substrate)/kms)
							-((enzymeConcentration*kcatp*product)/kmp)
							)
							/(1+(substrate/kms)+(product/kmp)
							))
				
				

Model File

				
import kinetics as k
import numpy as np
import scipy.integrate

class model(object):
def __init__(self):
	self.c_ValS = 1E-6 #[M]
	self.c_ADH = 1E-6 #[M]
	self.c_P450 = 1E-6 #[M]

	self.km_ValS = 1.04E-6  #[M]
	self.km_P450 = 126E-6 #[M]
	self.km_ADH = 161E-6 #[M]

	self.kcat_ValS = 0.0032*3600 #[1/s]
	self.kcat_P450 = 6*3600 #[1/s]
	self.kcat_ADH = 2.619*3600 #[1/s]
	self.k_influxSpeed = 1.03562761862e-05 #[1/s]


def FPPInflux(self,FPP):
	return(
					k.ConstantInflux(
									FPP,
									self.k_influxSpeed)
					)

def ValenceneSynthase(self,FPP):
	return(
					k.MichaelisMenten(
									FPP,
									self.c_ValS,
									self.kcat_ValS,
									self.km_ValS)
					)
def P450(self,Valencene):
	return(
					k.MichaelisMenten(
									Valencene,
									self.c_P450,
									self.kcat_P450,
									self.km_P450)
					)
def ADH(self,nootkatol,nootkatone):
	return (
					k.MichaelisMentenReversible(
									nootkatol,
									nootkatone,
									self.c_ADH,
									self.kcat_ADH,
									self.kcat_ADHrev,
									self.km_ADH,
									self.km_ADH)
					)

def P450_Model(self,time,initvalues):
	FPP,Val,Nkl,Nkn = initvalues
	dFPPdt = self.FPPInflux(FPP) - self.ValenceneSynthase(FPP)
	dValdt = self.ValenceneSynthase(FPP) - self.P450(Val)
	dNkldt = self.P450(Val) - self.ADH(Nkl,Nkn)
	dNtkdt = self.ADH(Nkl,Nkn)
	return [dFPPdt,dValdt,dNkldt,dNtkdt]


def Integration(self,time,initvalues,model):
	Y = np.zeros([len(time),len(initvalues)])
	Y[0] = np.transpose(initvalues)
	r = scipy.integrate.ode(model)
	r.set_integrator('lsoda')
	r.set_initial_value(initvalues)
	cnt=1
	while r.successful() and cnt < len(time):
			Y[cnt,:] = r.integrate(time[cnt])
			cnt+=1
	return Y
				
				

Simulation file

				
from model import model
import numpy as np
import matplotlib.pyplot as plt

m = model()
time = np.linspace(0,1,1000)
data = m.Integration(time,[0,0,0,0],m.P450_Model)
plt.plot(time,data[:,0],label=(
	r"FPP, $k_{FPP}$ = "
	+"{:.2e}".format(m.k_influxSpeed)
	+r" $\frac{1}{h}$"
	))
plt.plot(time,data[:,1],label=(
	r"Valencene, $c_{ValS}$ = "
	+"{:.0e}".format(m.c_ValS)
	+r" $\frac{mol}{L}$"
	))
plt.plot(time,data[:,2],label=(
	r"nootkatol, $c_{P450}$ = "
	+"{:.0e}".format(m.c_P450)
	+r" $\frac{mol}{L}$"
	))
plt.plot(time,data[:,3],label=(
	r"nootkatone, $c_{ADH-21}$ = "
	+"{:.0e}".format(m.c_ADH)
	+r" $\frac{mol}{L}$"
	))
plt.yscale("log")
plt.legend(loc=9, bbox_to_anchor=(0.5, -0.2), ncol=2)
plt.title("First Simulation")
plt.xlabel(r"Time in $[h]$")
plt.ylabel(r"Yield in $\frac{mol}{cell}$")
plt.show()
				
				

The results of our first simulation are the following:

What is visible in this first simulation is that the FPP import is faster than Valencene Synthase since FPP is agglomerating. However Valencene synthase, P450 BM3 and ADH are all saturated around 20 minutes, since the valencene concentration becomes stationary and nootkatol and nootkatone levels keep rising proportionally, due to the back-reaction of ADH. This is why choosing an ADH favouring the forward reaction increases the yield while keeping the metabolic stress on the cells low.

Bioreactor model

In order to check the validity of our model we took the results of Tamara Wriessnegger 2014 as a point for reference. Wriessnegger achieved $208 \ \frac{\text{mg}}{\text{L}}$ nootkatone production after $108 \text{ 0}h$ in an in-vitro production in Pichia pastoris. We therefore had to scale up our single cell model to a bioreactor simulation.

In order to do this we first calculated the number of product molecules per cell.

$$\frac{\text{N}_{\text{Product}}}{\text{cell}} \ [\text{mol}]= c_{Product} [\frac{\text{mol}}{\text{L}}]\cdot V_{\text{Yeast}} \ [\text{L}]$$

Then we calculated the number of cells per Liter bioreactor.

$$ \text{cells per Liter} \ [\frac{\text{1}}{\text{L}}] = \frac{\rho_{\text{Cells in Bioreactor}} \ [\frac{\text{g d.w.}}{\text{L}}] }{m_{\text{Yeast dry weight}} \ [\text{g d.w.}] } $$

With these two equations we could scale up the production of a single cell for each product.

$$\text{Yield}_{\text{Bioreactor}} \ [\frac{\text{mg}}{\text{L}}] = \frac{\text{N}_{\text{Product}}}{\text{cell}} \ [\text{mol}] \cdot \text{cells per Liter} \ [\frac{\text{1}}{\text{L}}] \cdot \text{M}_{\text{Product}} \ [\frac{\text{g}}{\text{mol}}] \cdot 1000$$

We used the following constants for the calculations above.

Parameter Value Source
$\rho_{Bioreactor}$ $200 \ \frac{\text{g d.w.}}{\text{L}}$ Biomedcentral
$m_{\text{Yeast dry weight}}$ $17.5E-12 \text{ g}$ Kirschner Lab
$V_{\text{Yeast}}$ $6E-14 \text{ L}$ Philips 2012
$M_{\text{FPP}}$ $382.33 \ \frac{\text{g}}{\text{mol}}$ Pubchem
$M_{\text{Valencene}}$ $204.36 \ \frac{\text{g}}{\text{mol}}$ Pubchem
$M_{\text{Nootkatol}}$ $220.35 \ \frac{\text{g}}{\text{mol}}$ Pubchem
$M_{\text{Nootkatone}}$ $218.34 \ \frac{\text{g}}{\text{mol}}$ Pubchem

Bioreactor simulation

Our model predicts a nootkatone Yield of $154.24 \ \frac{\text{mg}}{\text{L}}$ and a nootkatol Yield of $31.06 \ \frac{\text{mg}}{\text{L}}$. Wriessnegger produced $208 \ \frac{\text{mg}}{\text{L}}$ nootkatone and $44 \ \frac{\text{mg}}{\text{L}}$ nootkatol. This showed us that our assumption of a five-fold reduction in the backwards reaction of alcohol dehydrogenase is valid, as our $\frac{\text{nootkatone}}{\text{nootkatol}}$ ratio was 4.96, which is quite close to the 4.72 by Wriessnegger. We then took a look at how differently expressing the enzymes would have an influence on nootkatone production.

Expression analysis

The first figure shows the nootkatone yield based on a varied enzyme concentration of all enzymes, while the other enzyme concentrations were kept constant at $1 \text{ µM}$. The second plot shows varied enzyme concentrations for all enzymes at the same time. The smallest concentration with almost maximal nootkatone production is at about $2 \text{ µM}$ Valencene Synthase, when not over increasing the other enzymes. Therefore we chose to double the expression of Valencene Synthase to ensure that our pathway is saturated.

This increased our yield above the results by Wriessnegger. But our model still lacks the important factor of product toxicity at high concentration, which is why we did not fit the FPP import to this, but to the next model.

Toxicity

According to Carole Gavira 2013 both nootkatone and nootkatol exhibit toxicity at high levels. At $100 \ \frac{\text{mg}}{\text{L}}$ nootkatol the cell viability was decreased to 55%, at $100 \ \frac{\text{mg}}{\text{L}}$ nootkatone to 80%. For both substances all cells were dead around $1000 \ \frac{\text{mg}}{\text{L}}$. We used hill kinetics to fit this behaviour.

$$\frac{K_{Tox}^n}{K_{Tox}^n + [S]^n}$$

We fitted the data of Gavira using scipy.optimize.brent and multiple factors n. This is due to the fact that n is only valid in integer values and scipy does not support mixed-integer optimizations. We further chose to only implement nootkatone toxicity, even if it is the less toxic compound, as it is concentrated approximately five times higher than nootkatol in the cell. This also helped to keep our model simple.

$\text{n}$ $K_{\text{Toxic nootkatol}}$ Error
2 200 0.136
3 158.74 0.104
4 141.42 0.100
5 131.95 0.100
6 125.99 0.100

Even though the error produced by functions with a higher n was decreasing we chose to use n = 3 to prevent overfitting, since n = 3 already fit both points quite well.

We used this penalty function only on our FPP import. The idea behind that was that since the whole pathway depends linearly on the FPP influx it can act representative for the cell viability. This turned out to show some problems later in the model but we chose to keep this assumption, as the problems can be addressed by proper interpretation of the data and because we wanted to keep our model simple.

This changes our system in the following way.

$$\frac{dFPP}{dt} = v_{\text{FPP in}} \cdot \frac{K_{\text{Tox Nkn}}^3}{K_{\text{Tox Nkn}}^3 + [\text{Nkn}]^3} - \frac{V_{Max,ValS} \cdot c_{FPP}}{K_{M, ValS} + c_{FPP}} $$

Penalty model

Dynamics

This is the model where we actually fitted our import parameter to the data supplied by Wriessnegger, as we assumed it to be sufficiently close to the actual biological reality. We did this by using scipy.optimize.brent to minimize the error function $\text{abs}(208 - [S]_{\text{nootkatol}})$. This gave us an import rate of $6.55E-9 \frac{1}{\text{s}}$, which we then also used in all other models, including the ones shown previously.

The nootkatone yield is approximately at the results of Wriessnegger and we can see that the FPP import declines with increasing nootkatone production. The penalty function can be interpreted as the fraction of alive cells, which at the final point of the simulation is $52.16 \ \%$. This huge number already gives us an understanding that nootkatol and nootkatone production limit themselves in cytosolic production, thus showing us the first culprit of nootkatone production.

Expression

We wanted to check whether over- or underexpression of any of the enzymes could improve our nootkatone yield under penalty conditions.

Interestingly the peak nootkatone yield is now under submaximal Valencene Synthase expression. This is possibly due to the increased toxicity. In order to check this we further increased the FPP import tenfold over the fitted value.

Increased FPP import dynamics

The maximal yield under increased FPP import conditions is increased, but at the end of the simulation only around $11 \ \%$ of the cell population is alive. This is the problem of the model that we talked earlier about. In this case the early FPP import so fast compared to the enzymatic reactions so that FPP accumulates. Since only the import is penalized, but not the other reactions, all that follows is an simulation of dead cells still transforming an large FPP pool. This is obviously biologically wrong and an artifact of the way the penalty is set up. The effects of this error can be seen in the variation of single enzyme concentrations in the plots below.

Increased FPP import expression

Increased FPP import and enzyme concentration dynamics

Even with this submaximal production only $4 \ \%$ of cells are alive at the end of the simulation. So even if the results of the production with a tenfold increased FPP influx have to be interpreted carefully, the general takeaway of the simulation is that given a sufficiently high FPP influx, nootkatone toxicity is the main remaining culprit of nootkatone production. This is where our nootkatone subproject comes in. By transferring the enzymes needed for nootkatone production into the peroxisome we assume that the products cannot diffuse back into the cell and therefore do not impose any toxicity on the cells. This can be seen by the great increase in nootkatone production in the models below.

Peroxisome model

Since peroxisomes vary greatly in size we first assumed them to be as large as the yeast cells. While this assumption is clearly wrong, we later show the minimal size needed under specific conditions to outperform cytosolic production. Since we assume no nootkatone and nootkatol toxicity inside the peroxisome, the general reaction dynamics are equal to the simple bioreactor model, but with overexpressed Valencene Synthase. We first modeled the maximal nootkatone yield with varying enzyme concentrations.

Expression

What can be seen here as before is that the maximal nootkatone yield is determined by the FPP import and that Valencene Synthase is the velocity dictating enzyme. Even at a high expression of Valencene Synthase increasing either P450 BM3 or ADH over $1E-8 \ \frac{\text{mol}}{\text{L}}$ does not further increase the nootkatone yield, as the two enzymes are faster than Valencene Synthase. To further increase nootkatone yield the FPP import has to be increased. In this case the nootkatone yield can be increased linearly, as shown below.

Increased FPP import dynamics

Increased FPP import expression

Increased FPP increased enzyme concentration dynamics

Conclusion

Model Nootkatone Nootkatol
Bioreactor 154.24* 31.06*
Bioreactor ValS Up 302.87* 61.36*
Penalty 206.10 41.40
Penalty FPP Up 309.62** 62.73**
Penalty FPP + Enzymes Up 456.93** 91.43**
Peroxisome Enzymes Up 317.83 64.65
Peroxisome FPP Up 309.63 62.73
Peroxisome FPP + Enzymes Up 3082.86 624.14

As can be seen from our simulations, the two main culprits of nootkatone production are the toxicity of the products at high concentrations and the FPP influx. The toxicity can be avoided by transferring the pathway into the peroxisome, which is one of the projects of our team. To address the problem of FPP influx our team did an opt-knock FBA analysis to show how S. cerevisiae cells can be engineered to stably produce more FPP. Note that the values marked with an asterisk were produced by an incomplete model and values marked with two asterisks are assumed to be false due to the model setup as explained in the sections above.

Further thoughts

A false assumption in the peroxisomal models above is that the size of the peroxisome is equal to that of the yeast cell. This is obviously not correct. We wanted to get a feeling how large the peroxisome actually has to be in order for production to be superior to cytosolic production. We therefore took the the results of the cytosolic model with increased FPP intake and enzyme concentration and then calculated the yield of the corresponding peroxisomal model with varying peroxisome size.

We calculated the volume for equal production

$$V_{\text{Perox. Equal}} = \frac{\text{Yield}_{\text{Cyt.}} \cdot V_{\text{Yeast}}}{\text{Yield}_{\text{Perox.}}}$$

to be $1.24E-14 \ \text{L}$, so approximately a fifth of the cell volume. Since in methylotrophic yeasts such as Pichia pastoris peroxisomes can take up to 80% of the total cell volume (Gleeson ), a fifth of the cell volume is very achievable. To further increase the possible yield our team is working on a Pex11-knockout that aims to increase peroxisome size

Background

Constraint based modeling

Constraint based modeling is a mechanistically method to simulate complex metabolic networks under the critical assumption that the network is in a steady state. It requires a reconstructed metabolic network, in which all reactions have been characterized stoichiometrically. The network is then represented by the stoichiometric matrix $S$, in which each column corresponds to a reaction and each row to a metabolite. The reactions are further constrained only allowing positive fluxes $v_i \geq 0$ (elementary modes) and by putting upper and lower bounds on the flux (turnover rate). This reduces the solution space into a convex polyhedral cone, in which every point is a non-negative combination of elementary modes and every edge a unique elementary mode. As metabolic networks typically have more reactions than metabolites, the systems are underdetermined and therefore the flux distribution is not definitive. To find the actual biological solution additional constraints have to be added, for example an objective function that is to be optimized.

Flux Balance Analysis

A sensible objective function should in our case describe the main driver behind evolution - natural selection. This can be accomplished by defining a biomass production term $Z = c^Tv$, which consists of several biomass precursor reactions merged into one unit. Therefore the optimization finds the flux distribution with which maximal biomass production is accomplished, which is in accordance to natural selection.

In matrix formalism this can be written as $Sv = 0$, where $S$ is the stoichiometric matrix and $v$ the vector of fluxes that is to be solved.$c$ is the vector that defines the weights of each flux in the objective function. The canonical form of a Flux Balance Analysis thus can be written as

$$\text{maximize } c^Tv$$

$$\text{subject to } Sv = 0$$

$$\text{and lower bound } \leq v \leq \text{upper bound}$$

This is a linear optimization which due to the convex solution space can be calculated efficiently with linear programming.

OptKnock

Metabolic engineering faces two big problems. Firstly the precursor molecule for the wanted reaction can only be found in trace amounts in the cell, leading to minor yields of the wanted product. Secondly adding pathways to the metabolism of an organism increases the metabolic stress on that organism. Thus if the cell loses the ability to produce the desired compound it can overgrow the other cells in a large-scale bioreactor setting. OptKnock (Anthony P. Burgard 2003) is a two step optimization in which biomass production is coupled to the production of a desired product by calculating a set of gene knockouts, solving both problems above.In principle the algorithm works like this:

$$ \left[ \begin{array}{ll} \text{Maximize Product} \\ \text{(over knockouts)} \\ \text{Subject to} & \left[ \begin{array}{ll} \text{Maximize biomass} \\ \text{(over fluxes)} \\ \text{Subject to} & \left[ \begin{array}{l} \text{Fixed substrate uptake rate} \\ \text{Network Stoichiometry} \\ \text{Knockouts identified by outer problem} \\ \end{array} \right] \\ \end{array} \right] \\ \text{Minimum biomass yield} \\ \text{N}_{\text{Knockouts}} \leq \text{ limit} \end{array} \right] $$

We visualized the concept in the following two pictures. The important point here is to notice that the maximal biomass flux in the OptKnock model at the same time corresponds to a high product flux.

Results

We analyzed the FPP pool by using Geranyltranstransferase (GRTT) as the reaction to be optimized. GRTT takes geranyl diphosphate and isopentenyl diphosphate and converts them to farnesyl pyrophosphate and diphosphate. The two results below show the maximal GRTT production with a minimal biomass yield of 0.1 and 0.2 respectively.

Compared to the wild type FPP flux (0.0013) the maximal fluxes that could be achieved with four knockouts were 0.61 and 0.29, respective to the minimal biomass yield of 0.1 and 0.2. This roughly corresponds to a 480- and 230-fold increase in FPP flux. While we do not think that these results are close to the actual biological truth, as a 230- or 480-fold increase is rather extreme, we did notice an accumulation of knockouts in genes related to the pentose phosphate pathway (TKT2 = Transketolase, G6PDH2r = Glucose-6-phosphate dehydrogenase) or citric acid cycle (ALATA_L = L-alanine transaminase). Further reactions like ADK3 (Adentylate kinase (GTP)), which is indirectly involved in the TCA by supplying GDP or IPPSm (2 isopropylmalate synthase mitochondrial) that takes acetyl-CoA as a substrate lead us to the conclusion that the cellular FPP pool can be increased by shifting to a fatty acid metabolism. This is further solidified by the fact that the FPP precursor isopentenyl diphosphate ultimately is build from acetyl-CoA Sebastián Rubat 2017.

We then integrated the nootkatone pathway into the metabolic model and reran the optimization.

The general trend of the knockouts continued in this model. In this fashion reactions in the TCA (MDH = Malate dehydrogenase), glycolysis (G6PI3 = Glucose 6 phosphate isomerase) and fermentation (PYRDC = Pyruvate decarboxylase) were knocked-out. This further solidifies our hypothesis, that the analysis builds a cell metabolism that favors fatty acid breakdown.