Figure 7: Shares of the phage-producing E. coli producing phages with different fitness values. Initially all phage-producing E. coli produce phage with 20 % of the wildtype fitness. They make up the largest share for 1 h, after which fitter phage-producing E. coli take over the lagoon.
Revision as of 20:25, 29 October 2017
Modeling
Phage Titer
While developing PREDCEL in the lab, we simultaneously developed it in silico so that both sides could benefit from each other. One of the most important parameters of phage-assisted directed evolution experiments like PREDCEL and PACE is the phage titer itself. If the phage titer drops, washout can occur and the experiment has to be restarted, with the disadvantage of losing library complexity. If the phage titer increases too much, the multiplicity of infection (MOI), that is, the amount of phage relative to the amount of E. coli, rises too. If, for example, the MOI is 10 and an E. coli can only be infected by one phage, nine out of ten phages will not infect an E. coli and thus will not evolve, but still make up most of the phage population.
Comparing the plots for 20 % and 100 % fitness shows that a higher fitness increases the final phage titer and makes the amount of uninfected E. coli decrease earlier.
Modeling concentrations in one Lagoon
Here the concentrations \(c\) of uninfected E. coli, infected E. coli and phage-producing E. coli as well as the M13 phage are modeled. They are denoted with the subscripts \(_{u}\), \(_{i}\), \(_{p}\) and \(_{P}\). If the whole E. coli population is referred to, \(c_{E}\) is used. If an arbitrary E. coli population is meant, the subscript \(_{e}\) is used. The phage concentration \(c_{P}\) refers to the free phage only; phage that are contained in an E. coli they infected are not included. The parameters used include the time \(t\), the affinity of phage for E. coli \(k\), and the duration between infection of an E. coli and the first phage leaving the E. coli, \(t_{P}\). The three different E. coli populations each have a generation time \(t\) that is denoted with their subscript. The fitness of a phage population is \(f\).
Table 1: Variables and Parameters used in this model. List of all parameters and variables used in this model. When possible, values are given.
Symbol | Name in source code | Value and Unit | Explanation |
---|---|---|---|
\(c \) | - | [cfu] or [pfu] | Colony forming units for E. coli (cfu) or plaque forming units (pfu) for M13 phage |
\( _u\) | - | - | Subscript for uninfected E. coli |
\( _i\) | - | - | Subscript for infected E. coli |
\( _p\) | - | - | Subscript for phage-producing E. coli |
\( _e\) | - | - | Subscript for any one of the E. coli populations on its own |
\( _E\) | - | - | Subscript for all populations of E. coli together |
\( _P\) | - | - | Subscript for M13 phage |
\(c_{c} \) | capacity | [cfu/ml] | Maximum concentration of E. coli possible under given conditions, important for logistic growth |
\(t\) | t | [min] | Duration since the modeled experiment was started |
\(t_{u} \) | tu | \(20\) min | Duration of one division of uninfected E. coli |
\(t_{i} \) | ti | \(30\) min | Duration of one division of infected E. coli |
\(t_{p} \) | tp | \(40\) min | Duration of one division of phage-producing E. coli |
\( t_{P}\) | tpp | [min] | Duration between an E. coli being infected by an M13 phage and releasing the first new phage |
\(g_{e} \) | e_growth_rate | [cfu/min] | Growth rate of E. coli, depending on the type of growth (either logistic or exponential), the current concentration \(c_{e}\), the maximum concentration \(c_{c}\), and the generation time \(t_{e}\) |
\( k\) | k | \(3 \cdot 10^{-11}\frac{1}{cfu \cdot pfu \cdot ml \cdot min}\) | Affinity of M13 phage for E. coli |
\( \mu_{max}\) | mumax | \(16.67 \frac{cfu}{min \cdot ml \cdot cfu}\) | Wildtype M13 phage production rate |
\( f\) | f | between 0 and 1, no unit | Fitness value, ratio of the actual \(\mu\) to \(\mu_{max}\) |
Each term describing the change of an E. coli concentration contains its growth, \(g_{e}\). The growth rate of an E. coli population can be modeled by exponential growth or by logistic growth. Especially when long durations per lagoon are modeled, the logistic growth model is more accurate [source].
In the exponential case the growth rate \(g_{e}\) is modeled as
$$
g_{e} (t_{e}) = c_{e} \cdot \frac{log(2)}{t_{e} }
$$
Note that the growth rate in the model increases over time, while in the modeled culture, the nutrient concentration decreases.
That makes the logistic model more plausible. It models \(g_{e}\) as
$$
g_{e} (t_{e}, \: c_{e}(t), \: c_{c}) = \frac{c_{c} - c_{e} (t)}{c_{c} } \cdot c_{e}(t) \cdot \frac{log(2)}{t_{e} }
$$
In this case the growth rate drops toward zero as the current concentration \(c_{e}\) approaches the maximum capacity for E. coli in the given setup, \(c_{c}\). With this model \(c_{e} \leq c_{c}\) is true for any point in time.
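The two growth models can be compared with a minimal sketch. The function names are hypothetical and not taken from the original script. The logistic variant assumes that the rate scales with the current concentration in the same way as the exponential rate does, so that both share the unit [cfu/min].

```python
import math


def exponential_growth_rate(c_e, t_e):
    """Exponential growth: c_e * log(2) / t_e."""
    return c_e * math.log(2) / t_e


def logistic_growth_rate(c_e, t_e, c_c):
    """Logistic growth: the exponential rate scaled by the free capacity.

    Assumption of this sketch: the factor (c_c - c_e) / c_c multiplies the
    exponential rate, so the rate is zero once c_e reaches c_c.
    """
    return (c_c - c_e) / c_c * c_e * math.log(2) / t_e


# Compare both rates for uninfected E. coli (t_u = 20 min, c_c = 1e9 cfu/ml)
for c_e in (1e6, 1e8, 5e8, 1e9):
    print(c_e, exponential_growth_rate(c_e, 20), logistic_growth_rate(c_e, 20, 1e9))
```

At low cell densities both rates are nearly identical; they only diverge once the culture approaches the capacity.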
Change of concentration of uninfected E. coli, \(\frac{\partial c_{u} }{\partial t} \: [cfu/min]\)
$$
\frac{\partial c_{u} }{\partial t}(t) = g_{u} (t_{u}, \: c_{u}(t), \: c_{c})
- k \cdot c_{u}(t) \cdot c_{P}(t)
$$
In addition to the growth term, the concentration of uninfected E. coli is described by a term for infection. It takes into account the concentration of uninfected E. coli and the concentration of free phage, and it reduces the concentration of uninfected E. coli.
Change of concentration of infected E. coli, \(\frac{\partial c_{i} }{\partial t} \: [cfu/min]\)
$$
\frac{\partial c_{i} }{\partial t}(t) = \begin{cases}
g_{i} (t_{i}, \: c_{i}(t), \:c_{c})
+ k \cdot c_{u}(t) \cdot c_{P}(t)
- c_{i}(t - t_{P}),
\quad \text{for} \: t > t_{P} \\
g_{i} (t_{i}, \: c_{i}(t), \: c_{c})
+ k \cdot c_{u}(t) \cdot c_{P}(t),
\quad \text{otherwise}
\end{cases}
$$
As long as \(t \leq t_{P}\), the concentration of infected E. coli increases by growth and by infection of previously uninfected E. coli. When \(t > t_{P}\), a third term describing that infected E. coli turn into phage-producing E. coli is subtracted.
Change of concentration of phage producing E. coli, \(\frac{\partial c_{p} }{\partial t} \: [cfu/min]\)
$$
\frac{\partial c_{p} }{\partial t}(t) = \begin{cases}
g_{p} (t_{p}, \: c_{p}(t), \: c_{c}) +
c_{i}(t - t_{P}),
\quad \text{for} \: t > t_{P} \\
g_{p} (t_{p}, \: c_{p}(t), \: c_{c}),
\quad \text{otherwise}
\end{cases}
$$
As long as \(t \leq t_{P}\), the population of phage-producing E. coli only increases by growth. When infected E. coli release their first phage, they turn into phage-producing E. coli, as described by the second term.
Change of concentration of M13 phage, \(\frac{\partial c_{P} }{\partial t} \: [pfu/min]\)
$$
\frac{\partial c_{P} }{\partial t}(t) = c_{p}(t) \cdot \mu_{max} \cdot f - k \cdot c_{u}(t)\cdot c_{P}(t)
$$
The phage concentration is only increased by phage that leave phage-producing E. coli, which happens at a rate of \(f \cdot \mu_{max}\) per time unit. Here \(f\) is the fitness, a value between 0 and 1 that equals the share of the wildtype M13 phage's fitness, and \(\mu_{max}\) is the wildtype phage's production rate. We assume that the only negative influence on the free phage titer is phage infecting E. coli, which depends on both the phage titer \(c_{P}\) and the titer of uninfected E. coli, \(c_{u}\).
When the fitness is assumed to be the same for all phages, it is modeled to be constant during the time in one lagoon.
If a fitness distribution is assumed, the fitness does not necessarily stay constant in one lagoon. The distribution is initialised with 100 % of the phages having the starting fitness. First changes in the fitness distribution occur after the first E. coli start to release phages.
The fitness distribution is modeled to change by mutation and selection. The selection is implemented in the following equation: $$ s_{i}(t_{n + 1}) = f_{i}(t_{n}) \cdot s_{i}(t_{n}) $$ \(f_{i}\) is one of \(N\) fitness values, \(s_{i}\) is the share of phages with that fitness value relative to the total phage population. Since the fitness is interpreted as a percentage of the wildtype fitness in this model, the new shares can be calculated as the product of the fitness value \(f_{i}\) and its old share \(s_{i}(t_{n})\). The resulting values are then normalized by division by their sum $$ \sum_{j = 0}^{N} f_{j}(t_{n}) \cdot s_{j}(t_{n}) $$ in order to always have shares that sum up to 1.
The mutation is implemented as a function that manipulates the fitness shares and is applied to the fitness shares after selection. Here, \(r_{M}\) is the mutation rate, that is, the fraction of the fitness share \(i\) that changes during the mutation. \(w\) is the width of a fitness bin: $$ w = \frac{1}{N - 1} $$ The new shares are calculated as $$ s_{i}(t_{n + 1}) = \big(1 - r_{M}\big) \cdot s_{i}(t_{n}) + r_{M} \sum_{j = 0}^{N} s_{j}(t_{n}) \int_{f_{i} - \frac{1}{2} \cdot w}^{f_{i} + \frac{1}{2} \cdot w} \mathcal{N}_{\mu=f_{j}, \: \sigma}(x) \: dx $$ where the sum runs over all fitness values \(f_{j}\) whose phages can mutate into the bin around \(f_{i}\). The normal distribution \(\mathcal{N}\) is used to model that many mutations have only small effects on the fitness, while some have a larger effect. \(\sigma\) is a parameter of the model that influences how much the new scores differ from the previous ones. In this model it is theoretically possible that mutation turns any given fitness into any other fitness; however, small values for \(\sigma\) can prevent this in practice.
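One selection and mutation step on a discretized fitness distribution can be sketched as follows. This is an illustration under stated assumptions, not the original script: the function names are hypothetical, each fitness bin is assumed to spread its mutated share with a normal distribution centered on its own fitness value, and probability mass that falls outside the range from 0 to 1 is removed by renormalizing.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution (sigma > 0)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def select(shares, fitness):
    """Weight every share by its fitness value, then normalize to sum 1."""
    weighted = [f * s for f, s in zip(fitness, shares)]
    total = sum(weighted)  # zero only if every phage has fitness 0
    return [x / total for x in weighted]


def mutate(shares, fitness, r_m, sigma):
    """Move a fraction r_m of every share to other bins (assumed kernel)."""
    w = 1.0 / (len(fitness) - 1)  # width of one fitness bin
    mutated = []
    for f_i, s_i in zip(fitness, shares):
        low, high = f_i - 0.5 * w, f_i + 0.5 * w
        # Mass arriving in bin i from all source bins j
        inflow = sum(
            s_j * (normal_cdf(high, f_j, sigma) - normal_cdf(low, f_j, sigma))
            for f_j, s_j in zip(fitness, shares)
        )
        mutated.append((1.0 - r_m) * s_i + r_m * inflow)
    total = sum(mutated)  # renormalize: some mass leaves the range [0, 1]
    return [x / total for x in mutated]


# 21 fitness values from 0 to 1, all phages start at 20 % fitness
n_bins = 21
fitness = [i / (n_bins - 1) for i in range(n_bins)]
shares = [0.0] * n_bins
shares[4] = 1.0
for _ in range(10):
    shares = mutate(select(shares, fitness), fitness, r_m=0.1, sigma=0.1)
print(sum(f * s for f, s in zip(fitness, shares)))  # mean fitness after 10 steps
```

Applying selection before mutation follows the order described above. The values chosen for `r_m` and `sigma` are only illustrative.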
When the concentrations are modeled over a longer period of time, the fitness of the wildtype is reached. To model evolution in which the wildtype fitness is never reached more accurately, \(\mu_{max}\) can be set to a smaller value. Most of the phages will reach this value if enough time is given. In the end, almost no phages with the lowest fitness can be found in the lagoon.
Since most mutations decrease fitness in reality, a symmetric function like the normal distribution's density function may not be the best choice. Consequently we also implemented a mutation function with a skew normal distribution \(\mathcal{S}\) that replaces \(\mathcal{N}\). It has an additional parameter \(\alpha\) that modulates the skewness. As before, the sum runs over all fitness values \(f_{j}\) whose phages can mutate into the bin around \(f_{i}\).
$$
s_{i}(t_{n + 1}) = \big(1 - r_{M}\big) \cdot s_{i}(t_{n}) + r_{M} \sum_{j = 0}^{N} s_{j}(t_{n}) \int_{f_{i} - \frac{1}{2} \cdot w}^{f_{i} + \frac{1}{2} \cdot w} \mathcal{S}_{\mu=f_{j}, \: \sigma, \: \alpha}(x) \: dx
$$
The equation for \(\frac{\partial c_{P} (t)}{\partial t}\) is changed into the following to incorporate the distributed fitness.
$$
\frac{\partial c_{P} (t)}{\partial t} = -k \cdot c_{u}(t) \cdot c_{P} (t)
+ \sum_{i = 0}^N f_{i} \cdot s_{i} \cdot \mu_{max} \cdot c_{p} (t)
$$
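As a small worked sketch of the equation above, the production term is the share-weighted mean fitness times the wildtype production rate and the concentration of phage-producing E. coli. The function name is hypothetical, and the production rate is assumed to be the wildtype rate \(\mu_{max}\) from Table 1.

```python
def phage_concentration_change(c_u, c_p, c_P, shares, fitness, k, mu_max):
    """Change of the free phage concentration with a fitness distribution.

    c_u: uninfected E. coli, c_p: phage-producing E. coli, c_P: free phage.
    shares[i] is the share of producers that carry phage with fitness[i].
    """
    mean_fitness = sum(f * s for f, s in zip(fitness, shares))
    production = mean_fitness * mu_max * c_p  # phage leaving producing cells
    infection = k * c_u * c_P                 # phage lost by infecting cells
    return production - infection


# Half of the producers carry phage with 20 % fitness, half with 100 % fitness
rate = phage_concentration_change(
    c_u=1e8, c_p=1e6, c_P=1e8,
    shares=[0.5, 0.5], fitness=[0.2, 1.0],
    k=3e-11, mu_max=16.667,
)
print(rate)
```

If all shares sit on a single fitness value, the expression reduces to the constant-fitness case.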
Other assumptions for the mutation function are possible, for example a function that randomly samples from a normal distribution but is biased towards mutations that decrease fitness.
Since this model of fitness and mutation is heavily abstracted, it is most likely limited to qualitative statements. Still, it provides an intuitive understanding of how directed evolution can work and illustrates the role of selection and mutation.
Modeling concentrations over multiple Lagoons
When a transfer from one volume to the next is performed, the new lagoon can be modeled with starting values calculated from the last lagoon's end values. For each concentration from the previous lagoon \(c_{t}\), the concentration in the next lagoon \(c_{t+1}\) is calculated as $$ c_{t+1} = \frac{v_{t} }{v_{l} } \cdot c_{t} $$ with \(v_{l}\), the volume of a lagoon, and \(v_{t}\), the volume that is transferred.
If the transferred volume is spun down before it is added to the new lagoon, only the initial value for \(c_{P}\) is calculated this way. The initial concentration of uninfected E. coli is set to the initial cell density. Initial concentrations of infected and phage-producing E. coli are set to zero, because before the transfer, no phages are present in the new lagoon. If the transfer volume is not spun down, the concentrations of infected and phage-producing E. coli are also calculated using the above formula. The initial concentration of uninfected E. coli is then calculated the same way, but the initial cell density is added.
In directed evolution the fitness should increase over time. A linear increase in fitness between two given values was implemented to show this. The problem with this approach is its basic assumption that all phage-producing E. coli are infected by phages with the same fitness. To make the model more plausible, a distribution of fitness was introduced. For a set of discrete fitness values, each fitness value's share of the phage-producing E. coli population is calculated. That changes the equation for the change in the concentration of phage-producing E. coli; the calculation is for \(N\) different fitness values \(f_{i}\) and their share of the total phage-producing E. coli population \(s_{i}\).
Numeric solutions
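The transfer rule for starting a new lagoon, as described for the spun-down and the not spun-down case, can be sketched as a small helper. The function name and the dictionary keys are hypothetical and only serve this illustration.

```python
def start_next_lagoon(end, v_t, v_l, c_u0, spun_down):
    """Starting concentrations of a new lagoon from the previous end values.

    end: dict with the keys 'uninfected', 'infected', 'producing', 'phage'.
    v_t: transferred volume, v_l: lagoon volume, c_u0: initial cell density.
    """
    dilution = v_t / v_l
    start = {"phage": dilution * end["phage"]}
    if spun_down:
        # Only phage are transferred; the new lagoon holds fresh cells only
        start["uninfected"] = c_u0
        start["infected"] = 0.0
        start["producing"] = 0.0
    else:
        # Cells are transferred as well and add to the fresh culture
        start["uninfected"] = dilution * end["uninfected"] + c_u0
        start["infected"] = dilution * end["infected"]
        start["producing"] = dilution * end["producing"]
    return start


# Made-up end values, transferred with v_t = 1 ml into v_l = 20 ml
end_values = {"uninfected": 4e8, "infected": 2e7, "producing": 6e7, "phage": 2e9}
print(start_next_lagoon(end_values, v_t=1, v_l=20, c_u0=1e8, spun_down=True))
print(start_next_lagoon(end_values, v_t=1, v_l=20, c_u0=1e8, spun_down=False))
```

The end values in the usage example are invented for illustration and are not results of the model.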
The problem described above is a system of four differential equations, of which two ( \(\frac{\partial c_{i} }{\partial t} \:, \: \frac{\partial c_{p} }{\partial t}\) ) are so-called delay differential equations. They contain a term that needs to be evaluated at a time point in the past, \(t - t_{P}\). A custom script was used to solve the problem numerically, using the explicit Euler method. The basic idea is that from a point in time at which all values and all derivatives are given, the next point in time can be calculated by assuming linear progress between the two points. $$ f(t_{n+1}) = f(t_{n}) + (t_{n+1} - t_{n}) \cdot f'(t_{n}) $$ This is performed for \(c_{u}(t)\), \(c_{i}(t)\), \(c_{p}(t)\) and \(c_{P}(t)\) in turn, so that the needed values from \(t_{n}\) are always ready for \(t_{n+1}\).
To explore how imprecise parameters and noise influence the outcome of the model, a mode was implemented that adds Gaussian noise to all parameters. It uses the function \(n\) that makes a value \(v\) noisy with a random parameter \(r\). $$ n(v) = \big(1 - 2r\big) \cdot \sigma_{G} \cdot \sigma_{v} \cdot v, \quad r \in (0, 1) $$ Here, \(\sigma_{G}\) is a factor that is the same for all \(v\), while \(\sigma_{v}\) is specific to \(v\). This way, it is possible to make one parameter noisier than another, while still being able to tune the noise globally. [Results]
Table 2: Additional Variables and Parameters used in the numeric solution of the model. List of all additional parameters and variables used in the numeric solution of this model. When possible, values are given.
Symbol | Name in source code | Value and Unit | Explanation |
---|---|---|---|
\(v_{l}\) | vl | [ml] | Volume of lagoon |
\(t_{l} \) | tl | [min] | Duration until transfer to the next lagoon |
\(c_{u}(t_{0})\) | ceu0 | [cfu] | Concentration of E. coli in a lagoon when M13 phages are transferred to it |
\(c_{P}(t_{0})\) | cp0 | [pfu] | Initial concentration of M13 phage in the first lagoon |
\(n\) | epochs | - | Number of epochs that are modeled, one epoch being everything that happens in one particular lagoon |
\(s\) | tsteps | - | Number of time steps for which numeric solutions are calculated, counted per epoch |
\(c_{P}^{min}\) | min_cp | [pfu] | Lower threshold for valid phage titers |
\(c_{P}^{max}\) | max_cp | [pfu] | Upper threshold for valid phage titers |
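The explicit Euler scheme with a delayed term can be illustrated with a minimal sketch for a single lagoon. It is not the original script. It assumes exponential growth and a constant fitness, and it implements the delayed conversion of infected into phage-producing E. coli as the infection rate one latency period \(t_{P}\) earlier, which is an assumption of this sketch and not the literal \(c_{i}(t - t_{P})\) term. The function name and the dictionary keys are hypothetical; the parameter names follow the tables.

```python
import math


def simulate_lagoon(ceu0=1e8, cp0=1e8, tu=20.0, ti=30.0, tp=40.0, tpp=10.0,
                    k=3e-11, mumax=16.667, f=0.2, tl=60.0, tsteps=600):
    """Explicit Euler integration of one lagoon (exponential growth only)."""
    dt = tl / tsteps
    delay_steps = round(tpp / dt)  # number of Euler steps spanning t_P
    ln2 = math.log(2)
    uninfected, infected, producing, phage = [ceu0], [0.0], [0.0], [cp0]
    infection_history = []  # infection rate at every past time step

    for n in range(tsteps):
        infection = k * uninfected[n] * phage[n]
        infection_history.append(infection)
        # Delayed term: look up the value from t - t_P in the stored history
        # (assumed here to be the infection rate at that earlier time)
        converted = infection_history[n - delay_steps] if n >= delay_steps else 0.0

        d_uninfected = uninfected[n] * ln2 / tu - infection
        d_infected = infected[n] * ln2 / ti + infection - converted
        d_producing = producing[n] * ln2 / tp + converted
        d_phage = f * mumax * producing[n] - infection

        # Euler update: value(t_n+1) = value(t_n) + dt * derivative(t_n)
        uninfected.append(uninfected[n] + dt * d_uninfected)
        infected.append(infected[n] + dt * d_infected)
        producing.append(producing[n] + dt * d_producing)
        phage.append(phage[n] + dt * d_phage)

    return {"t": [n * dt for n in range(tsteps + 1)], "uninfected": uninfected,
            "infected": infected, "producing": producing, "phage": phage}


result = simulate_lagoon()
print(result["phage"][-1])  # free phage concentration at the end of the lagoon
```

All derivatives are computed from the values at \(t_{n}\) before any list is extended, so no equation sees a value that already belongs to \(t_{n+1}\). The default of 600 time steps is chosen for this sketch so that the latency \(t_{P}\) is a whole number of steps.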
Parameters used for the figures
Figure 1, 3 - 20 % fitness
'capacity': 1000000000.0,
'ceu0': 100000000.0,
'cp0': 100000000.0,
'epochs': 1,
'f0': 0.2,
'f_prec': 21,
'fend': 1.0,
'fitnessmode': 'dist',
'ftype': 'const',
'growth_mode': 'exp',
'k': 3e-11,
'max_cp': 2000000000.0,
'min_cp': 100000.0,
'mumax': 16.667,
'mutation_dist': 'norm',
'noisy': 0.0,
'phageonly': 'True',
'plot_dist': 'True',
'sigma': 0.0,
'skewness': 1.0,
'ti': 30,
'tl': 60,
'to_mutate': 0.0,
'tp': 40,
'tpp': 10,
'tsteps': 100,
'tu': 20,
'vl': 20,
'vt': 1
Figure 2 - 100 % fitness
'capacity': 1000000000.0,
'ceu0': 100000000.0,
'cp0': 100000000.0,
'epochs': 1,
'f0': 1.0,
'f_prec': 21,
'fend': 1.0,
'fitnessmode': 'const',
'ftype': 'const',
'growth_mode': 'exp',
'k': 3e-11,
'max_cp': 2000000000.0,
'min_cp': 100000.0,
'mumax': 16.667,
'mutation_dist': 'norm',
'noisy': 0.0,
'phageonly': 'True',
'plot_dist': 'False',
'sigma': 0.001,
'skewness': 1.0,
'ti': 30,
'tl': 60,
'to_mutate': 0.0,
'tp': 40,
'tpp': 10,
'tsteps': 100,
'tu': 20,
'vl': 20,
'vt': 1
Figure 4 - Distributional fitness 1 h
'capacity': 1000000000.0,
'ceu0': 100000000.0,
'cp0': 100000000.0,
'epochs': 1,
'f0': 0.2,
'f_prec': 21,
'fend': 1.0,
'fitnessmode': 'dist',
'ftype': 'const',
'growth_mode': 'exp',
'k': 3e-11,
'max_cp': 2000000000.0,
'min_cp': 100000.0,
'mumax': 16.667,
'mutation_dist': 'norm',
'noisy': 0.0,
'phageonly': 'True',
'plot_dist': 'True',
'sigma': 0.0,
'skewness': 1.0,
'ti': 30,
'tl': 60,
'to_mutate': 0.0,
'tp': 40,
'tpp': 10,
'tsteps': 100,
'tu': 20,
'vl': 20,
'vt': 1
Figure 5, 6, 7 - Distributional fitness 6.5 h
'capacity': 1000000000.0,
'ceu0': 100000000.0,
'cp0': 100000000.0,
'epochs': 1,
'f0': 0.2,
'f_prec': 21,
'fend': 1.0,
'fitnessmode': 'dist',
'ftype': 'const',
'growth_mode': 'logistic',
'k': 3e-11,
'max_cp': 2000000000.0,
'min_cp': 100000.0,
'mumax': 16.667,
'mutation_dist': 'norm',
'noisy': 0.0,
'phageonly': 'True',
'plot_dist': 'True',
'sigma': 0.1,
'skewness': 1.0,
'ti': 30,
'tl': 400,
'to_mutate': 0.1,
'tp': 40,
'tpp': 10,
'tsteps': 2000,
'tu': 20,
'vl': 20,
'vt': 1