Team:UNOTT/Modelling

Modelling

During the plasmid constructions, the wet lab needed to know what to expect, and we needed to be able to test every combination. Modelling allowed us to do this easily, safely and within the time constraints.

Download our source code from our GitHub page.

Constitutive Gene Expression
Absorption and Emission Wavelengths
Gene Transcription Regulation by Repressors (CRISPR)
Relationship between Max Fluorescence and Protein Concentration
Are Our Constructions Random?

Constitutive Gene Expression For Protein and mRNA Expression over Time

During discussions with the wet lab, we concluded that gene expression would be unregulated and the gene would always be active. After reading the literature to see which model would satisfy these conditions, we found that the constitutive gene expression model was suitable to guide our model.

The first step was to take the general model from the literature and apply it to our scenario using our proteins (sfGFP, ECFP and mRFP).

$$ sfGFP \underset{Transcription}{\rightarrow} mRNA \underset{Translation}{\rightarrow} sfGFP $$
$$ mRNA \underset{Degradation}{\rightarrow} \oslash $$
$$ sfGFP \underset{Degradation}{\rightarrow} \oslash $$
This tells us that the gene undergoes transcription to produce mRNA, and the mRNA undergoes translation to produce the protein, which is expressed and fluoresces. At the same time, both the mRNA and the protein are degraded, so without continued production their concentrations would decrease to 0.

The second step was to use the Law of Mass Action to add the degradation terms, giving a more accurate concentration of protein and mRNA over time. This model can be described by the rate equations:
$$ \frac{d\,mRNA}{dt} = k_{1} - d_{1} \cdot mRNA $$ $$ \frac{d\,Protein}{dt} = k_{2} \cdot mRNA - d_{2} \cdot Protein $$

This is important because we can use this model to calculate the concentration of each protein we can expect over time. We can then use this information to calculate the total emitted light spectrum over the time period, which is what we are looking for in our system. However, the constants are specific to each protein, which means each protein needs its own model to describe its behavior. These constants were found from the literature (for GFP) and from lab results (for the rest).
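As a minimal sketch of how the two rate equations above can be turned into a concentration time course, the Python below integrates them with a simple forward-Euler loop. The function names and every constant in `PLACEHOLDER_PARAMS` are illustrative assumptions in arbitrary units; they are not the values the team took from the literature or from lab results.

```python
def simulate_constitutive(k1, d1, k2, d2, t_end, dt=0.01):
    """Forward-Euler integration of
        d(mRNA)/dt    = k1 - d1 * mRNA
        d(Protein)/dt = k2 * mRNA - d2 * Protein
    starting from zero mRNA and zero protein."""
    steps = int(round(t_end / dt))
    m, p = 0.0, 0.0
    times, mrna, protein = [0.0], [m], [p]
    for i in range(1, steps + 1):
        dm = k1 - d1 * m      # constant transcription minus mRNA degradation
        dp = k2 * m - d2 * p  # translation minus protein degradation
        m += dm * dt
        p += dp * dt
        times.append(i * dt)
        mrna.append(m)
        protein.append(p)
    return times, mrna, protein


def steady_state(k1, d1, k2, d2):
    """Concentrations at which both derivatives are zero."""
    m_ss = k1 / d1
    p_ss = k2 * m_ss / d2
    return m_ss, p_ss


# ASSUMPTION: placeholder constants in arbitrary units, for illustration only.
PLACEHOLDER_PARAMS = {"k1": 2.0, "d1": 0.2, "k2": 1.0, "d2": 0.05}

if __name__ == "__main__":
    times, mrna, protein = simulate_constitutive(t_end=400.0, **PLACEHOLDER_PARAMS)
    m_ss, p_ss = steady_state(**PLACEHOLDER_PARAMS)
    print(f"mRNA at t={times[-1]:.0f}: {mrna[-1]:.3f} (steady state {m_ss:.3f})")
    print(f"protein at t={times[-1]:.0f}: {protein[-1]:.3f} (steady state {p_ss:.3f})")
```

Setting both derivatives to zero gives the plateau the simulation approaches, which is a quick sanity check on any set of fitted constants. A library ODE solver could replace the Euler loop if higher accuracy is needed.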

Absorption and Emission Wavelengths From Given Concentrations of sfGFP, mRFP & ECFP

After settling on the general scheme, the team evaluated the selection of proteins. The proteins selected for the system are fluorescent, which means they absorb light at a certain wavelength and re-emit it at a different wavelength. This has to be considered because it tells the wet lab which wavelengths are required to produce a spectrum. It also highlights the importance of side effects, such as emitted light being reabsorbed and re-emitted at a different wavelength / color, which would make the spectra similar to each other rather than unique.

To save time when programming the model, the team used Semrock's online fluorescence graph maker, which plots the expected absorption and emission spectra of the sfGFP (green), mRFP (red) and ECFP (blue) proteins. This was done through the web app on the website. Furthermore, the site provides the raw data as a text file, which is useful because it allows the team to read the data into a stand-alone program.
The resulting graph tells us the emitted light is expected to be at a longer wavelength than the absorbed light. This must be considered in the model because the emission and absorption spectra overlap, so some emitted light may be absorbed and re-emitted at a longer wavelength.

This model is important because it tells us which wavelengths to use as parameters, especially when trying to create a specific color, and which wavelength ranges might cause overlap. It was very useful to the wet lab for the same reason: it informed them which wavelengths to use and which wavelength range to look out for.
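One rough way to put a number on the overlap between one protein's emission and another protein's absorption is sketched below. Everything here is an assumption for illustration: the spectra are stand-in Gaussian curves rather than the downloaded data, the peak positions and widths are placeholders, and the "shared area" measure is a crude heuristic rather than a physical reabsorption calculation.

```python
import math


def gaussian_spectrum(peak_nm, width_nm, wavelengths):
    """Unit-height Gaussian used as a stand-in for a measured spectrum."""
    return [math.exp(-0.5 * ((w - peak_nm) / width_nm) ** 2) for w in wavelengths]


def trapezoid_area(values, wavelengths):
    """Area under a sampled curve using the trapezoid rule."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(values) - 1)
    )


def overlap_fraction(emission, absorption, wavelengths):
    """Share of the emission curve's area that also lies under the absorption curve."""
    shared = [min(e, a) for e, a in zip(emission, absorption)]
    return trapezoid_area(shared, wavelengths) / trapezoid_area(emission, wavelengths)


if __name__ == "__main__":
    wavelengths = list(range(350, 751))  # nm, in 1 nm steps
    # ASSUMPTION: placeholder peaks and widths, not measured values.
    emission_a = gaussian_spectrum(480.0, 20.0, wavelengths)
    absorption_b = gaussian_spectrum(490.0, 20.0, wavelengths)
    fraction = overlap_fraction(emission_a, absorption_b, wavelengths)
    print(f"overlap fraction: {fraction:.2f}")
```

With real data, the two lists would be read from the raw text file and resampled onto a common wavelength grid before calling `overlap_fraction`.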

Gene Transcription Regulation by Repressors (CRISPR) - Concentration over Time

The next step in developing our simulation was to calculate our protein concentration at any given time when using CRISPR. Discussion with wet-lab revealed our method would be using CRISPR as a repressor, which works by inhibiting the expression of one or more genes by binding to the operator. The expanded mRNA and Protein concentration models from the Constitutive Gene Expression Model were modified to include the element of repression from the CRISPR inhibition.
$$ Gene \overset{Repressor}{\rightarrow} mRNA \rightarrow Protein $$
$$ mRNA \underset{Degradation}{\rightarrow} \oslash $$
$$ Protein \underset{Degradation}{\rightarrow} \oslash $$
This change can be applied to the Law of Mass Action:
$$ \frac{dm}{dt} = k_{1} \cdot \frac{k^{n}}{k^{n} + R^{n}} - d_{1} m $$ $$ \frac{dp}{dt} = k_{2} m - d_{2} p $$
Where:

m is the mRNA concentration
p is the protein concentration
R is the repressor concentration
k1 is the maximum transcription rate
k2 is the translation rate
k is the repression coefficient
n is the Hill coefficient (the number of repressors that need to cooperatively bind the promoter to trigger the inhibition of gene expression)
d1 is the mRNA degradation rate
d2 is the protein degradation rate

When visually modeled using Python:

This tells the team that constructions which underwent CRISPR inhibition are expected to produce a lower concentration of the protein whose expression we are inhibiting. This is important because the team can calculate the concentration of inhibited proteins and compare it with the control conditions, and can give the simulation the correct concentrations.
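The effect of the repression term can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's plotting script: the repressor level `R` is held constant, the repression coefficient is written `K` in the code to keep it distinct from `k1` and `k2`, and all constants in `PLACEHOLDER` are made-up values in arbitrary units.

```python
def hill_repression(R, K, n):
    """Fraction of the maximum transcription rate remaining at repressor level R."""
    return K**n / (K**n + R**n)


def simulate_repressed(k1, d1, k2, d2, K, n, R, t_end, dt=0.01):
    """Forward-Euler integration of
        dm/dt = k1 * K^n / (K^n + R^n) - d1 * m
        dp/dt = k2 * m - d2 * p
    with a constant repressor level R, starting from m = p = 0.
    Returns the final (mRNA, protein) concentrations."""
    production = k1 * hill_repression(R, K, n)
    m, p = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        dm = production - d1 * m  # repressed transcription minus mRNA degradation
        dp = k2 * m - d2 * p      # translation minus protein degradation
        m += dm * dt
        p += dp * dt
    return m, p


# ASSUMPTION: placeholder constants in arbitrary units, for illustration only.
PLACEHOLDER = {"k1": 2.0, "d1": 0.2, "k2": 1.0, "d2": 0.05, "K": 1.0, "n": 2}

if __name__ == "__main__":
    for R in (0.0, 1.0, 3.0):
        m, p = simulate_repressed(R=R, t_end=400.0, **PLACEHOLDER)
        print(f"R={R}: mRNA={m:.2f}, protein={p:.2f}")
```

Running the loop with `R = 0` reproduces the unrepressed (control) case, so the same function covers both the inhibited constructions and the control they are compared against.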

Relationship between Max Fluorescence and Protein Concentration

To calculate sample constants before the lab results were in, we looked at lab results from similar studies in the literature. This data was graphed first, and as the graph resembled a power law of the form:

$$ y = k x ^ {n} $$

we fitted this form by non-linear regression and found that the data followed a fit of:

$$ y = 100.2 x ^{1.43154} $$
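A common way to fit a power law of this form is ordinary least squares on the log-transformed data, since log y = log k + n log x is a straight line. The sketch below uses that technique as an assumption; the regression method the team actually used is not stated. The sample points are synthetic, generated from the reported fit purely to show that the routine recovers k and n, and are not the literature data.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = k * x**n by linear least squares on (log x, log y).
    Requires all x and y values to be positive."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    count = len(log_x)
    mean_x = sum(log_x) / count
    mean_y = sum(log_y) / count
    s_xx = sum((a - mean_x) ** 2 for a in log_x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    n = s_xy / s_xx                      # slope of the log-log line
    k = math.exp(mean_y - n * mean_x)    # intercept, transformed back
    return k, n


if __name__ == "__main__":
    # ASSUMPTION: synthetic points generated from the reported fit.
    xs = [0.5, 1.0, 2.0, 4.0, 8.0]
    ys = [100.2 * x ** 1.43154 for x in xs]
    k, n = fit_power_law(xs, ys)
    print(f"k = {k:.4f}, n = {n:.5f}")
```

Note that fitting in log space weights relative rather than absolute errors, so on noisy data it can give slightly different constants from a direct non-linear fit.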

Are Our Constructions Random?

When constructing our proteins with our current method, there were 3 vectors we could order from:

$$ \textrm{sgRNA plasmid} \left\{\begin{matrix} 1 & 2 & 3\\ 1 & 1 & 1\\ 1 & 2 & 3 \end{matrix}\right. $$
$$ \textrm{etc.} \therefore \textrm{there are 64 variations of arrangement} $$
$$ \therefore \textrm{1 / 64 chance of each variation, which is randomly constructed} $$
$$ \textrm{Order of Plasmid Bricks} \begin{Bmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \\ 3 & 2 & 1 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \\ 1 & 3 & 2 \\ \end{Bmatrix} $$

Types of brick used

1 in 12 promoters per brick
1 in 3 terminators per brick
1 in 3 fluorescent per brick
1 in 102 proteins per brick

Therefore the probability of any one combination is the product of three factors:

sgRNA vector arrangement: 1 in 64
Order of bricks: 1 in 6
Types of brick used: 1 in 102

Since 102 x 6 x 64 = 39168, any combination has a probability of 1 in 39168.
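The arithmetic above can be checked with a short Python snippet. The counts 64 and 102 are taken directly from the text; the variable names are illustrative, and the six brick orders are enumerated rather than assumed.

```python
from itertools import permutations

SGRNA_ARRANGEMENTS = 64  # number of sgRNA vector arrangements, from the text
BRICK_TYPES = 102        # number of brick type choices, from the text

# Every ordering of three bricks labelled 1, 2 and 3.
brick_orders = list(permutations([1, 2, 3]))

total_combinations = SGRNA_ARRANGEMENTS * len(brick_orders) * BRICK_TYPES
probability = 1 / total_combinations

if __name__ == "__main__":
    print(f"brick orders: {len(brick_orders)}")
    print(f"total combinations: {total_combinations}")
    print(f"probability of one combination: {probability:.3e}")
```

Multiplying the counts in this way treats the three choices as independent and each option as equally likely, which is the assumption behind the 1 in 39168 figure.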

The randomness comes from the fact that the system relies on Brownian motion, a random process, to create these combinations.

However, for a movement to count as Brownian motion, the process must have continuous paths. This is not true here, because once the structures begin to form, the paths stop (the parts do not collide elastically with each other but combine instead). Furthermore, there would be no transposition once the construct is in the bacterium; otherwise the result would become biased towards options that put less metabolic stress on the bacterium.