# Modelling

A major problem the project faced was that comparing the fluorescent proteins experimentally across all possible combinations would take too long.

To address this, the team set out to model the fluorescence spectra emitted by the proteins over time under different conditions. First, the type of gene expression had to be identified. The model was then modified to account for the effects of inhibition and, finally, applied over time to see how much expression occurs in a given time period. The team used mathematical models based on Ordinary Differential Equations (ODEs) because they are easy to translate into code, which lets us build components for the simulation.

• Constitutive Gene Expression
• Absorption and Emission Wavelengths
• Gene Transcription Regulation by Repressors (CRISPRi)
• Relationship between Max Fluorescence and Protein Concentration
• Are Our Constructions Random?

## Constitutive Gene Expression for Protein and mRNA Expression over Time

Biological insight told us that we needed a model with constant gene expression. We investigated models from the literature [1] to see which would satisfy this condition, and found that the constitutive gene expression model was suitable to guide our own.

The first step was to take the general model from the literature and apply it to our scenario using our proteins (GFP, ECFP, RFP).

Figure 1 $$sfGFP \underset{Transcription}{\rightarrow} mRNA \underset{Translation}{\rightarrow} sfGFP$$

The scheme above describes how the gene undergoes transcription to produce mRNA. The mRNA carries the genetic information copied from the DNA, which codes for the protein. The expression of the protein can therefore be measured by its fluorescence, which is the desired output of the system.

Figure 2 $$mRNA \underset{Degradation}{\rightarrow} \varnothing$$ $$sfGFP \underset{Degradation}{\rightarrow} \varnothing$$

The two reactions above state that, at the same time, the protein and the mRNA undergo degradation, which lowers their concentrations. However, since protein and mRNA are always being produced, production and degradation eventually balance and the concentrations settle at constant values. [2]

We can apply the Law of Mass Action to combine these reactions into equations for the concentrations of mRNA and protein over time. The model can be described as:

Figure 3 $$\frac{\mathrm{d}\,mRNA}{\mathrm{d}t} = k_{1} - d_{1} \cdot mRNA$$ $$\frac{\mathrm{d}\,Protein}{\mathrm{d}t} = k_{2} \cdot mRNA - d_{2} \cdot Protein$$

Where...

• mRNA is the concentration of mRNA
• Protein is the concentration of Protein
• $k_{1}$ is the constitutive transcription rate. This represents the number of mRNA molecules produced per gene, per unit of time.
• $d_{1}$ is the mRNA degradation rate.
• $k_{2}$ is the translation rate. This represents the number of protein molecules produced per mRNA molecule, per unit of time.
• $d_{2}$ is the protein degradation rate.

This model is important because it lets us calculate the protein concentrations we can expect over time. From these concentrations we can then calculate the total emitted light spectra during the time period, which is the output we are looking for in our system. However, the constants are specific to each protein, which means the parameters had to be found for each protein separately. They were found using the literature [3] (for GFP) and lab results (for the rest).
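As a rough illustration of how the model in Figure 3 can be turned into code, the sketch below integrates the two equations with a hand-written fourth-order Runge-Kutta step and compares the result with the steady state obtained by setting both rates of change to zero ($mRNA^{*} = k_{1}/d_{1}$, $Protein^{*} = k_{1}k_{2}/(d_{1}d_{2})$). The function name and all parameter values are placeholders chosen for the example. They are not the team's fitted values.

```python
def simulate_constitutive(k1, d1, k2, d2, t_end, dt=0.01, m0=0.0, p0=0.0):
    """Integrate d(mRNA)/dt = k1 - d1*mRNA and d(Protein)/dt = k2*mRNA - d2*Protein.

    Uses the classical fourth-order Runge-Kutta method with a fixed step dt.
    Returns a list of (time, mRNA, protein) tuples.
    """
    def rhs(m, p):
        return k1 - d1 * m, k2 * m - d2 * p

    steps = int(round(t_end / dt))
    m, p = m0, p0
    trajectory = [(0.0, m, p)]
    for i in range(1, steps + 1):
        a_m, a_p = rhs(m, p)
        b_m, b_p = rhs(m + 0.5 * dt * a_m, p + 0.5 * dt * a_p)
        c_m, c_p = rhs(m + 0.5 * dt * b_m, p + 0.5 * dt * b_p)
        e_m, e_p = rhs(m + dt * c_m, p + dt * c_p)
        m += dt * (a_m + 2 * b_m + 2 * c_m + e_m) / 6.0
        p += dt * (a_p + 2 * b_p + 2 * c_p + e_p) / 6.0
        trajectory.append((i * dt, m, p))
    return trajectory


# Placeholder parameters in arbitrary units (assumptions, not measured values).
K1, D1, K2, D2 = 2.0, 0.2, 5.0, 0.05

trajectory = simulate_constitutive(K1, D1, K2, D2, t_end=300.0)
t_final, m_final, p_final = trajectory[-1]

# Analytical steady state: set both derivatives to zero and solve.
m_steady = K1 / D1
p_steady = K1 * K2 / (D1 * D2)

print(f"t = {t_final:.0f}: mRNA = {m_final:.3f}, protein = {p_final:.3f}")
print(f"steady state: mRNA = {m_steady:.3f}, protein = {p_steady:.3f}")
```

A fixed-step integrator keeps the example free of dependencies. A library ODE solver would work equally well once real parameters are available.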

[1] GB Stan, 20137. Modeling in Biology. London, United Kingdom: Imperial College London, pp. 59-65.

[2] See the non-inhibited conditions in Figure 5, Gene Transcription Regulation by Repressors (CRISPRi) - Concentration over Time.

[3] See Relationship between Max Fluorescence and Protein Concentration for more details.

## Absorption and Emission Wavelengths from Given Concentrations of sfGFP, mRFP & ECFP

After settling on the general scheme, the team evaluated the selection of proteins. The proteins selected for the system are fluorescent: they absorb light at one wavelength and re-emit it at a different wavelength. This has to be considered because it tells the wet-lab which wavelengths are required to produce a spectrum, and it highlights possible side effects, such as emitted light being reabsorbed and re-emitted at a different wavelength (color), which would make the spectra more similar to each other rather than unique.

To save time when programming the model, the team used Semrock's online fluorescence graph maker [1], which plots the expected absorption and emission wavelengths of the sfGFP (green), mRFP (red) and ECFP (blue) proteins. This was done through the web app on the website. The site also provides the raw data as text files, which was useful because it allows the team to read the data into a standalone program.

Figure 4

This graph tells us that the emitted light is expected to be at a longer wavelength than the absorbed light. The model must also account for the overlap between emission and absorption wavelengths, which implies that emitted light may be reabsorbed and re-emitted at a longer wavelength.

This model is important because it guides our choice of wavelengths as parameters: it shows which wavelengths to use when trying to create a specific color, and which wavelengths to watch out for because they might cause overlap. It was very useful to the wet-lab, as it told them which wavelengths and wavelength ranges to use to produce different fluorescence spectra.
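One way to put a number on this overlap is sketched below. It scores how much of one protein's emission falls where another protein absorbs, using the emission-weighted mean of the relative absorption. This metric is our own illustrative choice, and the Gaussian curves, peak positions and widths are rough placeholders standing in for the measured spectra from the downloaded text files.

```python
import math


def gaussian_spectrum(peak_nm, width_nm, wavelengths):
    """Toy spectrum: a Gaussian with peak height 1 (stand-in for measured data)."""
    return [math.exp(-0.5 * ((w - peak_nm) / width_nm) ** 2) for w in wavelengths]


def trapezoid(values, xs):
    """Integrate sampled values over xs with the trapezoid rule."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    )


def reabsorption_score(emission, absorption, wavelengths):
    """Emission-weighted mean of the relative absorption.

    0 means the emitted light falls where nothing is absorbed; values near 1
    mean the emission sits where absorption is close to its peak.
    """
    weighted = [e * a for e, a in zip(emission, absorption)]
    return trapezoid(weighted, wavelengths) / trapezoid(emission, wavelengths)


wavelengths = list(range(350, 751))  # nm, 1 nm steps

# Placeholder peaks and widths (assumptions for illustration only).
ecfp_emission = gaussian_spectrum(475, 20, wavelengths)
sfgfp_absorption = gaussian_spectrum(485, 20, wavelengths)
mrfp_absorption = gaussian_spectrum(584, 20, wavelengths)

score_gfp = reabsorption_score(ecfp_emission, sfgfp_absorption, wavelengths)
score_rfp = reabsorption_score(ecfp_emission, mrfp_absorption, wavelengths)

print(f"ECFP emission vs sfGFP absorption: {score_gfp:.3f}")
print(f"ECFP emission vs mRFP absorption:  {score_rfp:.5f}")
```

With real spectra, the three `gaussian_spectrum` calls would be replaced by the intensity columns read from the raw data files, sampled on a common wavelength grid.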

1

## Gene Transcription Regulation by Repressors (CRISPRi) - Concentration over Time

The next step in developing our simulation was to calculate the protein concentration at any given time when using CRISPRi. Discussion with the wet-lab revealed that our method would use CRISPRi as a repressor, which works by binding to the promoter region and inhibiting the expression of one or more genes [1]. The mRNA and protein concentration equations from the Constitutive Gene Expression model [2] were therefore modified to include repression by CRISPRi.

$$Gene \overset{Repressor}{\rightarrow} mRNA \rightarrow Protein$$ $$mRNA \underset{Degradation}{\rightarrow} \varnothing$$ $$Protein \underset{Degradation}{\rightarrow} \varnothing$$
##### Applying the Law of Mass Action [3] to this scheme gives:
$$\frac{\mathrm{d}m}{\mathrm{d}t} = k_{1} \cdot \frac{k^{n}}{k^{n} + R^{n}} - d_{1} m$$ $$\frac{\mathrm{d}p}{\mathrm{d}t} = k_{2} m - d_{2} p$$

Where...

• $m$ is the mRNA concentration
• $p$ is the protein concentration
• $R$ is the repressor concentration
• $k_{1}$ is the maximum transcription rate
• $k_{2}$ is the translation rate
• $k$ is the repression coefficient
• $n$ is the Hill coefficient: the number of repressors that need to bind the promoter cooperatively to trigger the inhibition of gene expression
• $d_{1}$ is the mRNA degradation rate
• $d_{2}$ is the protein degradation rate
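To see what the repression term does, the sketch below evaluates the Hill factor $k^{n}/(k^{n} + R^{n})$ and the resulting steady-state concentrations (found by setting both rates of change to zero) for a few repressor levels. The function names and every number are placeholders for illustration, not values from the literature or the lab.

```python
def hill_repression(repressor, k, n):
    """Fraction of maximum transcription remaining at a given repressor level."""
    return k ** n / (k ** n + repressor ** n)


def repressed_steady_state(k1, d1, k2, d2, repressor, k, n):
    """Steady-state (mRNA, protein) of the repressed model.

    Obtained by setting dm/dt = 0 and dp/dt = 0:
    m* = k1 * hill / d1 and p* = k2 * m* / d2.
    """
    m_star = k1 * hill_repression(repressor, k, n) / d1
    p_star = k2 * m_star / d2
    return m_star, p_star


# Placeholder parameters in arbitrary units (assumptions only).
K1, D1, K2, D2 = 2.0, 0.2, 5.0, 0.05
K_REP, N_HILL = 10.0, 2

_, p_control = repressed_steady_state(K1, D1, K2, D2, 0.0, K_REP, N_HILL)
for repressor in (0.0, 10.0, 50.0):
    _, p_star = repressed_steady_state(K1, D1, K2, D2, repressor, K_REP, N_HILL)
    print(f"R = {repressor:5.1f}: protein = {p_star:8.2f} "
          f"({p_star / p_control:.1%} of control)")
```

With no repressor the factor is 1 and the model reduces to the constitutive case. When $R = k$ the transcription rate is halved, which is why $k$ is a convenient measure of repressor strength.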

The values of these constants and variables were initially taken from the literature or calculated [4], and were later adjusted to fit the lab results.

Figure 6

Figure 6 shows that the structures which underwent CRISPRi inhibition are expected to produce a lower concentration of the protein whose expression we are inhibiting. This is important because it means the team can calculate the concentrations of inhibited proteins and compare them to the control conditions, and it gives the correct concentrations for the simulation.

[4] See Relationship between Max Fluorescence and Protein Concentration

## Relationship between Max Fluorescence and Protein Concentration

Fluorescence depends on the concentration of fluorescent protein expressed over time. However, to find the values of the constants for the modified constitutive gene expression model with CRISPRi [1], calculations were needed. These allowed us to develop a model suited to the lab's needs.

The team first looked into lab results from similar studies in the literature [2]. This was useful because the team could develop a model and later substitute in the lab results to fit the wet-lab work accurately. This data underwent non-linear regression to a power law of the form:

Figure 7 $$y = k x ^ {n}$$

After applying the regression, the data was found to follow a fit of:

Figure 8 $$y = 100.2 x ^{1.43154}$$
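A power law of this form can also be fitted without a non-linear solver, because taking logarithms gives the straight line $\ln y = \ln k + n \ln x$. The sketch below uses ordinary least squares on the log-transformed data. This is a simpler stand-in for the non-linear regression described above and can give slightly different parameters on noisy data. The sample points are synthetic, generated from the fitted curve in Figure 8, and are not lab measurements.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = k * x**n by least squares on (ln x, ln y).

    Requires all xs and ys to be strictly positive. Returns (k, n).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    count = len(xs)
    mean_x = sum(log_x) / count
    mean_y = sum(log_y) / count
    s_xx = sum((lx - mean_x) ** 2 for lx in log_x)
    s_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    n = s_xy / s_xx                      # slope of the log-log line
    k = math.exp(mean_y - n * mean_x)    # intercept, transformed back
    return k, n


# Synthetic, noise-free points generated from the curve in Figure 8.
xs = [0.5, 1.0, 2.0, 4.0, 8.0]
ys = [100.2 * x ** 1.43154 for x in xs]

k_fit, n_fit = fit_power_law(xs, ys)
print(f"k = {k_fit:.4f}, n = {n_fit:.5f}")
```

On noise-free data the fit recovers the generating parameters. With real measurements, the residuals should be inspected to judge whether a power law is appropriate at all.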

## Are Our Constructions Random?

When constructing our proteins with our current method, there were three vectors we could choose from, so we could have combinations such as 1,2,3 or 1,1,1.

However, in this proof of concept order is irrelevant, as each is either constitutively expressed or inhibited. The system therefore has only $2^{3} = 8$ combinations.
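The count can be checked by listing every combination explicitly. The short sketch below does this with `itertools.product`, assuming for illustration that the three independently controlled components are the three fluorescent proteins used elsewhere on this page.

```python
from itertools import product

# Assumption for illustration: one two-state switch per fluorescent protein.
PROTEINS = ("sfGFP", "mRFP", "ECFP")
STATES = ("expressed", "inhibited")

# Every assignment of a state to each protein; order of assembly is ignored.
combinations = [
    dict(zip(PROTEINS, states))
    for states in product(STATES, repeat=len(PROTEINS))
]

for combo in combinations:
    print(", ".join(f"{name}: {state}" for name, state in combo.items()))
print(f"total combinations: {len(combinations)}")
```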

The randomness comes from the fact that the system relies on Brownian motion, a random process, to create these combinations.

However, for a movement to count as Brownian motion, the process must have continuous paths. This is not true here: once the structures begin to form, the paths stop (the components do not bounce off each other elastically but instead combine). Furthermore, the bacterium might become biased towards options that put less metabolic stress on it, which results in selection.