Team:UNOTT/Software

Image Comparison Software

Comparing images of spectra from two different colonies to check for similarity

Key.Coli Verification

Comparing the raw data of two different colonies straight from the fluorescence reader

Another method of comparing fluorescence spectra is by taking raw data and comparing them cell by cell. During the development of the software, the team found that the data held the same format in terms of spacing when outputted by the fluorescence reader. This made it far easier to write a data comparison algorithm.

By using Java and the libraries that support the spreadsheet format, the team was able to compare data sets directly by reading the value of each cell and calculating the difference. Each difference was then checked against a threshold value; if it was above the threshold, the check failed and the user was locked out.
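The cell-by-cell check described above can be sketched as follows. This is a minimal illustration, not the team's actual code: the class and method names are hypothetical, the readings are assumed to have already been loaded from the spreadsheet into arrays, the difference is assumed to be an absolute difference, and the intensities are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a cell-by-cell comparison between two fluorescence readings. */
final class ColonyComparison {

    /**
     * Returns the indices of cells whose absolute difference exceeds the threshold.
     * An empty list means every cell passed the check.
     */
    static List<Integer> failingCells(double[] mother, double[] key, double threshold) {
        if (mother.length != key.length) {
            throw new IllegalArgumentException("Readings must have the same number of cells");
        }
        List<Integer> failures = new ArrayList<>();
        for (int i = 0; i < mother.length; i++) {
            double difference = Math.abs(mother[i] - key[i]); // assumed: absolute difference
            if (difference > threshold) {
                failures.add(i);
            }
        }
        return failures;
    }

    /** The user is let in only when no cell fails the check. */
    static boolean isUnlocked(double[] mother, double[] key, double threshold) {
        return failingCells(mother, key, threshold).isEmpty();
    }

    public static void main(String[] args) {
        double[] mother = {120.0, 340.5, 560.2, 410.0}; // made-up intensities
        double[] key = {121.5, 338.0, 555.0, 409.1};    // made-up intensities
        System.out.println(isUnlocked(mother, key, 10.0) ? "Access granted" : "Locked out");
    }
}
```

Returning the failing indices rather than a bare boolean makes it easy to see which cells caused a lockout when tuning the threshold.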

A threshold value is how much a colony can vary from the mother colony before it is no longer considered valid. One issue is that the threshold has to change over time to allow for a larger variation, because the longer a colony is away from the mother colony, the more it differs from it. In order to calculate a threshold value at any given time, a Polynomial Fit of Order 3 is calculated using the data from the mother colony. To calculate the Polynomial Fit, the model in Figure 6 was translated into Java code.

Figure 6

$$ \color{white}{ y_{i} = \beta_{0} + \beta_{1}x_{i} + \beta_{2}x^2_{i} + \dots + \beta_{m}x^m_{i} + \varepsilon_{i} \quad (i = 1,2,\dots,n) } $$

This can be written in matrix notation as:

$$ \color{white}{ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_1 & x^2_1 & \dots & x^m_1 \\ 1 & x_2 & x^2_2 & \dots & x^m_2 \\ 1 & x_3 & x^2_3 & \dots & x^m_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x^2_n & \dots & x^m_n \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \vdots \\ \varepsilon_n \end{bmatrix} } $$

This simplifies to:

$$ \color{white}{ \vec{y} = X \vec{\beta} + \vec{ \varepsilon } }$$

Where...

$$ \color{white}{ \vec{y} \text{ holds the observed values of the dependent variable} } $$ $$ \color{white}{ X \text{ is the design matrix, whose rows hold the powers of each } x_{i} } $$ $$ \color{white}{ \vec{\beta} \text{ holds the polynomial coefficients, with } \beta_{0} \text{ the y-intercept} } $$ $$ \color{white}{ \vec{\varepsilon} \text{ holds the random error of each data point} } $$
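If the coefficients are chosen by ordinary least squares (an assumption here, as the fitting criterion is not named), the estimate that minimises the sum of squared errors is the standard normal-equation solution, which exists whenever there are at least m + 1 distinct x values:

$$ \color{white}{ \hat{\vec{\beta}} = (X^{T}X)^{-1}X^{T}\vec{y} } $$

For a fit of order 3, m = 3, so X has four columns and four coefficients are estimated.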

This was implemented using for loops to cycle through an array of data points. Only the mother colony's data was used, to create a threshold for each data point. The Polynomial Fit outputs an equation with a variable X as its input; substituting for X the data point value that was used to create the fit gives the threshold value at that point. This threshold value could be adjusted by adding to it or subtracting from it, giving an upper limit and a lower limit. The Key.Coli intensity was compared to these limits using selection statements; if the key colony's data point was not between the lower limit and the upper limit, the user was locked out. The team decided that Polynomial Fitting was appropriate because it followed the points most closely when graphed in Excel.
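The fit-then-compare procedure above could look roughly like the sketch below. It is an illustration under stated assumptions rather than the team's implementation: the coefficients are found by least squares via the normal equations, solved with Gaussian elimination; the class, method and parameter names (including `margin`) are hypothetical; and the x values and intensities are made up.

```java
/** Sketch: order-3 least-squares polynomial fit used to build a per-point threshold band. */
final class PolynomialThreshold {

    /** Fits y ~ b0 + b1*x + ... + b_order*x^order by solving the normal equations. */
    static double[] fit(double[] x, double[] y, int order) {
        int n = order + 1;
        double[][] a = new double[n][n + 1]; // augmented matrix [X^T X | X^T y]
        for (int k = 0; k < x.length; k++) {
            double[] powers = new double[2 * order + 1];
            powers[0] = 1.0;
            for (int p = 1; p < powers.length; p++) {
                powers[p] = powers[p - 1] * x[k];
            }
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    a[i][j] += powers[i + j];
                }
                a[i][n] += powers[i] * y[k];
            }
        }
        // Gaussian elimination with partial pivoting.
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int row = col + 1; row < n; row++) {
                if (Math.abs(a[row][col]) > Math.abs(a[pivot][col])) {
                    pivot = row;
                }
            }
            double[] tmp = a[col];
            a[col] = a[pivot];
            a[pivot] = tmp;
            if (Math.abs(a[col][col]) < 1e-12) {
                throw new ArithmeticException("Singular system: more distinct x values are needed");
            }
            for (int row = col + 1; row < n; row++) {
                double factor = a[row][col] / a[col][col];
                for (int j = col; j <= n; j++) {
                    a[row][j] -= factor * a[col][j];
                }
            }
        }
        // Back substitution.
        double[] beta = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double sum = a[i][n];
            for (int j = i + 1; j < n; j++) {
                sum -= a[i][j] * beta[j];
            }
            beta[i] = sum / a[i][i];
        }
        return beta;
    }

    /** Evaluates the fitted polynomial at x using Horner's method. */
    static double evaluate(double[] beta, double x) {
        double result = 0.0;
        for (int i = beta.length - 1; i >= 0; i--) {
            result = result * x + beta[i];
        }
        return result;
    }

    /** True when the key reading lies within +/- margin of the fitted value at x. */
    static boolean withinBand(double[] beta, double x, double keyIntensity, double margin) {
        double expected = evaluate(beta, x);
        return keyIntensity >= expected - margin && keyIntensity <= expected + margin;
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4, 5};      // hypothetical x values
        double[] y = {1, 3, 11, 31, 69, 131}; // made-up mother-colony intensities (1 + x + x^3)
        double[] beta = fit(x, y, 3);
        System.out.println(withinBand(beta, 2.0, 12.5, 2.0) ? "Within threshold" : "Locked out");
    }
}
```

The `margin` argument plays the role of the adjustment added to or subtracted from the fitted value to form the upper and lower limits.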

The system won't let the user in as the colonies are too different

The system lets the user in as the colonies are nearly identical and fall within the threshold

Download our source code

Fluorescence Spectra Simulation

Simulating fluorescence spectra from given protein concentrations

Random Number Generation

Generating random numbers from our randomly constructed colonies

When we spoke to our industry contacts about Key.Coli, they were very interested in seeing its capabilities as a Random Number Generation tool. After obtaining results from the random constructions, the team set out to find whether the values could be used to generate a string of random numbers. The ability to produce a persistent but randomly generated state is central to Key.Coli's core value of information security.

There were two ways to generate a string of random numbers from the colonies: either generate a string of numbers from three colonies (one acting as a Minimum and another as a Maximum for the range), or treat each colony as a random number by itself.

On investigating the first method, out of 3 colonies, one was assigned the role of Minimum, having the lowest fluorescence intensity, another the role of Maximum, having the highest, and the third acted as the median. Using the equation INT(((MED[...] - MIN[...]) / MAX[...]) * 255), the team could generate a string of random numbers by inputting the fluorescence values over time. The INT(...) function rounds the number down to an integer, so no decimal places appear. MIN returns the intensity of the colony with the lowest intensity, MAX that of the highest, and MED the median. The result was multiplied by 255 to produce a number out of 255, which is the largest value an unsigned byte can hold. The results are shown in Figure 8.
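As an illustration, the scaling equation can be applied to each time point as in the sketch below. The class and method names are hypothetical, the readings are made up, and the parenthesisation ((MED - MIN) / MAX) * 255 is assumed from the description.

```java
/** Sketch of the three-colony scaling equation applied to readings over time. */
final class ThreeColonyScaler {

    /** Applies INT(((MED - MIN) / MAX) * 255) to one set of readings. */
    static int scale(double min, double med, double max) {
        if (max <= 0) {
            throw new IllegalArgumentException("MAX reading must be positive");
        }
        // Math.floor mirrors Excel's INT function, which rounds down.
        return (int) Math.floor(((med - min) / max) * 255);
    }

    /** Produces one number per time point from the three colonies' readings. */
    static int[] generate(double[] minColony, double[] medColony, double[] maxColony) {
        if (minColony.length != medColony.length || medColony.length != maxColony.length) {
            throw new IllegalArgumentException("All colonies need the same number of readings");
        }
        int[] numbers = new int[minColony.length];
        for (int t = 0; t < numbers.length; t++) {
            numbers[t] = scale(minColony[t], medColony[t], maxColony[t]);
        }
        return numbers;
    }

    public static void main(String[] args) {
        double[] min = {100, 110}; // made-up intensities, one entry per time point
        double[] med = {150, 170};
        double[] max = {200, 220};
        for (int value : generate(min, med, max)) {
            System.out.println(value);
        }
    }
}
```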

Figure 8

For comparison, the other values are taken from other pseudo-random and random number generators ¹

The graph shows that Key.Coli-generated numbers tend towards a lower range of numbers. This was confirmed when checking for a normal distribution; the set of numbers was found to be biased to the top and bottom of the set ranges, which suggests that three colonies scaled over time cannot be used to generate a set of random numbers.

In the other method, the team found that one colony can be used for one number (in our case, the values were 18, 128 and 125). Theoretically, these are random as the colonies were constructed in a random fashion using Brownian Motion ². However, due to time and resource constraints, it is currently impossible to create the 200,000 colonies required for testing, but this may be achievable through automation.

This is still very useful: it means a colony can be used as a random seed value for a random number generator. Furthermore, one way to get a set of numbers from one colony is to break it up. The team did this and found that each colony, despite being genetically similar, had a different level of fluorescence from the others.
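One way the seeding idea could be used is sketched below. How colony values would actually be combined into a seed is not specified, so packing one byte per colony into a `long` is purely an assumption, and the class and method names are hypothetical. The values 18, 128 and 125 are the ones reported above.

```java
import java.util.Random;

/** Sketch: using colony-derived values as the seed of a pseudo-random number generator. */
final class ColonySeed {

    /** Packs byte-sized colony values (0..255) into a single long, one byte per colony. */
    static long toSeed(int... colonyValues) {
        if (colonyValues.length > 8) {
            throw new IllegalArgumentException("At most 8 values fit in a 64-bit seed");
        }
        long seed = 0L;
        for (int value : colonyValues) {
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Colony values must be in 0..255");
            }
            seed = (seed << 8) | value;
        }
        return seed;
    }

    public static void main(String[] args) {
        long seed = toSeed(18, 128, 125);
        Random generator = new Random(seed); // same seed always gives the same sequence
        for (int i = 0; i < 5; i++) {
            System.out.println(generator.nextInt(256));
        }
    }
}
```

Because the generator is deterministic for a given seed, the same colony values reproduce the same sequence, which matches the idea of a persistent but randomly generated state. `java.util.Random` is not suitable for generating security keys; `java.security.SecureRandom` is the usual choice for that purpose.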

Future projects should feel free to use Key.Coli to generate true random numbers. In the future, the team would like to investigate the random nature of the Key.Coli system more thoroughly.

¹ RAND was generated using =INT(NORM.INV(RAND(),XX,XX)) in Excel, Atmospheric Noise was taken from Random.org, and Fortuna is used in some Unix-based operating systems to generate security keys.

² See Modelling's Are Our Constructions Random

Linux Key.Coli Security Layer

Porting our comparison software to low end hardware to safeguard a system

As a final wrap-up for the project, all the software and modelling was put together to create an additional security layer on top of Linux for the Raspberry Pi. A Raspberry Pi is a very low-cost, low-end computer favoured by enthusiasts and computer hobbyists; it is designed to be easy to program, as the hardware comes unlocked. We chose it because it would give us the fewest issues when it came to editing the security protocols of Linux.

Raspberry Pi

This was done to show people how Key.Coli could be used to secure a computer from strangers who don't have the Key.Coli but might know the password, as well as to give us a physical demonstration of Key.Coli to show at the Jamboree.

The system was designed as a program that loads when Linux boots up. It locks out the user by shutting down all possible inputs other than an input from the reader. In order to "unlock" the computer, the user needs to connect the Raspberry Pi to two different fluorescence readers: one for the mother colony, and one for the Key.Coli mechanism. The data sets from both colonies are compared ¹ and, if the difference falls within the threshold, the computer unfreezes.

The usage is shown in a video here:

In the video, the Key.Coli software can be seen running. This secures the computer; when the user tries to move their mouse, it does not respond because all inputs have been blocked. A prompt on screen tells the user to plug in a fluorescence reader. Since we were not in the lab, the user instead used a USB stick holding the fluorescence data to represent the fluorescence reader. Later on, this will be patched so that the system only takes inputs from a reader. Once the data is inputted, it is compared with another data set from the mother colony and the user is let through. In the video, you can see this as the user is able to move their mouse again.

This was achieved by taking the Key.Coli Verification software developed earlier and modifying it to support the file system on the Pi.

However, due to health and safety regulations, for the Jamboree we took readings under lab conditions and stored them on USB sticks, which act as the Key.Coli and the mother colony. This is similar to how the actual system would work, except that it uses USB drives instead of fluorescence readers.

SOFTWARE

Overview

About our software and why iGEM Nottingham chose to produce it