Team:SVCE CHENNAI/Software



Promoter Strength Predictor

One of the first things an iGEM team does once it has settled on an idea is to choose a promoter that suits its needs. Depending on the project, a team might require a weak promoter, one with a medium level of expression, or one with high expression levels. Teams often spend a significant amount of time and resources characterizing their promoter using a fluorescent protein. SVCE’s iGEM team aims to ease that burden. We have built a machine-learning-based tool capable of predicting the strength of sigma 70 promoters in E. coli.

How is this tool tailor-made for iGEM teams?

This tool is well suited to iGEM teams because the training data set consists of the 19 Anderson promoters. The output is easy to interpret because it is expressed relative to the strongest Anderson promoter.

How does it work?

The machine learning algorithm used is multivariate linear regression, with the parameters optimized through gradient descent. The two variables of the model are the -10 and -35 hexamers of the sigma 70 promoters. The output values are the natural logarithm of the promoter's strength relative to the strongest Anderson promoter. A correlation coefficient of 0.69 is observed on the cross-validation set, indicating that the platform works as expected.
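As a rough illustration of the fitting step described above, the following Python sketch fits a two-variable linear model by batch gradient descent. It is not the team's code: the numeric encoding of each hexamer (here a hypothetical score between 0 and 1), the learning rate, the iteration count and the synthetic training values are all assumptions made for the example.

```python
def fit_linear_regression(features, targets, learning_rate=0.5, n_iterations=5000):
    """Fit targets ~ w0 + w1*x1 + w2*x2 by batch gradient descent on mean squared error."""
    n = len(targets)
    rows = [[1.0] + list(row) for row in features]  # prepend a constant bias term
    weights = [0.0] * len(rows[0])
    for _ in range(n_iterations):
        residuals = [
            sum(w * x for w, x in zip(weights, row)) - target
            for row, target in zip(rows, targets)
        ]
        gradient = [
            sum(r * row[j] for r, row in zip(residuals, rows)) / n
            for j in range(len(weights))
        ]
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


def predict_log_strength(weights, score_35, score_10):
    """Predicted natural log of strength relative to the reference promoter."""
    return weights[0] + weights[1] * score_35 + weights[2] * score_10


# Synthetic, made-up training set (NOT Anderson promoter measurements):
# hypothetical hexamer scores on a grid, with targets generated from known weights.
TRUE_WEIGHTS = [-2.0, 0.8, 1.2]
scores = [(a, b) for a in (0.0, 0.5, 1.0) for b in (0.0, 0.5, 1.0)]
log_strengths = [TRUE_WEIGHTS[0] + TRUE_WEIGHTS[1] * a + TRUE_WEIGHTS[2] * b
                 for a, b in scores]

weights = fit_linear_regression(scores, log_strengths)
```

Because the model predicts a natural logarithm, a predicted value of 0 corresponds to a promoter as strong as the reference, and negative values correspond to weaker promoters.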

How to use it?

The user will have to inspect the promoter sequence and identify the -10 and -35 hexamers of the promoter whose strength they intend to predict. The output is then compared with the strengths of the Anderson promoters; the closest match indicates the expected expression level.
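The interpretation step can be sketched in a few lines of Python. This is a minimal illustration, not part of the tool: the function name and the reference table are hypothetical, and the three placeholder strengths below stand in for the measured relative strengths of the Anderson promoters, which the user would supply.

```python
import math


def closest_reference(log_strength, reference_strengths):
    """Convert a predicted log strength to a relative strength and
    return it together with the name of the nearest reference promoter."""
    relative = math.exp(log_strength)  # undo the natural logarithm
    nearest = min(
        reference_strengths,
        key=lambda name: abs(reference_strengths[name] - relative),
    )
    return relative, nearest


# Placeholder reference table (hypothetical names and values, for illustration only).
reference = {"ref_strong": 1.0, "ref_medium": 0.5, "ref_weak": 0.1}

relative, nearest = closest_reference(math.log(0.45), reference)
```

Here a predicted log strength of ln(0.45) maps back to a relative strength of 0.45, which is nearest to the placeholder entry with strength 0.5.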
GITHUB - Promoter Strength Predictor


Given how often teams use different riboswitches as regulatory components of their projects, we built a riboswitch class classifier. The current iteration of the platform can identify four classes: purine riboswitches, lysine riboswitches, molybdenum cofactor riboswitches and SAM-IV riboswitches.

How does it work?

The machine learning algorithm used was a feed-forward neural network trained with the backpropagation algorithm. The feature vector consists of the mono-nucleotide and di-nucleotide frequencies of the riboswitches. The activation function used was the tanh function. Once a sequence is entered, its riboswitch class can be verified. The model was evaluated with diagnostics such as precision, recall and F-score. The accuracy of the model was observed to be 94%.
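To make the feature vector concrete, here is a minimal Python sketch that computes the 4 mono-nucleotide and 16 di-nucleotide frequencies of a sequence and passes them through a single tanh layer. It is an illustration under stated assumptions rather than the team's implementation: the RNA alphabet, the conversion of T to U, the ordering of the features, and the helper names are all choices made for the example, and the network weights would come from training.

```python
import math
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; DNA input is converted below
DINUCLEOTIDES = ["".join(pair) for pair in product(ALPHABET, repeat=2)]


def feature_vector(sequence):
    """Return 4 mono-nucleotide frequencies followed by 16 di-nucleotide frequencies."""
    seq = sequence.upper().replace("T", "U")
    if len(seq) < 2 or any(base not in ALPHABET for base in seq):
        raise ValueError("sequence must contain at least two A/C/G/U(T) bases")
    mono = [seq.count(base) / len(seq) for base in ALPHABET]
    # Overlapping pairs: a sequence of length n has n - 1 di-nucleotides.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    di = [pairs.count(d) / len(pairs) for d in DINUCLEOTIDES]
    return mono + di


def tanh_layer(inputs, weights, biases):
    """One fully connected layer with tanh activation: tanh(W x + b)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


features = feature_vector("ACGU")
```

Each sequence, whatever its length, is reduced to the same 20 numbers, which is what allows riboswitches of different lengths to be fed to a fixed-size network.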

How to use it?

All the user will have to do is enter the riboswitch sequence to verify its class.


Palaniappan, Ashok, Bharanikumar, Ramit, & Premkumar, Keshav Aditya R. (2017, October 30). PromoterPredict/PromoterStrengthPredictor: Web server and Standalone dynamic model builder and predictor. Zenodo.