Integrated Human Practices
Machine Learning in Synthetic Biology
While building our machine-learning-based software tool, we realized how much impact machine learning could have on synthetic biology and on the data in the Registry of Standard Biological Parts. Ever since the inception of iGEM, the Registry has been accumulating characterization data about different parts. We are now at a point where machine-learning tools can leverage the data in the Registry and other biological databases to make powerful inferences or to aid synthetic biologists in the lab.
To find out how other teams view this, we sent them a survey with the following questions:
- The Registry of Standard Biological Parts has been growing by large amounts of data every year, which has led to an abundance of certain types of parts (promoters, RBSs, chromoproteins, quorum sensing modules, etc.). (Y/N)
- If a new part, say a new promoter, is being added to the Registry, and the Registry has enough data about promoters to build a machine learning routine capable of characterizing the promoter without tedious experiments in the lab, how readily would you accept such a tool?
- In the future, about 10 years from now, when the data available in the Registry has increased exponentially and teams are developing highly reliable tools that characterize newly added parts by training a machine learning tool on existing parts, do you think iGEM/the Registry should make such routines available through the Registry as a reliable means of characterizing newly added parts? If you think they should, let us know what kind of impact this would have on the iGEM community.
Conclusion From Responses To Survey
‘I think an algorithm can never replace a biological experiment as the ultimate proof (which does not mean I wouldn’t use a ML approach to get a first guess about a sequence’s function)’ – Team Heidelberg 2017.
- All teams generally agreed that the Registry now holds enough data for at least rudimentary machine learning algorithms to be applied to it.
- However, they also asserted that they would only use such models if the predictions correlated really well with experimental data.
- They believe that if a really useful tool exists, the Registry could possibly list it so that teams can use it more easily.
- However, even if a team has built a tool with predictive capabilities, they would not use it as the final node of inference and would still carry out biological experiments.
- Hence, it can be concluded that teams are very welcoming towards machine learning, are looking forward to it and are already applying it to data in the Registry. However, they consider that machine learning, like modelling, can only support teams' experimental results and should not be the sole method for understanding the function of a part or characterizing it.
Engineering the pH riboswitch to have a BioBrick scar
We observed that the wild-type pH riboswitch has a 4 bp scar, which differs from the 6 bp BioBrick scar. Teams have generally dealt with a different scar by simply replacing it with the BioBrick one and making compensatory changes so that base pairing, and with it the RNA secondary structure, is preserved. In making the riboswitch compatible with the BioBrick standard, we faced an additional complication: the two scars differ in the number of base pairs. Although we tried engineering the riboswitch to carry the BioBrick scar site, our attempts did not reproduce the secondary structure and Gibbs free energy of the natural wild-type riboswitch.
We then reached out to multiple professors who have worked with pH regulatory elements and explained our project and our problem with engineering the pH riboswitch to accommodate the BioBrick scar.
Prof. Gal Nechooshtan at the Cold Spring Harbor Laboratory provided us with critical knowledge on how mutations affect the pH riboswitch and how a scar site of a different size from the natural one can be accommodated by following certain rules of RNA secondary structure design. He also pointed us to appropriate reading material, from which we learnt how to engineer the riboswitch to carry the BioBrick scar while retaining the same secondary structure and Gibbs free energy.
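The idea of compensatory changes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 14-nt hairpin, the pair list and the function names `broken_pairs` and `compensate` are hypothetical, and the sequence is not the real pH riboswitch. The sketch covers only same-length substitutions; replacing a 4 bp scar with a 6 bp one would also shift the indices of the paired positions, and comparing Gibbs free energies would need a folding tool such as RNAfold from the ViennaRNA package.

```python
# Allowed RNA base pairs: Watson-Crick pairs plus the G-U wobble pair.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def broken_pairs(seq, pairs):
    """Return the (i, j) index pairs from `pairs` that cannot base pair in `seq`."""
    return [(i, j) for i, j in pairs if (seq[i], seq[j]) not in VALID_PAIRS]


def compensate(seq, pairs):
    """Repair each broken pair by setting the 3' partner (index j) to the
    Watson-Crick complement of the 5' base (index i)."""
    bases = list(seq)
    for i, j in broken_pairs(seq, pairs):
        bases[j] = COMPLEMENT[bases[i]]
    return "".join(bases)


# Hypothetical hairpin: positions 0-4 pair with positions 13-9, loop at 5-8.
stem = [(0, 13), (1, 12), (2, 11), (3, 10), (4, 9)]
wild_type = "GGACUGAAAAGUCC"

# Pretend a scar swap changed the bases at positions 1 and 2 (GA -> CU).
edited = "GCUCUGAAAAGUCC"

print(broken_pairs(wild_type, stem))  # no broken pairs in the starting hairpin
print(broken_pairs(edited, stem))     # the two pairs disrupted by the swap
print(compensate(edited, stem))       # partner bases changed to restore pairing
```

Restoring complementarity in this way keeps the stem intact, but it does not guarantee an unchanged free energy, because a G-C pair and an A-U pair contribute differently to stability. That is why the folded structure and energy still have to be checked after every change.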
Over the years, teams trying to engage and educate the public about synthetic biology have sometimes imparted only an abstract overview of the subject. We wanted to go a step further, so that an individual who did not know much biology would learn a lot about synthetic biology.
For this we designed a biohackathon – a hackathon where students had to build a tool that incorporated synthetic biology in some way. A fortnight prior to the hackathon we began a rigorous two-week workshop for computer science students, teaching them the basics of molecular biology and synthetic biology. In the first week we focused on molecular biology: DNA, transcription, translation and operon theory. In the second week we covered synthetic biology: the BioBrick standard, 3A assembly, the role of fluorescent proteins and how synthetic biology is the driving force of standardization in biology.
A total of 7 teams participated in the 48-hour biohackathon, and the top two teams were awarded certificates and a cash prize. Both built really good tools. The team from IIT Madras won first place for software based on codon optimization for BioBrick assembly, and second place went to a team from the computer science department of SVCE, who created a game based on BioBrick assembly.