In synthetic biology, we often introduce new pathways into bacteria that do not naturally express them, and sometimes even create proteins that did not previously exist. A novel pathway produces exotic enzymes and proteins, and the host bacterium does not necessarily have the internal environment needed to organise these macromolecular products. Ultimately, this could be detrimental to the performance of both the pathway and the organism itself. Additionally, the activity of the pathway can vary with the organism used and be difficult to characterise against other models. Thus, we aim to standardize the microenvironmental activity of different pathways within the cell by localising the associated enzymes and proteins in an RNA-based structure, so that the pathway acts in a predictable way regardless of the organism.
Fig. 1: A) Representation of single-stranded RNA forming a liquid-liquid phase-separated-like structure. B) Images of RNA aggregates formed inside the nuclei of different mutant mammalian cells expressing RNA that contains either CUG or CAG repeats.
In mammalian cells, RNA containing triplet nucleotide repeats such as CAGCAGCAGCAG has been observed to aggregate in the nucleus. The properties of these RNA aggregates are similar to those of liquid-liquid phase-separated molecules, which can be visualized as oil droplets in water. The densely packed RNA strands allow small molecules or substrates to pass through the structure while maintaining a distinct internal environment. Using this idea, we aimed to express repeat-containing RNA in bacterial cells in order to develop an intracellular scaffold. The scaffold will act as the basis of a synthetic organelle, which could be used both to standardize pathway activity in vivo and as a specialised organelle in cell-free systems.
In Vivo Scaffolding
The modelling of the organelle's kinetics (see the modelling details) showed that when the rate constant (k) of the reaction combining A and B is low, the homogeneous mixture has a higher production rate for this non-specific reaction. However, when k is high and the reaction is more specific, cells with organelles have a higher production rate for the reaction of A and B than cells without them. The model indicates that the organelle favours specific reactions with high rate constants.
Fig. 2: The whole-cell production rate with or without the organelle, versus the reaction rate constant of A and B. The model shows the expected performance of the organelle at different values of the rate constant (k).
Fig. 3: A) A schematic of the construct used to generate RNA aggregates that contain MS2 aptamers. B) The non-specific binding of FA and FB forms a complete GFP. C) The FA produced with MS2 is subject to competition between binding FB and binding the RNA aggregate.
FA-FB System
The FA-FB system is a visualisation method that uses the two components of a split GFP, FA and FB, which can weakly bind together to form a complete GFP. This technology is used to study protein-protein or protein-RNA interactions by bimolecular fluorescence complementation (BiFC). However, the background signal is very high in bacteria because of non-specific binding, and it leads to artifacts even in plant and mammalian cell experiments. We chose a specific version of FA and FB (sequences available in the parts registry) for testing. These fragments are not expected to bind each other spontaneously, but in practice they usually give a high background. For protein-RNA interaction studies, the two components are designed to be brought together through specific aptamers and their respective binding domains, allowing the GFP to be re-formed at the site of interest. Here, the FA component is expressed as a fusion with an MS2 binding domain, which recognises the MS2 aptamer, a specific RNA binding site derived from the MS2 bacteriophage. By expressing the CAG repeat sequence together with MS2 aptamers (Fig. 3A), specific binding between the RNA organelle and the FA-MS2 fusion protein can occur on the RNA aggregate. This localises FA within the cell.
In Vivo Results
Fig. 4: The RNA organelle abolishes the non-specific binding signal between the FA and FB split-GFP fragments caused by overexpression, and its advantage is greater when the synthesis rate is high, confirming our theoretical analysis (see details here). From left to right: flow cytometry results (>50000 cells, not gated), microscope images, and the histogram of signals in the same image, with or without the RNA organelle. From top to bottom: high inducer concentration (full induction) to no inducer. Strain and construct: BL21AI; FA and FB are under T7lac promoters, and the 49xCAG-12xMS2 RNA repeats are under the pLtetO-1 promoter; copy numbers are below 50.
Fig. 5: A) GFP expression in overnight culture. B) Labelled regions with a fluorescence intensity of at least 3000 (pixel value). C) The charted intensity of the regions of interest labelled in B.
Fig. 6: A) GFP expression in 20 h culture. B) Labelled regions with a fluorescence intensity of at least 3000 (pixel value). C) The charted intensity of the regions of interest labelled in B.
Fig. 4 shows that as the expression of FA and FB is increased through arabinose induction, the overall GFP signal increases, which reflects the non-specific binding of FA and FB. In the cultures in which formation of the RNA organelle was induced alongside the FA and FB components, the background signal is much lower. This can be explained by the low affinity between FA and FB: when the RNA organelle rapidly recruits all of the FA, the FB in the cytoplasm is unable to find its binding partner. As predicted by our model, the RNA organelle removes non-specific reactions in synthetic systems. To make sure the experiments are valid, we also showed that the FA-FB binding reaction still occurs, though at a very low rate. Because GFP complementation is irreversible and the components bound through the MS2 coat protein cannot diffuse away from the RNA organelles, we expect the GFP signal to accumulate inside the RNA organelle in a long-running experiment. Fig. 5 and Fig. 6 show the fluorescence of RNA organelles in exponential phase (3 to 4 hours after induction) and in late stationary phase (20 hours), respectively, confirming this. This is also strong evidence that the aggregates are long-lived, as they can still be seen after many hours. The data support the concept that the synthetic organelle may be used in a cell-free system.
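The region labelling described in the captions of Fig. 5 and Fig. 6 (areas with a pixel value of at least 3000, then a per-region intensity chart) can be illustrated with a minimal sketch. The function names, the 4-connected flood fill and the tiny demo image below are assumptions made for illustration only; they are not the software or the data actually used for the figures.

```python
from collections import deque


def label_regions(image, threshold=3000):
    """Group 4-connected pixels with value >= threshold into regions.

    `image` is a list of rows of pixel values. Returns a list of regions,
    each a list of (row, col) coordinates. Hypothetical helper.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from an unvisited bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and image[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def region_stats(image, regions):
    """Return (area in pixels, mean intensity) for each labelled region."""
    stats = []
    for region in regions:
        values = [image[y][x] for y, x in region]
        stats.append((len(values), sum(values) / len(values)))
    return stats


# Made-up pixel values for demonstration, not experimental data.
demo = [
    [100, 3200, 3500, 100],
    [100, 3100, 100, 100],
    [100, 100, 100, 4000],
]
regions = label_regions(demo)
stats = region_stats(demo, regions)
for index, (area, mean) in enumerate(stats, start=1):
    print(f"region {index}: area={area} px, mean intensity={mean:.1f}")
```

Charting the mean intensity per region would then give a plot of the kind shown in panel C. A fixed threshold is the simplest choice; it assumes that images being compared were acquired with the same exposure settings.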
Synthesis of CAG Repeats
Fig. 7: A) The product of repeat synthesis by random assembly of 10xCAG and 10xCTG oligonucleotides at initial amounts of 0 μl, 2 μl, 3 μl and 4 μl. B) A schematic representation of the random assembly of the 10xCAG and 10xCTG sequences.
A collection of repeat sequences was built using two oligonucleotides: 10xCAG and 10xCTG. By testing a variety of oligonucleotide concentrations and PCR conditions, a specific protocol was developed to synthesise DNA of various lengths containing CAG repeats. The resulting product appeared as a smear, indicating that a range of lengths was created. The product was subsequently inserted into a T7-containing vector in order to produce RNA containing CAG repeats.
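Why two short oligonucleotides can yield a smear of longer products can be illustrated with a toy simulation. The 10xCTG oligonucleotide is the reverse complement of 10xCAG, so the two strands can anneal in any register that is a multiple of three bases, and fill-in of a staggered duplex gives a longer repeat. The sketch below is a simplification under stated assumptions: the overlap is drawn uniformly at random, every pairing is treated as extendable, and the pool size, cycle count and function names are hypothetical. It is not the protocol or a quantitative model of the PCR.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_once(len_a, len_b, rng):
    """One annealing and fill-in step; lengths are in repeat units.

    Assumption: the strands overlap by a uniformly random number of
    repeats (at least one), and fill-in yields a duplex spanning both.
    """
    overlap = rng.randint(1, min(len_a, len_b))
    return len_a + len_b - overlap


def simulate(n_molecules=1000, cycles=5, start=10, seed=0):
    """Toy assembly: pair molecules at random each cycle and extend them."""
    rng = random.Random(seed)
    pool = [start] * n_molecules
    for _ in range(cycles):
        partners = pool[:]
        rng.shuffle(partners)
        pool = [extend_once(a, b, rng) for a, b in zip(pool, partners)]
    return pool


# The two starting oligonucleotides are exact reverse complements.
print(reverse_complement("CAG" * 10) == "CTG" * 10)

lengths = simulate()
print(f"repeat units after 5 cycles: min={min(lengths)}, max={max(lengths)}")
```

Because the overlap differs from molecule to molecule, the pool ends up with a broad distribution of repeat counts rather than a single length, which is consistent with the product appearing as a smear rather than a discrete band.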