Team:TU-Eindhoven/Model/Rule based model



Introduction to NFsim

As already mentioned, traditional stochastic or ODE-based simulation methods require every possible reaction to be defined and every possible molecular state to be named. Because the constructs we are interested in are multivalent, they can form a very complex and highly connected network, and it would be impossible to define this network beforehand. Since many systems are too complex for the traditional methods, researchers have developed rule-based modeling languages such as the BioNetGen Language (BNGL). The main advantage of BNGL is that only the reaction rules need to be specified, not every individual reaction. To make this possible, molecules are specified in a different way, namely by their components and the states those components can be in. A component can form a bond with a component of another molecule, and its state can change, which may be required before it can bind another molecule. In short, rule-based modeling requires only the definition of the rules, and the computer works out the rest.[1]

BNGL models can be simulated with the Network-Free Stochastic Simulator (NFsim). NFsim is especially useful for systems with large reaction networks and a high degree of combinatorial complexity. It tracks only the molecules and complexes that actually exist in the system, rather than considering every possible configuration.[1]

We also developed a software tool for NFsim. If you want to try modeling a system yourself, see our "Software" page.

Defining the rules

After installing NFsim on Windows (with the addition of ActivePerl), Mac, or Linux by following the NFsim Manual, we wrote a BNGL file with rules matching our designed system. We started by defining the parameters, which include the association and dissociation rates of the defined parts. In the next section we defined the molecule types, that is, the different parts and constructs. We defined them as follows (visualized in Figure 1):
  • The Scaffold Construct with its three binding pockets, where each pocket can have two different states, one where the pocket is empty and one where the inducer is bound.
  • The Binding Partner, which has one site for binding to one scaffold pocket and one site for binding to the Center Point.
  • The Center Point with four binding sites, where each site can bind to a different Binding Partner.
  • The inducer, without any binding sites or states.
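
A minimal sketch of how the parameters and molecule types blocks might look in BNGL. The molecule names (Scaffold, Partner, Center, Inducer), the component names (p, s, c, b), the state labels (e, i) and all parameter names and values are hypothetical placeholders chosen for illustration; they are not taken from the actual model file.

```bngl
begin parameters
  # Placeholder rate constants (hypothetical values, not measured or fitted)
  kon_i    1.0   # inducer binding to an empty pocket
  koff_i   0.1   # inducer release from a pocket
  kon_sp   1.0   # Binding Partner binding to an induced pocket
  koff_sp  0.1   # Binding Partner release from a pocket
  kon_cp   1.0   # Binding Partner binding to the Center Point
  koff_cp  0.1   # Binding Partner release from the Center Point
end parameters

begin molecule types
  Scaffold(p~e~i,p~e~i,p~e~i)  # three pockets, each empty (e) or inducer-bound (i)
  Partner(s,c)                 # s binds a scaffold pocket, c binds the Center Point
  Center(b,b,b,b)              # four equivalent binding sites
  Inducer()                    # no binding sites or states
end molecule types
```

Giving the three pockets the same component name p treats them as equivalent, so a single rule written for p applies to every pocket.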

The next step is the definition of the species, where you set the initial amount of each of the already defined molecule types. It is not sufficient to mention only the molecule type, as you also need to define the state of its pockets. We chose to start with only empty pockets on the Scaffold Construct.
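
As an illustration, a seed species block could look like the sketch below. The molecule and component names are hypothetical (a Scaffold with three pockets p that are either empty, e, or inducer-bound, i), and the counts are placeholders rather than the amounts used in the actual simulation.

```bngl
begin seed species
  # Every component and state must be given explicitly; counts are placeholders
  Scaffold(p~e,p~e,p~e)  100   # all three pockets start empty
  Partner(s,c)           300   # unbound Binding Partners
  Center(b,b,b,b)        100   # Center Points with four free sites
  Inducer()              300   # free inducer
end seed species
```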
In the observables section, you give each observable a name and then specify the molecule whose amount you want to track. This can be the molecule in a specific state (for example with its pockets filled with the inducer) or, more generally, the total amount of the molecule (independent of the state of its pockets or its bonds with other molecules). It is also possible to define a bond between two molecules and count the number of such bonds. At the end of the simulation, the output is a .gdat file containing the amount of each observable at every time step.
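
The three kinds of observables described above might be written as in the following sketch. The observable names and the molecule, component and state names are hypothetical, and the comments describe what each pattern is intended to track.

```bngl
begin observables
  Molecules  ScaffoldTotal    Scaffold()                # any scaffold, regardless of state or bonds
  Molecules  PocketInduced    Scaffold(p~i!?)           # pocket in the inducer-bound state, with or without a partner
  Molecules  PartnerOnCenter  Partner(c!1).Center(b!1)  # bond between a Binding Partner and a Center Point
end observables
```

Each observable name would then appear as a column header in the .gdat output.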
The last section is the one on which the whole simulation is based: the reaction rules. Here you define the rules for all possible reactions and assign each a rate defined in the parameters. Our reaction rules include:
  • The binding of one of the Center Point binding sites to a Binding Partner.
  • The binding of an inducer to the pocket of the Scaffold Construct.
  • The binding of a Binding Partner to a pocket of the Scaffold Construct, which requires that the pocket already has an inducer bound to it.
  • For each of the above, also the reverse reaction, as the system is dynamic. Both directions can be defined in the same reaction rule by simply using <-> instead of ->.
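
The rules listed above could be sketched in BNGL as follows. All names are hypothetical: the rate constants (kon_cp, koff_cp, kon_i, koff_i, kon_sp, koff_sp) would have to be defined in the parameters block, and the molecule and component names in the molecule types block. Modeling inducer binding as a change of the pocket state that consumes one free Inducer is an assumption based on the description of the molecule types, not a confirmed detail of the actual model file.

```bngl
begin reaction rules
  # A free Center Point site binds the Center Point site of a Binding Partner
  Center(b) + Partner(c) <-> Center(b!1).Partner(c!1)  kon_cp, koff_cp

  # An inducer fills an empty scaffold pocket (pocket state e -> i)
  Scaffold(p~e) + Inducer() <-> Scaffold(p~i)  kon_i, koff_i

  # A Binding Partner binds only to a pocket that already holds an inducer
  Scaffold(p~i) + Partner(s) <-> Scaffold(p~i!1).Partner(s!1)  kon_sp, koff_sp
end reaction rules
```

Because a component written without a bond must be unbound, the reverse of the second rule only releases the inducer from a pocket that has no Binding Partner attached in this sketch.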

The BNGL file ends with the definition of the simulation to be executed, including its end time and number of steps. Additionally, by adding one extra line, you can let the program generate an XML file, which is useful for running the simulation in another way and gives the option to generate even more output files. The last two lines of the BNGL file can look like:
  • simulate_nf({suffix=>nf,t_end=>…,n_steps=>…,get_final_state=>1});
  • writeXML();

The final state is written to a .species file, which lists all the molecules in their final state. In a very large network this can lead to a very long name for a single complex, as the many bonds connecting the different molecules result in one large species definition.
The other output files that are generated are an additional XML file and one file containing the data of the observables (.gdat).
The XML file can be used to run the simulation in another way, with additional flags. The "dump" flag is very useful if complex formation is an important result. The files generated with dump can be analyzed easily with the MATLAB functions provided by NFsim. We wrote our own script that uses the provided functions to extract and visualize the desired data.

The results of the model can be found in our "Model Results" section.

[1] M.W. Sneddon, J.R. Faeder and T. Emonet, "Efficient modeling, simulation and coarse-graining of biological complexity with NFsim", Nature Methods, vol. 8, no. 2, pp. 177-183, 2011.