RACIPE Formalism
Functions related to RACIPE simulation.
GRN Parsing and ODE Generation Functions
grins.reg_funcs.psH(nod, fld, thr, hill)
Positive Shifted Hill function.
Parameters:
-
nod
(float
) –The node expression value.
-
fld
(float
) –The fold change.
-
thr
(float
) –The half-maximal threshold value.
-
hill
(float
) –The hill coefficient.
Returns:
-
float
–The value of the Positive Shifted Hill function.
Source code in grins/reg_funcs.py
grins.reg_funcs.nsH(nod, fld, thr, hill)
Negative Shifted Hill function.
Parameters:
-
nod
(float
) –The node expression value.
-
fld
(float
) –The fold change.
-
thr
(float
) –The half-maximal threshold value.
-
hill
(float
) –The hill coefficient.
Returns:
-
float
–The value of the Negative Shifted Hill function.
Source code in grins/reg_funcs.py
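As a rough illustration of what the two functions above compute, the sketch below uses the shifted Hill form common in the RACIPE literature. It is an assumption that grins.reg_funcs follows exactly this form, including the division by the fold change in the activating case. The names shifted_hill_neg and shifted_hill_pos are hypothetical, and plain Python floats stand in for the array types the library may use.

```python
def shifted_hill_neg(nod, fld, thr, hill):
    """Assumed inhibitory form: fld + (1 - fld) / (1 + (nod / thr) ** hill)."""
    return fld + (1.0 - fld) / (1.0 + (nod / thr) ** hill)


def shifted_hill_pos(nod, fld, thr, hill):
    """Assumed activating form: the same expression rescaled by fld."""
    return shifted_hill_neg(nod, fld, thr, hill) / fld


# At nod == thr the Hill term 1 / (1 + (nod / thr) ** hill) equals 0.5
half_inhib = shifted_hill_neg(nod=4.0, fld=0.5, thr=4.0, hill=2.0)
half_activ = shifted_hill_pos(nod=4.0, fld=2.0, thr=4.0, hill=2.0)
```

Under this assumed form, an inhibitory edge moves the regulated rate from 1 (no regulator) towards fld (saturating regulator), while an activating edge moves it from 1/fld towards 1.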
grins.gen_diffrax_ode.gen_diffrax_odesys(topo_df, topo_name, save_dir='.')
Generate the ODE system code for diffrax based on the given topology dataframe.
Args: topo_df (pandas.DataFrame): The topology dataframe containing the edges information. topo_name (str): The name of the topology. save_dir (str, optional): The directory to save the generated code. Defaults to ".".
Returns: None: Saves the generated file in the directory specified by save_dir.
Source code in grins/gen_diffrax_ode.py
Parameters and Initial Conditions Generation Functions
grins.gen_params.parse_topos(topofile, save_cleaned=False)
Parses and cleans the given topofile and returns the dataframe. The topo file is expected to be a tab-separated file with three columns: Source, Target, and Type. White spaces can also be used to separate the columns.
For nodes that are not alphanumeric, the function will replace the non-alphanumeric characters with an underscore and prepend "Node_" if the node name does not start with an alphabet. The cleaned topology file will be saved if the save_cleaned flag is set to True.
Parameters:
-
topofile
(str
) –The path to the topofile.
-
save_cleaned
(bool
, default:False
) –If True, save the cleaned topology file. Defaults to False.
Returns:
-
topo_df
(DataFrame
) –The parsed dataframe.
Source code in grins/gen_params.py
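The node-name cleaning rule described for parse_topos can be sketched with the standard library. The helper clean_node_name below is hypothetical and only mirrors the rule as stated in the description; the library's own handling of edge cases may differ.

```python
import re


def clean_node_name(name):
    """Hypothetical helper mirroring the cleaning rule described above."""
    # Replace every non-alphanumeric character with an underscore
    cleaned = re.sub(r"[^0-9A-Za-z]", "_", name)
    # Prepend "Node_" when the name does not start with a letter
    if not cleaned[:1].isalpha():
        cleaned = "Node_" + cleaned
    return cleaned


examples = {name: clean_node_name(name) for name in ["miR-200", "9orf", "ZEB1"]}
```

Here "miR-200" becomes "miR_200", "9orf" becomes "Node_9orf", and "ZEB1" is left unchanged.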
grins.gen_params.gen_param_names(topo_df)
Generate parameter names based on the given topology dataframe.
Parameters:
-
topo_df
(DataFrame
) –The topology dataframe containing the information about the nodes and edges.
Returns:
-
tuple
–A tuple containing the parameter names, unique target node names, and unique source node names.
Source code in grins/gen_params.py
grins.gen_params.sample_distribution(method, dimension, num_points, std_dev=None, optimise=False)
Generates a sample distribution based on the specified method.
Parameters:
-
method
(str
) –The sampling method to use. Options are "Sobol", "LHS", "Uniform", "LogUniform", "Normal", "LogNormal".
-
dimension
(int
) –The number of dimensions for the sample points.
-
num_points
(int
) –The number of sample points to generate.
-
std_dev
(Optional[float]
, default:None
) –The standard deviation for the "Normal" and "LogNormal" distributions. Defaults to 1.0 if not provided.
-
optimise
(bool
, default:False
) –Whether to optimise the sampling process. Applicable for "Sobol" and "LHS" methods.
Returns:
-
ndarray
–An array of sample points generated according to the specified method.
Raises:
-
ValueError
–If an unknown sampling method is specified.
Source code in grins/gen_params.py
grins.gen_params.get_thr_ranges(source_node, topo_df, prange_df, num_params=2 ** 10)
Calculate the median threshold range for a given source node.
Parameters:
-
source_node
(str
) –The source node for which the threshold range is calculated.
-
topo_df
(DataFrame
) –DataFrame containing the topology information.
-
prange_df
(DataFrame
) –DataFrame containing the parameter ranges.
-
num_params
(int
, default:2 ** 10
) –Number of parameters to sample. Defaults to 1024.
Returns:
-
float
–The median threshold range for the given source node.
Source code in grins/gen_params.py
grins.gen_params.gen_param_range_df(topo_df, num_params=2 ** 10, sampling_method='Sobol', thr_rows=True)
Generate a parameter range DataFrame from the topology DataFrame.
Parameters:
-
topo_df
(DataFrame
) –The topology DataFrame containing the network structure.
-
num_params
(int
, default:2 ** 10
) –The number of parameters to generate. Default is 1024.
-
sampling_method
(Union[str, dict]
, default:'Sobol'
) –The sampling method to use. Can be a string specifying a single method for all parameters or a dictionary specifying methods for individual parameters. Default is "Sobol".
-
thr_rows
(bool
, default:True
) –Whether to add threshold rows to the DataFrame. Default is True.
Returns:
-
DataFrame
–A DataFrame containing the parameter ranges and sampling methods.
Source code in grins/gen_params.py
grins.gen_params.gen_param_df(prange_df=None, num_params=2 ** 10, topo_df=None, sampling_method='Sobol', thr_rows=True)
Generate the final parameter DataFrame by sampling parameters. Parameters are grouped by their 'Sampling' (and 'StdDev' if present) to ensure that parameters in the same group follow the same distribution in the higher dimensions. The final DataFrame columns are arranged in the same order as in prange_df. The sampling methods can be: 'Sobol', 'LHS', 'Uniform', 'LogUniform', 'Normal', 'LogNormal'.
Parameters:
-
prange_df
(DataFrame
, default:None
) –DataFrame with columns ["Parameter", "Minimum", "Maximum", "Sampling", ...].
-
num_params
(int
, default:2 ** 10
) –Number of samples to generate per parameter. Default is 1024.
-
topo_df
(DataFrame
, default:None
) –DataFrame containing the network topology information.
-
sampling_method
(Union[str, dict]
, default:'Sobol'
) –The sampling method to use. Can be a string specifying a single method for all parameters or a dictionary specifying methods for individual parameters. Default is "Sobol". The methods can be: 'Sobol', 'LHS', 'Uniform', 'LogUniform', 'Normal', 'LogNormal'.
-
thr_rows
(bool
, default:True
) –Whether to add threshold rows to the DataFrame. Default is True.
Returns:
-
DataFrame
–DataFrame of sampled and scaled parameters.
Source code in grins/gen_params.py
grins.gen_params.gen_init_cond(topo_df, num_init_conds=2 ** 10)
Generate initial conditions for each node based on the topology.
Parameters:
-
topo_df
(DataFrame
) –DataFrame containing the topology information.
-
num_init_conds
(int
, default:2 ** 10
) –Number of initial conditions to generate. Default is 2**10.
Returns:
-
DataFrame
–DataFrame containing the generated initial conditions for each node.
Source code in grins/gen_params.py
Simulation Related Functions
grins.racipe_run.gen_sim_dirstruct(topo_file, save_dir='.', num_replicates=3)
Generate directory structure for simulation run.
Parameters:
-
topo_file
(str
) –Path to the topo file.
-
save_dir
(str
, default:'.'
) –Directory to save the generated structure. Defaults to ".".
-
num_replicates
(int
, default:3
) –Number of replicates to generate. Defaults to 3.
Returns:
-
None
–Directory structure is created with the topo file name and one folder per replicate (three by default).
Source code in grins/racipe_run.py
grins.racipe_run.gen_topo_param_files(topo_file, save_dir='.', num_replicates=3, num_params=2 ** 10, num_init_conds=2 ** 7, sampling_method='Sobol')
Generate parameter files for simulation.
Parameters:
-
topo_file
(str
) –The path to the topo file.
-
save_dir
(str
, default:'.'
) –The directory where the parameter files will be saved. Defaults to ".".
-
num_params
(int
, default:2 ** 10
) –The number of parameter files to generate. Defaults to 2**10.
-
num_init_conds
(int
, default:2 ** 7
) –The number of initial condition files to generate. Defaults to 2**7.
-
sampling_method
(Union[str, dict]
, default:'Sobol'
) –The method to use for sampling the parameter space. Defaults to 'Sobol'. For a finer control over the parameter generation look at the documentation of the gen_param_range_df function and gen_param_df function.
Returns:
-
None
–The parameter files and initial conditions are generated and saved in the specified replicate directories.
Source code in grins/racipe_run.py
grins.racipe_run.load_odeterm(topo_name, simdir)
Loads an ODE system from a specified topology module and returns an ODETerm object.
Parameters:
-
topo_name
(str
) –The name of the topology module to import.
-
simdir
(str
) –The directory path where the topology module is located.
Returns:
-
ODETerm
–An object representing the ODE system.
Raises:
-
ImportError
–If the specified module cannot be imported.
-
AttributeError
–If the module does not contain an attribute named 'odesys'.
Source code in grins/racipe_run.py
grins.racipe_run.topo_simulate(topo_file, replicate_dir, initial_conditions, parameters, t0=0.0, tmax=200.0, dt0=0.01, tsteps=None, rel_tol=1e-05, abs_tol=1e-06, max_steps=2048, batch_size=10000, ode_term_dir=None)
Simulates the ODE system defined by the topology file and saves the results in the replicate directory. The ODE system is loaded as a diffrax ODE term, and the initial conditions and parameters are passed as JAX arrays. The simulation is run for the specified time range and time steps, and the results are saved in Parquet format in the replicate directory.
Parameters:
-
topo_file
(str
) –Path to the topology file.
-
replicate_dir
(str
) –Directory where the replicate results will be saved.
-
initial_conditions
(DataFrame
) –DataFrame containing the initial conditions.
-
parameters
(DataFrame
) –DataFrame containing the parameters.
-
t0
(float
, default:0.0
) –Initial time for the simulation. Default is 0.0.
-
tmax
(float
, default:200.0
) –Maximum time for the simulation. Default is 200.0.
-
dt0
(float
, default:0.01
) –Initial time step size. Default is 0.01.
-
tsteps
(list
, default:None
) –List of time steps at which to save the results. Default is None.
-
rel_tol
(float
, default:1e-05
) –Relative tolerance for the ODE solver. Default is 1e-5.
-
abs_tol
(float
, default:1e-06
) –Absolute tolerance for the ODE solver. Default is 1e-6.
-
max_steps
(int
, default:2048
) –Maximum number of steps for the ODE solver. Default is 2048.
-
batch_size
(int
, default:10000
) –Batch size for processing combinations of initial conditions and parameters. Default is 10000.
-
ode_term_dir
(str
, default:None
) –Directory where the ODE system file is located. Default is None. If None, the parent directory of the replicate directory is assumed to contain the ODE system file. The ODE system file should be named as the topo file with the .py extension.
Returns:
-
DataFrame
–DataFrame containing the solutions of the ODE system.
Source code in grins/racipe_run.py
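The batch_size argument bounds how many combinations of initial conditions and parameter sets are handled at once. The sketch below shows the general idea using only the standard library. The helper iter_batches and the pairing through itertools.product are assumptions made for illustration, not the library's implementation.

```python
from itertools import islice, product


def iter_batches(num_init_conds, num_params, batch_size):
    """Yield lists of (init_cond_index, param_index) pairs, at most batch_size long."""
    combos = product(range(num_init_conds), range(num_params))
    while True:
        batch = list(islice(combos, batch_size))
        if not batch:
            return
        yield batch


# 2**7 initial conditions x 2**10 parameter sets, processed 10000 at a time
batch_sizes = [len(b) for b in iter_batches(2**7, 2**10, 10000)]
```

With these defaults there are 131072 combinations, so the sketch yields 13 full batches and a final batch of 1072. A smaller batch_size lowers peak memory use at the cost of more solver calls.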
grins.racipe_run.gk_normalise_solutions(sol_df, param_df, threshold=1.01, discretize=False)
Normalises the solutions in the solution dataframe using the maximum production rate (G) and degradation rate (k) parameters of the individual nodes in the parameter sets.
Parameters:
-
sol_df
(DataFrame
) –DataFrame containing the solutions with a ParamNum column to join with param_df.
-
param_df
(DataFrame
) –DataFrame containing the parameters with Prod_ and Deg_ columns for each node.
-
threshold
(float
, default:1.01
) –A threshold value below which the g/k normalised values, if found, will be clipped to 1. Raises an error during discretization if any g/k normalised values exceed this value. Default is 1.01.
-
discretize
(bool
, default:False
) –Whether to discretise the solutions. Default is False.
Returns:
-
DataFrame
–DataFrame containing the normalised and discretised solutions.
-
DataFrame
–DataFrame containing the counts of each state in the normalised solutions.
Example
First, run the simulation, then normalise and discretise the solutions
>>> sol_df, state_counts = gk_normalise_solutions(sol_df, params, discretize=True)
Here, sol_df is the solution dataframe and params is the parameter dataframe.
The state_counts dataframe contains the counts of each state in the normalised solutions. The returned sol_df is the normalised and discretised solution dataframe.
Source code in grins/racipe_run.py
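The g/k normalisation itself can be sketched for a single node value. The helper gk_normalise below is hypothetical. It assumes that the normalising factor is the production rate divided by the degradation rate, and that the threshold clips small overshoots above 1 back to 1; both points are readings of the description above rather than confirmed library behaviour.

```python
def gk_normalise(value, prod, deg, threshold=1.01):
    """Hypothetical scalar version of g/k normalisation for one node."""
    # The maximum attainable level is assumed to be production / degradation
    norm = value / (prod / deg)
    # Assumed clipping rule: small overshoots up to the threshold count as 1
    if 1.0 < norm <= threshold:
        norm = 1.0
    return norm


# Node with production rate 50 and degradation rate 0.5, so g/k = 100
levels = [gk_normalise(v, prod=50.0, deg=0.5) for v in (25.0, 100.0, 100.5)]
```

In this sketch the three values map to 0.25, 1.0, and 1.0; the last one overshoots g/k by 0.5 percent and is clipped because it stays within the 1.01 threshold.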
grins.racipe_run.discretise_solutions(norm_df, threshold=1.01)
Discretises the solutions in a g/k normalized DataFrame based on histogram peaks and minima.
Parameters:
-
norm_df
(DataFrame
) –DataFrame containing normalized values to be discretized. It should include only the g/k normalized columns, as the presence of other columns may lead to spurious results.
-
threshold
(float
, default:1.01
) –A threshold value below which the g/k normalised values, if found, will be clipped to 1. Raises an error during discretization if any g/k normalised values exceed this value. If the parameter sets are such that the maximum possible expression of a node is not production/degradation, the threshold value needs to be adjusted accordingly. Default is 1.01.
Returns:
-
Series
–A Series with the name "State" containing the discrete state labels for each row in the input DataFrame. The order of the labels in the state string follows the order of the input columns.
Raises:
-
UserWarning
–If any value in the DataFrame exceeds the specified threshold value.
Example
Given a normalized DataFrame norm_df, discretise the values
>>> lvl_df = discretise_solutions(norm_df)
The normalized solution DataFrame contains values of the nodes between 0 (lowest) and 1 (highest).
The returned lvl_df will have discrete state labels for each row in the input DataFrame.
Raises a warning if any node values exceed the threshold value. This can occur when a node starts with a value higher than its g/k ratio and the simulation is stopped before reaching steady state, even though the value is approaching the correct limit.
In time-series simulations with discretization, similar warnings may occur if initial conditions or intermediate values temporarily exceed the g/k threshold. Additionally, it is important to ensure that the time points and solver tolerance settings are appropriately configured, as improper settings can lead to NaN values in the time series.
For steady-state simulations, tightening the solver's relative and absolute tolerances (using smaller values) can improve convergence and reduce such warnings by allowing the simulation to more accurately reach the true steady state.
Source code in grins/racipe_run.py
grins.racipe_run.run_all_replicates(topo_file, save_dir='.', t0=0.0, tmax=200.0, dt0=0.01, tsteps=None, rel_tol=1e-05, abs_tol=1e-06, max_steps=2048, batch_size=10000, normalize=True, discretize=True, gk_threshold=1.01)
Run simulations for all replicates of the specified topo file. The initial conditions and parameters are loaded from the replicate folders. The directory structure is assumed to be the same as that generated by the gen_topo_param_files function: a main directory named after the topo file, which contains the parameter range file, the ODE system file, and the replicate folders holding the initial conditions and parameters dataframes.
Parameters:
-
topo_file
(str
) –Path to the topology file.
-
save_dir
(str
, default:'.'
) –Directory where the replicate folders are saved. Defaults to ".", i.e. the current working directory.
-
t0
(float
, default:0.0
) –Initial time for the simulation. Defaults to 0.0.
-
tmax
(float
, default:200.0
) –Maximum time for the simulation. Defaults to 200.0.
-
dt0
(float
, default:0.01
) –Initial time step for the simulation. Defaults to 0.01.
-
tsteps
(int
, default:None
) –Number of time steps for the simulation. Defaults to None.
-
rel_tol
(float
, default:1e-05
) –Relative tolerance for the simulation. Defaults to 1e-5.
-
abs_tol
(float
, default:1e-06
) –Absolute tolerance for the simulation. Defaults to 1e-6.
-
max_steps
(int
, default:2048
) –Maximum number of steps for the simulation. Defaults to 2048.
-
batch_size
(int
, default:10000
) –Batch size for the simulation. Defaults to 10000.
-
normalize
(bool
, default:True
) –Whether to g/k normalise the solutions. Defaults to True.
-
discretize
(bool
, default:True
) –Whether to discretize the solutions. Defaults to True.
-
gk_threshold
(float
, default:1.01
) –A hard threshold value below which the g/k normalised values, if found, will be clipped to 1. Raises an error during discretization if any g/k normalised values exceed this value. Default is 1.01.
Returns:
-
None
–The results of the simulation are saved in the replicate folders in the specified directory.
Note
The results of the simulation are saved in the replicate folders in the specified directory. If the simulation is a time series, the results are saved as timeseries_solutions.parquet, and if the simulation is steady state, the results are saved as steadystate_solutions.parquet.
Normalization and discretization of the solutions are optional features, but note that discretization requires normalization to be enabled.
Behavior based on the discretize and normalize flags:
-
If discretize=True, the normalized solutions are discretized, and state counts are saved to {topo_name}_steadystate_state_counts_{replicate_base}.csv. This applies only to steady-state simulations.
-
If discretize=False, the solutions are normalized but not discretized.
Effect on the final solution DataFrame:
-
If only discretize=True, a State column is added. The order of the levels in the state string will be the same as the order of the node columns.
-
If only normalize=True, additional columns are added for each node containing the g/k-normalized values. The column names corresponding to the g/k normalised values will have the format "gk_{node_name}".
-
If both flags are set to True, both the State column and the normalized value columns are included.
Example
Run the simulation for the specified topo file
>>> run_all_replicates(topo_file, save_dir, t0, tmax, dt0, tsteps, rel_tol, abs_tol, max_steps, batch_size)
Source code in grins/racipe_run.py