gretapy.tl.benchmark
- gretapy.tl.benchmark(grns, organism=None, datasets=None, terms=None, metrics=None, min_edges=5, verbose=True)
Run the benchmark for one or multiple GRNs across one or multiple datasets.
- Parameters:
  - grns (dict) – Dictionary mapping GRN names to per-organism, per-dataset GRN DataFrames. Structure: {grn_name: {organism: {dataset_name: DataFrame}}}.
  - organism (str | None, default: None) – Ignored when organism keys are present in grns. Kept for clarity, but organisms are inferred from the second level of grns.
  - datasets (list | dict | None, default: None) – Dataset(s) to evaluate against. Can be:
    - None: Use all datasets present in the grns dict for each organism.
    - list: A whitelist of dataset names (applied across all organisms).
    - dict: A flat dictionary mapping dataset names to pre-loaded MuData/AnnData objects.
  - terms (dict | None, default: None) – Optional dictionary specifying filtering terms per organism, dataset, and metric. Structure: {organism: {dataset_name: {db_name: [terms]}}}. If None, terms are auto-loaded from config for each dataset.
  - metrics (str | list | None, default: None) – Metric(s) to evaluate. Can be a category name, metric type, or database name. If None, all available metrics are evaluated.
  - min_edges (int, default: 5) – Minimum number of edges required in a GRN to run evaluation.
  - verbose (bool, default: True) – Whether to log progress messages and show progress bars.
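The nested grns structure and the list form of datasets can be inspected before a run. The following is a minimal sketch, not part of gretapy: `plan_runs` is a hypothetical helper, it assumes one row per edge so that `len()` gives the edge count, it assumes the min_edges threshold is inclusive, and it handles only the None and list forms of datasets. Lists of pairs stand in for GRN DataFrames so the snippet runs without extra dependencies.

```python
def plan_runs(grns, datasets=None, min_edges=5):
    """List (grn, organism, dataset, n_edges, large_enough) per GRN table.

    Hypothetical helper, not a gretapy function. Each leaf only needs to
    support len(), so a pandas DataFrame with one row per edge also works.
    """
    runs = []
    for grn_name, per_organism in grns.items():
        for organism, per_dataset in per_organism.items():
            for dataset_name, table in per_dataset.items():
                # A list of names acts as a whitelist across all organisms.
                if isinstance(datasets, list) and dataset_name not in datasets:
                    continue
                n_edges = len(table)
                # Assumption: the threshold is inclusive (n_edges >= min_edges).
                runs.append(
                    (grn_name, organism, dataset_name, n_edges, n_edges >= min_edges)
                )
    return runs


# Lists of (source, target) pairs stand in for GRN DataFrames here.
toy_grns = {
    "method_a": {
        "hg38": {
            "PBMC": [("TF1", f"gene{i}") for i in range(8)],
            "Lung": [("TF2", "gene0"), ("TF2", "gene1")],
        },
        "mm10": {
            "Palate": [("Tf3", f"gene{i}") for i in range(5)],
        },
    },
}

for run in plan_runs(toy_grns, datasets=["PBMC", "Lung"]):
    print(run)
```

With the whitelist above, the mm10 "Palate" entry is left out, and the two-edge "Lung" table is flagged as smaller than min_edges. How gretapy itself treats such a GRN (skipping it, warning, or raising) is not stated here.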
- Return type:
DataFrame
- Returns:
DataFrame with columns: grn, organism, dataset, class, task, db, precision, recall, f01.
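The relationship between the precision, recall, and f01 columns can be illustrated with the standard F-beta formula. This sketch assumes that f01 denotes an F-beta score with beta = 0.1; that reading is inferred from the column name and is not confirmed by this page. `f_beta` is a hypothetical helper, not a gretapy function.

```python
def f_beta(precision, recall, beta=0.1):
    """Standard F-beta score; beta < 1 weights precision more than recall.

    Assumption: beta=0.1 mirrors the "f01" column name. This is a guess
    about the naming, not documented gretapy behavior.
    """
    b2 = beta ** 2
    denom = b2 * precision + recall
    if denom == 0:
        # Both precision and recall are zero: define the score as 0.
        return 0.0
    return (1 + b2) * precision * recall / denom


# High precision with low recall scores far better than the reverse.
print(f_beta(1.0, 0.1))
print(f_beta(0.1, 1.0))
```

Under this assumption, a GRN with few but mostly correct edges would receive a higher score than one that recovers many true edges alongside many false ones.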
Example
import gretapy as gt
import pandas as pd

# Multi-organism GRNs
grns = {
    "method_a": {
        "hg38": {
            "PBMC": pd.read_csv("grn_a_pbmc.csv"),
            "Lung": pd.read_csv("grn_a_lung.csv"),
        },
        "mm10": {
            "Palate": pd.read_csv("grn_a_palate.csv"),
        },
    },
    "method_b": {
        "hg38": {
            "PBMC": pd.read_csv("grn_b_pbmc.csv"),
        },
    },
}

results = gt.tl.benchmark(grns=grns)

# With pre-loaded datasets
results = gt.tl.benchmark(
    grns=grns,
    datasets={"PBMC": mudata_obj, "Lung": mudata_obj2},
)