gretapy.tl.cre_to_tss_distance

gretapy.tl.cre_to_tss_distance(grns, organism='hg38')

Compute the distance from each CRE to the TSS of its target gene.

Parameters:
  • grns (DataFrame | dict) – Single GRN DataFrame or dictionary of GRNs with names as keys. Each DataFrame must have at least cre and target columns, with CREs formatted as "chrX-start-end".

  • organism (str (default: 'hg38')) – Organism identifier (e.g., "hg38"). Used to load the Promoters database via gretapy.ds.read_db().

Return type:

DataFrame

Returns:

DataFrame with columns grn, cre, target, and distance, with one row per unique (grn, cre, target) combination whose target gene is present in the Promoters database. The distance is the number of base pairs separating the CRE from the 2000 bp promoter window of its target gene, measured between their closest edges. It is always non-negative, and CREs that overlap the promoter window have distance 0.
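
The distance rule described above can be illustrated with a minimal standalone sketch. It is not gretapy's implementation: the helper names parse_cre and interval_distance are hypothetical, the promoter window coordinates are made up, the coordinate convention (closed vs. half-open intervals) is not specified by this page, and returning None for a chromosome mismatch is an assumption of the sketch only.

```python
def parse_cre(cre):
    """Split a CRE string such as "chr1-1000-1500" into (chrom, start, end)."""
    # rsplit keeps any hyphen inside the chromosome name intact
    chrom, start, end = cre.rsplit("-", 2)
    return chrom, int(start), int(end)


def interval_distance(cre, prom_chrom, prom_start, prom_end):
    """Gap in bp between a CRE and a promoter window; 0 if they overlap.

    Returns None when the intervals lie on different chromosomes
    (an assumption of this sketch, not documented behaviour).
    """
    chrom, start, end = parse_cre(cre)
    if chrom != prom_chrom:
        return None
    # At most one of the two gaps is positive; both are <= 0 on overlap
    return max(0, prom_start - end, start - prom_end)


# Hypothetical 2000 bp promoter window on chr1 (coordinates are illustrative)
window = ("chr1", 10_000, 12_000)

print(interval_distance("chr1-5000-5500", *window))    # 4500 (CRE before the window)
print(interval_distance("chr1-11500-12500", *window))  # 0 (CRE overlaps the window)
print(interval_distance("chr1-15000-15400", *window))  # 3000 (CRE after the window)
```

In practice the promoter windows come from the Promoters database, so the function itself would be called directly, for example as gretapy.tl.cre_to_tss_distance(grns, organism='hg38') with grns holding cre and target columns.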