omicverse.metabol.msea_ora
- omicverse.metabol.msea_ora(hits, background, *, pathways=None, min_size=3, mass_db=None)
Over-representation analysis via Fisher’s exact test.
- Parameters:
  - hits (Iterable[str]) – Metabolite names (e.g. from pyMetabo.significant_metabolites()).
  - background (Iterable[str]) – All tested metabolite names (the universe). Usually adata.var_names after filtering.
  - pathways (Optional[dict[str, list[str]]], default: None) – Optional override of {pathway_name: [kegg_id, ...]}. Default is the local KEGG subset shipped with omicverse.
  - min_size (int, default: 3) – Skip pathways with fewer than this many overlapping background compounds.
  - mass_db (Optional[DataFrame], default: None) – Optional pre-fetched ChEBI DataFrame from fetch_chebi_compounds(). When supplied, map_ids uses it as an in-memory lookup for the ~54k ChEBI names and only falls back to PubChem for names not resolved there. On a cold session this turns the map_ids cost from O(n_features) HTTP round-trips into a single dict probe per feature, often a 30–100x speedup on the first call.
- Returns:
Columns: pathway, overlap, set_size, universe_size, odds_ratio, pvalue, padj (Benjamini-Hochberg adjusted).
- Return type:
pd.DataFrame
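
Example (illustrative sketch, not the omicverse implementation). To make the statistics behind the returned columns concrete, the following standard-library code runs an over-representation analysis by hand. It rests on several assumptions that this page does not confirm: the test is one-sided (enrichment only), set_size counts pathway members present in the background, and hits, background, and pathways already share one identifier space (the real function resolves names to KEGG IDs via map_ids). The helper names fisher_greater, bh_adjust, and ora_sketch are hypothetical.

```python
from math import comb, inf


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact p-value P(X >= a) for the table [[a, b], [c, d]]."""
    n_total = a + b + c + d
    n_hits = a + b        # hits inside the universe
    n_members = a + c     # pathway members inside the universe
    denom = comb(n_total, n_hits)
    upper = min(n_hits, n_members)
    # Sum the hypergeometric probabilities of overlaps at least as large as a.
    tail = sum(
        comb(n_members, k) * comb(n_total - n_members, n_hits - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def ora_sketch(hits, background, pathways, min_size=3):
    """Return one dict per tested pathway, keyed like the documented columns."""
    universe = set(background)
    hit_set = set(hits) & universe
    rows = []
    for name, members in pathways.items():
        in_universe = set(members) & universe
        if len(in_universe) < min_size:
            continue  # assumed meaning of min_size: too few background compounds
        a = len(hit_set & in_universe)       # hits in the pathway
        b = len(hit_set) - a                 # hits outside the pathway
        c = len(in_universe) - a             # non-hits in the pathway
        d = len(universe) - a - b - c        # non-hits outside the pathway
        odds = (a * d) / (b * c) if b * c else inf  # inf when b * c is zero
        rows.append({
            "pathway": name,
            "overlap": a,
            "set_size": len(in_universe),
            "universe_size": len(universe),
            "odds_ratio": odds,
            "pvalue": fisher_greater(a, b, c, d),
        })
    for row, q in zip(rows, bh_adjust([r["pvalue"] for r in rows])):
        row["padj"] = q
    return rows


# Toy data with made-up identifiers, for illustration only.
background = [f"C{i}" for i in range(1, 21)]
hits = ["C1", "C2", "C3", "C4", "C5"]
pathways = {
    "A": ["C1", "C2", "C3", "C4"],
    "B": ["C10", "C11", "C12", "C13", "C14"],
    "tiny": ["C1", "C2"],  # dropped by min_size=3
}
results = ora_sketch(hits, background, pathways)
for row in results:
    print(row)
```

In this toy run, pathway A contains four of the five hits and gets a small p-value, pathway B contains none and gets a p-value of 1, and the two-member pathway is skipped by the min_size filter.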