omicverse.metabol.msea_gsea
- omicverse.metabol.msea_gsea(deg, *, stat_col='stat', pathways=None, n_perm=1000, min_size=3, max_size=500, seed=0, mass_db=None)
GSEA-style ranked enrichment via gseapy.prerank.
- Parameters:
  - deg (DataFrame) – Output DataFrame from differential(). Rows indexed by metabolite name; column stat_col provides the ranking metric.
  - stat_col (str (default: 'stat')) – Which column of deg to rank on. Default "stat" (signed t-statistic); "log2fc" is another common choice.
  - pathways (Optional[dict[str, list[str]]] (default: None)) – Dict mapping pathway name to list of KEGG compound IDs.
  - n_perm (int (default: 1000)) – Permutation count for the empirical null. 1000 is fine for tutorials; bump to ≥10000 for publication.
  - mass_db (Optional[DataFrame] (default: None)) – Optional pre-fetched ChEBI DataFrame from fetch_chebi_compounds(), with the same role as in msea_ora(). Recommended for cold-cache runs to avoid per-name PubChem REST round-trips.
- Returns:
  Columns: Term, NES, NOM p-val, FDR q-val, ES, Lead_genes (metabolites driving the enrichment).
- Return type:
  pd.DataFrame
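
A minimal usage sketch, assuming the signature and column names documented above. The `deg` table and the two toy pathways are made-up stand-ins rather than real results, and `run_msea_gsea` and `significant_terms` are hypothetical helpers introduced only for illustration. The 0.25 FDR cutoff is a common GSEA convention, not something this function prescribes. Seven metabolites are far too few for a meaningful enrichment; the point is only to show the expected input shapes and how the returned columns might be used.

```python
import pandas as pd

# Synthetic stand-in for the output of differential(): rows indexed by
# metabolite name, with a signed statistic in the "stat" column.
deg = pd.DataFrame(
    {
        "stat": [4.2, 3.1, 1.8, 0.4, -2.2, -2.7, -3.9],
        "log2fc": [1.5, 1.1, 0.6, 0.1, -0.7, -0.9, -1.4],
    },
    index=["citrate", "succinate", "malate", "alanine",
           "pyruvate", "lactate", "glucose"],
)

# Toy pathway dict (assumption): pathway name -> list of KEGG compound IDs.
pathways = {
    "TCA cycle (toy)": ["C00158", "C00042", "C00149"],   # citrate, succinate, malate
    "Glycolysis (toy)": ["C00031", "C00022", "C00186"],  # glucose, pyruvate, lactate
}


def run_msea_gsea(deg, pathways, n_perm=1000, seed=0):
    """Hypothetical wrapper: calls msea_gsea with the documented keywords."""
    # Imported lazily so the rest of this sketch runs without omicverse.
    from omicverse.metabol import msea_gsea

    return msea_gsea(
        deg, stat_col="stat", pathways=pathways, n_perm=n_perm, seed=seed
    )


def significant_terms(result, fdr_cutoff=0.25):
    """Hypothetical helper: keep terms at or below an FDR cutoff, sorted by NES."""
    out = result.copy()
    # Coerce to numeric in case the columns arrive as object dtype.
    out["NES"] = pd.to_numeric(out["NES"])
    out["FDR q-val"] = pd.to_numeric(out["FDR q-val"])
    keep = out["FDR q-val"] <= fdr_cutoff
    return out[keep].sort_values("NES", ascending=False)
```

With omicverse installed, `significant_terms(run_msea_gsea(deg, pathways))` would chain the two steps. Name-to-KEGG mapping may need network access on a cold cache unless `mass_db` is supplied.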