omicverse.metabol.map_ids


omicverse.metabol.map_ids(names, *, targets=('hmdb', 'kegg', 'chebi'), mass_db=None)

Resolve metabolite names to external database IDs.

Parameters:
  • names (Iterable[str]) – Iterable of metabolite names (e.g. adata.var_names).

  • targets (tuple[str, ...] (default: ('hmdb', 'kegg', 'chebi'))) – External IDs to resolve: any subset of ("hmdb", "kegg", "chebi", "pubchem", "lipidmaps").

  • mass_db (DataFrame | None (default: None)) – Optional pre-fetched ChEBI DataFrame from fetch_chebi_compounds(). When supplied, each name is first looked up in mass_db["name"] (a local lookup with no network access), and PubChem is queried only for names that remain unresolved. Recommended for workflows that call map_ids repeatedly in a loop: fetch the database once and pass it on every call to avoid per-name HTTP round-trips.
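
The lookup order described for mass_db (local table first, remote service only for leftovers) can be pictured with a small stand-in. This is a minimal sketch and not omicverse's implementation: resolve_names, local_table and fake_remote are hypothetical names, a plain dict stands in for the ChEBI DataFrame, the lower-casing step is an assumed normalization, and the IDs are placeholders rather than real accessions.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def resolve_names(
    names: Iterable[str],
    local_table: Dict[str, str],
    remote_lookup: Callable[[str], str],
) -> Tuple[Dict[str, str], List[str]]:
    """Resolve names from a local table first, then fall back to a remote lookup.

    Returns the resolved mapping (keyed by the original string) and the list
    of names that had to go through the remote fallback.
    """
    resolved: Dict[str, str] = {}
    remote_calls: List[str] = []
    for name in names:
        key = name.strip().lower()  # assumed normalization, for illustration only
        if key in local_table:
            # Local hit: no network round-trip needed.
            resolved[name] = local_table[key]
        else:
            # Local miss: only now pay for the (simulated) remote call.
            remote_calls.append(name)
            resolved[name] = remote_lookup(name)
    return resolved, remote_calls


def fake_remote(name: str) -> str:
    """Hypothetical stand-in for a remote service; resolves nothing."""
    return ""


# Placeholder IDs, not real database accessions.
local_table = {"glucose": "PLACEHOLDER-1", "citrate": "PLACEHOLDER-2"}

resolved, remote_calls = resolve_names(
    ["Glucose", "citrate", "unknownium"], local_table, fake_remote
)
print(resolved)
print(remote_calls)
```

In this toy run only the name missing from the local table reaches the fallback, which is the saving the mass_db argument is meant to provide when map_ids is called many times.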

Returns:

A DataFrame with one row per input name, indexed by the original (un-normalized) string, and one column per requested target. Unresolved targets are returned as empty strings.

Return type:

pd.DataFrame
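
Because unresolved targets are empty strings rather than NaN, downstream checks need to compare against "" instead of relying on isna(). The sketch below works on a hand-built DataFrame that only mimics the documented return shape; the variable names and the placeholder IDs are assumptions for illustration, not output from map_ids.

```python
import pandas as pd

# Hand-built stand-in for a map_ids result: indexed by the original name,
# one column per target, "" where nothing was resolved.
# All IDs are placeholders, not real accessions.
mapped = pd.DataFrame(
    {
        "hmdb": ["HMDB_PLACEHOLDER_1", "", ""],
        "kegg": ["KEGG_PLACEHOLDER_1", "KEGG_PLACEHOLDER_2", ""],
        "chebi": ["CHEBI_PLACEHOLDER_1", "", ""],
    },
    index=["Glucose", "citrate", "unknownium"],
)

# Boolean mask of unresolved cells (empty strings, not NaN).
unresolved = mapped.eq("")

# Names for which no requested target was resolved at all.
fully_unresolved = mapped.index[unresolved.all(axis=1)].tolist()

# Fraction of names resolved per target column.
coverage = (~unresolved).mean()

# Names missing one specific target.
missing_kegg = mapped.index[mapped["kegg"] == ""].tolist()

print(fully_unresolved)
print(coverage)
print(missing_kegg)
```

If NaN-based tooling is preferred, mapped.replace("", pd.NA) converts the empty strings first.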