omicverse.metabol.fetch_chebi_compounds
- omicverse.metabol.fetch_chebi_compounds(*, cache=True, refresh=False)
Build a compound master table from ChEBI’s flat-file TSVs.
Downloads and joins three ChEBI distribution files from the public EBI FTP site (over HTTPS):
- compounds.tsv.gz: ChEBI ID → canonical name
- chemical_data.tsv.gz: monoisotopic mass + formula
- database_accession.tsv.gz: HMDB / KEGG / LipidMaps xrefs
Total download is ~15 MB; the joined Parquet cache persists at ~/.cache/omicverse/metabol/chebi_compounds.parquet. This is the substrate annotate_peaks() uses for mummichog mass matching.
- Returns:
  Columns: chebi_id, name, formula, mw (monoisotopic, float), kegg, hmdb, lipidmaps. Rows without a monoisotopic mass are dropped.
- Return type:
  pd.DataFrame
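
To illustrate how a table with these columns can support mass matching, here is a minimal sketch of a ppm-window lookup against the mw column. It is not the logic annotate_peaks() actually runs, which this page does not describe. The helper match_neutral_mass, the 10 ppm default, the [M+H]+ adduct assumption, and the "CHEBI:<number>" string format of chebi_id are all assumptions made for the example. A three-row toy table stands in for the real download so the snippet runs offline; its hmdb and lipidmaps values are left empty.

```python
import pandas as pd

# Toy stand-in for the documented table. The real call, which downloads
# ~15 MB on first use, would be:
#   from omicverse.metabol import fetch_chebi_compounds
#   compounds = fetch_chebi_compounds()
# The chebi_id string format below is an assumption, not confirmed by the docs.
compounds = pd.DataFrame(
    {
        "chebi_id": ["CHEBI:15377", "CHEBI:16236", "CHEBI:27732"],
        "name": ["water", "ethanol", "caffeine"],
        "formula": ["H2O", "C2H6O", "C8H10N4O2"],
        "mw": [18.010565, 46.041865, 194.080376],  # monoisotopic masses in Da
        "kegg": ["C00001", "C00469", "C07481"],
        "hmdb": [None, None, None],       # left empty in this toy table
        "lipidmaps": [None, None, None],  # left empty in this toy table
    }
)

PROTON_MASS = 1.007276  # Da; used to convert an [M+H]+ m/z to a neutral mass


def match_neutral_mass(table, neutral_mass, ppm=10.0):
    """Hypothetical helper: rows whose mw lies within +/- ppm of neutral_mass."""
    tolerance = neutral_mass * ppm * 1e-6
    within = (table["mw"] - neutral_mass).abs() <= tolerance
    return table[within].sort_values("mw").reset_index(drop=True)


# An observed positive-mode peak, assumed here to be an [M+H]+ ion.
observed_mz = 195.0877
hits = match_neutral_mass(compounds, observed_mz - PROTON_MASS)
print(hits[["chebi_id", "name", "formula", "mw"]])
```

On the toy table, only the caffeine row falls inside the 10 ppm window. Against the full ChEBI table a single mass will often match several isomers, so candidates typically need further ranking downstream.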