omicverse.utils.load_metabolights
- omicverse.utils.load_metabolights(study_id, *, group_col=None, cache_dir='metabolights_cache', maf_name=None, sample_name_col='Sample Name', refresh=False)
Load a Metabolights study into a samples × metabolites AnnData.
- Parameters:
study_id (str) – Metabolights accession, e.g. "MTBLS1". The study must be under the public mirror at ftp.ebi.ac.uk/pub/databases/metabolights/studies/public.
group_col (Optional[str] (default: None)) – Column in the sample sheet to use as the primary phenotype label. When given, the column is renamed to "group" in adata.obs to match the convention the rest of ov.metabol expects (differential(group_col="group"), roc_feature(group_col="group"), …). Common choices: "Factor Value[<name>]".
cache_dir (str | Path (default: 'metabolights_cache')) – Directory in which to cache downloaded files. Default ./metabolights_cache/. Re-runs reuse cached files unless refresh=True.
maf_name (Optional[str] (default: None)) – Explicit MAF filename. Default: the first m_*_maf.tsv, in alphabetical order, in the directory listing. Override when a study ships multiple MAFs (e.g. positive vs. negative mode).
sample_name_col (str (default: 'Sample Name')) – Column in the sample sheet carrying the assay-side sample identifiers. The default, "Sample Name", works for every study that follows the ISA-Tab standard.
refresh (bool (default: False)) – Force re-download even if the cached file exists. Use when Metabolights updates a study version in place.
- Returns:
obs carries every column of the sample sheet plus a derived group column (if group_col was supplied). var carries metabolite_identification (filled with unknown_shift_<ppm> for NMR rows that lack a named identification) plus chemical_formula and smiles when available. uns['metabolights'] = {'study_id', 'maf_name', 'sample_sheet'} records provenance.
- Return type:
AnnData
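
Example

A minimal usage sketch, not part of the omicverse API: it loads a study once to inspect the sample-sheet columns in obs, then loads it again with group_col set. The helpers factor_columns and load_with_group are hypothetical names introduced here for illustration. Taking the first "Factor Value[...]" column as the group label is an arbitrary choice. The sketch assumes omicverse is installed and the public mirror is reachable.

```python
def factor_columns(columns):
    """Return the sample-sheet columns that look like ISA-Tab factor values."""
    return [c for c in columns if str(c).startswith("Factor Value[")]


def load_with_group(study_id, cache_dir="metabolights_cache"):
    """Load a study, then reload it with a factor column as the group label.

    Hypothetical helper. The second call is expected to reuse the cached
    files, since refresh is left at its default of False.
    """
    import omicverse as ov  # imported lazily so factor_columns works alone

    # First pass: no group_col, so obs simply mirrors the sample sheet.
    adata = ov.utils.load_metabolights(study_id, cache_dir=cache_dir)
    candidates = factor_columns(adata.obs.columns)
    if not candidates:
        return adata  # no factor column to use as a phenotype label

    # Second pass: use the first factor column (arbitrary, for illustration).
    return ov.utils.load_metabolights(
        study_id, group_col=candidates[0], cache_dir=cache_dir
    )


# Example call (needs omicverse installed and network access):
# adata = load_with_group("MTBLS1")
# print(adata.uns["metabolights"])
```

For a study that ships several MAFs, pass maf_name explicitly rather than relying on the alphabetical default.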