omicverse.micro.MMvec
- omicverse.micro.MMvec(n_latent: int = 3, lr: float = 0.05, epochs: int = 1000, val_frac: float = 0.1, patience: int = 100, l2: float = 0.001, seed: int = 0, device: str | None = None)
MMvec (Morton et al. 2019) in ~80 lines of PyTorch.
The objective is the exact expected multinomial log-likelihood
\[\ell \;=\; \sum_{i,j} W_{ij} \,\log \mathrm{softmax}(u_i \cdot V^\top + \beta)_j\]
where \(W_{ij} = \sum_s c_{s,i} \cdot m_{s,j} / M_s\) is the co-occurrence weight matrix (total microbe-i count × expected metabolite-j fraction over the cohort). For the tutorial-scale data we use the full softmax; the upstream mmvec package uses negative sampling to scale to thousands of features.
- Parameters:
  - n_latent (default: 3) – Embedding dimensionality K.
  - lr (default: 0.05) – Adam learning rate.
  - epochs (default: 1000) – Maximum training epochs.
  - val_frac (default: 0.1) – Fraction of samples held out for the validation loss curve / early stopping. Set to 0 to skip validation.
  - patience (default: 100) – Early-stopping patience on validation loss (epochs without improvement before training halts).
  - l2 (default: 0.001) – Weight decay on U / V / beta.
  - seed (default: 0) – Torch RNG seed.
  - device (default: None) – 'cpu' / 'cuda' / None (auto-pick based on availability).
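
The objective defined above can be sketched numerically. The following is a minimal illustration in NumPy, not the class's actual PyTorch code. It assumes that c is a samples × microbes count matrix, m is a samples × metabolites abundance matrix, and M_s is the per-sample metabolite total; the function names (cooccurrence_weights, log_softmax, mmvec_objective) and the random toy data are hypothetical and introduced only for this example.

```python
import numpy as np

def cooccurrence_weights(microbes, metabolites):
    """W[i, j] = sum_s c[s, i] * m[s, j] / M_s (M_s assumed to be the metabolite total of sample s)."""
    totals = metabolites.sum(axis=1, keepdims=True)  # M_s, shape (S, 1)
    fractions = metabolites / totals                 # per-sample metabolite fractions, shape (S, J)
    return microbes.T @ fractions                    # shape (I, J)

def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def mmvec_objective(W, U, V, beta):
    """Weighted log-likelihood: sum_ij W_ij * log softmax(u_i . V^T + beta)_j."""
    logits = U @ V.T + beta                          # full softmax over all J metabolites
    return float((W * log_softmax(logits)).sum())

# Hypothetical toy data: S samples, I microbes, J metabolites, K latent dimensions.
rng = np.random.default_rng(0)
S, I, J, K = 6, 4, 5, 3
microbes = rng.integers(1, 20, size=(S, I)).astype(float)
metabolites = rng.integers(1, 50, size=(S, J)).astype(float)

W = cooccurrence_weights(microbes, metabolites)
U = rng.normal(size=(I, K))   # microbe embeddings
V = rng.normal(size=(J, K))   # metabolite embeddings
beta = np.zeros(J)            # metabolite bias

loss = mmvec_objective(W, U, V, beta)
print(W.shape, loss)
```

Because each sample's metabolite fractions sum to 1, row i of W sums to the total count of microbe i across the cohort, which gives a quick sanity check on the weight matrix. Training would maximize this quantity (or minimize its negative) with respect to U, V and beta.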
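
The interaction between epochs and patience can be illustrated with a small pure-Python sketch. It assumes one common convention: training halts once the validation loss has gone patience consecutive epochs without strictly improving on its best value. The function name early_stopping_epoch and the loss values are hypothetical, and the class itself may count patience differently.

```python
def early_stopping_epoch(val_losses, patience=100):
    """Return (stop_epoch, best_epoch) for a sequence of per-epoch validation losses.

    Assumed rule: stop at the first epoch that is `patience` epochs past the
    best (lowest) validation loss seen so far.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Strict improvement resets the patience counter.
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Patience never exhausted: training ran for the full number of epochs.
    return len(val_losses) - 1, best_epoch

# Hypothetical validation curve: the best loss occurs at epoch 2.
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
stop, best = early_stopping_epoch(losses, patience=3)
print(stop, best)  # 5 2
```

With patience=3 the loop stops at epoch 5, three epochs after the best loss at epoch 2. With a patience larger than the remaining epochs, the full curve is consumed and the epochs limit becomes the effective stopping criterion.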