Elucidating the Design Space of Generative Models for Single-Cell Perturbation Prediction
Abstract
We introduce ExpressionVAE, the first discrete-latent perturbation model for single-cell data: a scalar-quantized variational autoencoder paired with a perturbation-conditioned generative prior.
On Replogle and Parse 1M, multiple variants of this framework achieve state-of-the-art results on every distributional metric and on all cell-eval-derived generation and perturbation metrics we evaluate, with order-of-magnitude gaps in Fréchet distance and MMD^2
over the strongest continuous-latent baselines. We test two prior families (autoregressive and masked discrete diffusion) and find that they achieve effectively identical numbers,
isolating the gain to the discrete latent space. A controlled output-head ablation further reveals a single design axis governing decoder-head choice: the richness of the inference-time sampling distribution,
with standard evaluation metrics partitioning into three groups whose rankings flip along it. Finally, on a held-out CRISPRi reversion benchmark of 1732 perturbations under inflammatory cytokine stress,
the frozen encoder outperforms existing methods such as UMAP and DE and matches the scGPT model (trained on a 10 times larger dataset) on target ranking.
Approach
ExpressionVAE (evae) factorizes perturbation prediction into two stages: (1) an FSQ-VAE encoder that maps gene expression profiles into discrete or continuous latent codes, and (2) a generative prior trained on those codes to predict how a cell's latent representation shifts under perturbation. We perform a controlled cross-product study over two axes:
Prior (generative model)
- Autoregressive (AR)
- Masked diffusion (MDLM)
- Flow matching
Output Head (tokenization)
- Cross-entropy / quantile (ce-quantile)
- Hurdle model
- MSE
- Negative binomial (nb)
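The cross-product over the two axes listed above can be enumerated as a small configuration grid. The following is a minimal sketch, not the project's actual configuration code: only `ce-quantile` and `nb` are identifiers that appear in the text, while the other keys (`ar`, `mdlm`, `flow-matching`, `hurdle`, `mse`) and the `build_grid` helper are hypothetical names chosen for illustration. Whether every combination was actually trained is not stated here.

```python
from itertools import product

# Axis 1: generative prior. Key names are hypothetical labels.
PRIORS = ["ar", "mdlm", "flow-matching"]

# Axis 2: output head. "ce-quantile" and "nb" come from the text;
# "hurdle" and "mse" are hypothetical labels for the other two heads.
HEADS = ["ce-quantile", "hurdle", "mse", "nb"]


def build_grid(priors=PRIORS, heads=HEADS):
    """Return one config dict per (prior, head) combination."""
    return [{"prior": p, "head": h} for p, h in product(priors, heads)]


grid = build_grid()
for cfg in grid:
    print(cfg["prior"], cfg["head"])
```

With three priors and four heads, the grid contains 12 configurations; varying the quantizer or the benchmark would multiply that count further.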
We also vary quantizers (FSQ, Gaussian/continuous) across two large-scale benchmarks. The output head choice is analogous to the "output tokenization" decision in LLM design and proves to be the dominant architectural variable.
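To make the FSQ quantizer concrete, here is a minimal NumPy sketch of finite scalar quantization: each latent dimension is bounded with `tanh`, rounded to a small integer grid, and the resulting per-dimension codes are combined into a single token id that a discrete prior could model. The level counts, the restriction to odd level counts, and the function name `fsq_quantize` are assumptions made for illustration; the configuration actually used in this work is not given in this section. During training, rounding is usually paired with a straight-through gradient estimator, which this forward-only sketch omits.

```python
import numpy as np


def fsq_quantize(z, levels):
    """Forward pass of finite scalar quantization (illustrative sketch).

    z:      array of shape (..., d), unbounded latent vectors
    levels: d odd integers, the number of grid points per dimension
    Returns (codes, token_ids): integer codes in [-half, half] per
    dimension, and one flat token id per latent vector.
    """
    levels = np.asarray(levels, dtype=np.int64)
    if np.any(levels % 2 == 0):
        # Assumption of this sketch: odd level counts keep the grid
        # symmetric around zero, so no half-step offset is needed.
        raise ValueError("this sketch assumes odd level counts")
    half = (levels - 1) // 2

    # Bound each dimension to (-half, half), then snap to integers.
    bounded = np.tanh(z) * half
    codes = np.round(bounded).astype(np.int64)

    # Shift to digits in [0, L-1] and combine as a mixed-radix number.
    digits = codes + half
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    token_ids = (digits * basis).sum(axis=-1)
    return codes, token_ids


z = np.array([[0.0, 10.0, -10.0]])
codes, token_ids = fsq_quantize(z, levels=[5, 5, 5])
print(codes, token_ids)
```

The implicit codebook has one entry per grid point (here 5 * 5 * 5 = 125), so no learned codebook or commitment loss is required, which is the usual motivation for FSQ over vector quantization.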