Estimating density ratios between pairs of intractable data distributions is a core problem in probabilistic modeling, enabling principled comparisons of sample likelihoods under different data-generating processes across conditions and covariates. While exact-likelihood models such as normalizing flows offer a promising approach to density ratio estimation, naive flow-based evaluations are computationally expensive, as they require simulating costly likelihood integrals for each distribution separately. In this work, we leverage condition-aware flow matching to derive a single dynamical formulation for tracking density ratios along generative trajectories. We demonstrate competitive performance on simulated benchmarks for closed-form ratio estimation, and show that our method supports versatile tasks in single-cell genomics data analysis, where likelihood-based comparisons of cellular states across experimental conditions enable treatment effect estimation and batch correction evaluation.
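To make the cost argument concrete, the following is a brief sketch of the standard continuous-flow likelihood computation and of why a naive ratio needs two separate integrals. The notation is assumed for illustration and does not come from the text above: $v_t(x \mid c)$ is a condition-dependent velocity field, $p_t(\cdot \mid c)$ is the density it transports from a shared base $p_0$, and $c, c'$ are two conditions. The final identity follows from the continuity equation and is only one way a single-trajectory formulation could look; it is not necessarily the formulation derived in this work.

```latex
% Instantaneous change of variables along dx_t/dt = v_t(x_t | c):
\begin{align}
  \frac{d}{dt}\log p_t(x_t \mid c) &= -\nabla\!\cdot v_t(x_t \mid c), \\
  \log p_1(x \mid c) &= \log p_0\big(x_0^{(c)}\big)
      - \int_0^1 \nabla\!\cdot v_t\big(x_t^{(c)} \mid c\big)\,dt .
\end{align}

% Naive log-ratio: two ODE solves, one per condition, each ending at the same x.
\begin{equation}
  \log\frac{p_1(x \mid c)}{p_1(x \mid c')}
    = \log p_1(x \mid c) - \log p_1(x \mid c') .
\end{equation}

% Illustrative single-trajectory alternative (assumed, from the continuity
% equation): follow only the path of condition c and track the log-ratio
% r_t = \log p_t(x_t | c) - \log p_t(x_t | c') directly.
\begin{equation}
  \frac{d r_t}{dt}
    = \nabla\!\cdot v_t(x_t \mid c') - \nabla\!\cdot v_t(x_t \mid c)
      - \big(v_t(x_t \mid c) - v_t(x_t \mid c')\big)\cdot
        \nabla_x \log p_t(x_t \mid c') .
\end{equation}
```

In this sketch the last equation replaces two trajectory simulations with one, at the price of needing the score $\nabla_x \log p_t(\cdot \mid c')$ along the path, which would itself have to be modeled or estimated.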