Quantifying the difference between two probability density functions, $p$ and $q$, from available data is a fundamental problem in statistics and machine learning. A common approach is likelihood-ratio estimation (LRE) between $p$ and $q$, which, to the best of our knowledge, has been investigated mainly in the offline setting. This paper introduces a new framework for online non-parametric LRE (OLRE) in the setting where pairs of iid observations $(x_t \sim p, x'_t \sim q)$ arrive over time. Being non-parametric, the approach is agnostic to the forms of $p$ and $q$. Moreover, we build on recent advances in kernel methods and functional minimization to develop an estimator that can be updated efficiently online. We provide theoretical guarantees for the performance of the OLRE method, along with empirical validation in synthetic experiments.
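To make the online setting concrete, the following is a minimal sketch of one way a kernel-based ratio estimate could be updated as pairs $(x_t, x'_t)$ arrive. It is not the OLRE algorithm of this paper, whose details the abstract does not give. It assumes a least-squares objective $J(r) = \tfrac{1}{2}\mathbb{E}_q[r(x')^2] - \mathbb{E}_p[r(x)] + \tfrac{\lambda}{2}\|r\|^2$, whose unregularized minimizer is $r = p/q$, and applies plain functional stochastic gradient descent in a Gaussian-kernel RKHS. The class name `OnlineKernelLRE`, the parameters `eta`, `lam`, and `bandwidth`, and the restriction to scalar observations are all assumptions made for illustration.

```python
import math
import random


class OnlineKernelLRE:
    """Hypothetical online estimate of r(x) = p(x) / q(x) for scalar inputs.

    Assumed objective (not taken from the paper):
        J(r) = 0.5 * E_q[r(x')^2] - E_p[r(x)] + 0.5 * lam * ||r||^2
    """

    def __init__(self, eta=0.05, lam=0.01, bandwidth=1.0):
        self.eta = eta              # step size, assumed constant
        self.lam = lam              # ridge penalty on the RKHS norm
        self.bandwidth = bandwidth  # Gaussian kernel width
        self.centers = []           # observed points used as kernel centers
        self.coefs = []             # one coefficient per center

    def kernel(self, a, b):
        # Gaussian kernel k(a, b) = exp(-(a - b)^2 / (2 * bandwidth^2))
        return math.exp(-((a - b) ** 2) / (2.0 * self.bandwidth ** 2))

    def __call__(self, x):
        # Current estimate r_t(x) as a kernel expansion over past observations
        return sum(c * self.kernel(z, x) for z, c in zip(self.centers, self.coefs))

    def update(self, x_p, x_q):
        # Stochastic functional gradient at the pair (x_p ~ p, x_q ~ q):
        #   r(x_q) * k(x_q, .) - k(x_p, .) + lam * r
        r_at_xq = self(x_q)  # evaluate before changing the expansion
        shrink = 1.0 - self.eta * self.lam
        self.coefs = [shrink * c for c in self.coefs]
        self.centers.extend([x_p, x_q])
        self.coefs.extend([self.eta, -self.eta * r_at_xq])


if __name__ == "__main__":
    rng = random.Random(0)
    est = OnlineKernelLRE()
    for _ in range(500):
        # Illustrative choice: p = N(0, 1), q = N(0.5, 1)
        est.update(rng.gauss(0.0, 1.0), rng.gauss(0.5, 1.0))
    print(est(0.0))
```

Each update costs one evaluation of the current expansion and adds two kernel centers, so memory and per-step time grow linearly with $t$ in this sketch; a practical online method would need some form of budget or compression, which is not attempted here.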