Given i.i.d. observations from two unknown probability density functions (pdfs), $p$ and $q$, likelihood-ratio estimation (LRE) is an elegant approach to comparing the two pdfs using only the available data. In this paper, we introduce what is, to the best of our knowledge, the first graph-based extension of this problem, which reads as follows: suppose each node $v$ of a fixed graph has access to observations from two unknown node-specific pdfs, $p_v$ and $q_v$; the goal is to estimate, for each node, the likelihood-ratio between the two pdfs while also taking into account the information provided by the graph structure. The node-level estimation tasks are assumed to exhibit similarities conveyed by the graph, which suggests that the nodes could collaborate to solve them more efficiently. We develop this idea into a concrete non-parametric method that we call Graph-based Relative Unconstrained Least-squares Importance Fitting (GRULSIF). We derive convergence rates for our collaborative approach that highlight the role played by variables such as the number of available observations per node, the size of the graph, and how accurately the graph structure encodes the similarity between tasks. These theoretical results make explicit the situations in which collaborative estimation effectively improves performance compared to solving each problem independently. Finally, in a series of experiments, we illustrate how GRULSIF infers the likelihood-ratios at the nodes of the graph more accurately than state-of-the-art LRE methods, which operate independently at each node, and we also verify that the behavior of GRULSIF is consistent with our theoretical analysis.
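To make the single-node baseline concrete, the sketch below fits a relative least-squares likelihood-ratio estimate at one node, i.e. the kind of independent per-node estimator that GRULSIF is compared against. It is not the GRULSIF method itself: the abstract does not specify the graph-coupled objective, so no graph term is included. The function names (`gaussian_kernel`, `fit_rulsif`), the Gaussian kernel, and all default values (`alpha`, `sigma`, `lam`, `n_centers`) are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, centers, sigma):
    """Gaussian kernel matrix between rows of x (n, d) and centers (m, d)."""
    sq_dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def fit_rulsif(x_p, x_q, alpha=0.1, sigma=1.0, lam=0.1, n_centers=50, seed=0):
    """Single-node relative least-squares ratio fit (illustrative sketch).

    Estimates r_alpha(x) = p(x) / (alpha * p(x) + (1 - alpha) * q(x))
    with the kernel model r(x) = sum_l theta_l * k(x, c_l).
    All hyperparameter defaults are assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    # Kernel centers: a random subset of the samples drawn from p.
    idx = rng.choice(len(x_p), size=min(n_centers, len(x_p)), replace=False)
    centers = x_p[idx]

    phi_p = gaussian_kernel(x_p, centers, sigma)
    phi_q = gaussian_kernel(x_q, centers, sigma)

    # Quadratic term: second moment of the features under the mixture
    # alpha * p + (1 - alpha) * q, estimated from the two samples.
    H = (alpha * phi_p.T @ phi_p / len(x_p)
         + (1.0 - alpha) * phi_q.T @ phi_q / len(x_q))
    # Linear term: mean of the features under p.
    h = phi_p.mean(axis=0)

    # Closed-form minimizer of the ridge-regularized least-squares objective.
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)

    def ratio(x):
        return gaussian_kernel(x, centers, sigma) @ theta

    return ratio


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_p = rng.normal(0.0, 1.0, size=(500, 1))  # samples from p
    x_q = rng.normal(2.0, 1.0, size=(500, 1))  # samples from q
    r_hat = fit_rulsif(x_p, x_q)
    print(r_hat(np.array([[0.0], [2.0]])))
```

In the collaborative setting described above, one such problem would be posed per node $v$ using samples from $p_v$ and $q_v$, with the graph used to couple the node-level estimates; how that coupling is implemented is left to the paper.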