This paper focuses on the problem of unbounded density ratio estimation, an understudied yet critical challenge in statistical learning, and its application to covariate shift adaptation. Much of the existing literature assumes that the density ratio is either uniformly bounded or unbounded but known exactly. These conditions are often violated in practice, creating a gap between theoretical guarantees and real-world applicability. In contrast, this work directly addresses unbounded density ratios and integrates them into importance weighting for effective covariate shift adaptation. We propose a three-step estimation method that leverages unlabeled data from both the source and target distributions: (1) estimating a relative density ratio; (2) applying a truncation operation to control its unboundedness; and (3) transforming the truncated estimate back into the standard density ratio. The estimated density ratio is then employed as importance weights for regression under covariate shift. We establish rigorous, non-asymptotic convergence guarantees for both the proposed density ratio estimator and the resulting regression function estimator, demonstrating optimal or near-optimal convergence rates. Our findings offer new theoretical insights into density ratio estimation and learning under covariate shift, extending classical learning theory to more practical and challenging scenarios.
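To make the three-step construction concrete, the following is a minimal sketch under stated assumptions, not the paper's estimator. It assumes the commonly used mixture form of the relative density ratio, r_alpha = p_T / (alpha * p_T + (1 - alpha) * p_S), which is bounded above by 1/alpha and can be inverted to the standard ratio w = p_T / p_S via w = (1 - alpha) * r_alpha / (1 - alpha * r_alpha). The abstract does not state the exact definition, the truncation rule, or the tuning of its parameters, so `alpha`, `tau`, and all function names here are hypothetical. Step (1) is also replaced by evaluating known Gaussian densities, purely so the example is self-contained; in practice it would be an estimate built from unlabeled source and target samples.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def relative_ratio(p_target, p_source, alpha):
    """Assumed relative density ratio p_T / (alpha * p_T + (1 - alpha) * p_S).

    It is bounded above by 1 / alpha even when p_T / p_S is unbounded.
    """
    return p_target / (alpha * p_target + (1.0 - alpha) * p_source)


def truncate(r, tau):
    """Step (2): clip the relative ratio to [0, tau], with tau < 1 / alpha."""
    return np.clip(r, 0.0, tau)


def to_standard_ratio(r, alpha):
    """Step (3): invert r = w / (alpha * w + 1 - alpha) to recover w.

    Requires alpha * r < 1, which the truncation in step (2) guarantees.
    """
    return (1.0 - alpha) * r / (1.0 - alpha * r)


alpha = 0.1        # hypothetical mixing parameter
tau = 0.9 / alpha  # hypothetical truncation level, strictly below 1 / alpha

x = np.linspace(-6.0, 6.0, 241)
p_source = gaussian_pdf(x, 0.0, 1.0)
p_target = gaussian_pdf(x, 1.0, 2.0)  # heavier tails, so p_T / p_S is unbounded

w_true = p_target / p_source

# Step (1) stand-in: the exact relative ratio instead of an estimate from data.
r = relative_ratio(p_target, p_source, alpha)
r_trunc = truncate(r, tau)
w_hat = to_standard_ratio(r_trunc, alpha)

# Largest importance weight that can survive the truncation.
w_cap = to_standard_ratio(tau, alpha)
```

In this sketch the truncation level controls the largest weight that can appear: wherever the relative ratio stays below `tau`, `w_hat` coincides with the true ratio, and elsewhere it is capped at `w_cap`. The resulting `w_hat` values would then serve as per-sample weights in a weighted regression loss on the labeled source data.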