Estimating the ratio of two probability densities from finitely many samples is a central task in machine learning and statistics. In this work, we show that a large class of kernel methods for density ratio estimation suffers from error saturation, which prevents algorithms from achieving fast error convergence rates on highly regular learning problems. To resolve saturation, we introduce iterated regularization in density ratio estimation to achieve fast error rates. Our methods outperform their non-iteratively regularized counterparts on benchmarks for density ratio estimation, as well as on large-scale evaluations for importance-weighted ensembling of deep unsupervised domain adaptation models.
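To illustrate the idea, the following is a minimal sketch (not the paper's implementation) of iterated Tikhonov regularization applied to a uLSIF-style kernel least-squares objective for density ratio estimation. All names (`iterated_ulsif`, the Gaussian bandwidth `sigma`, the iteration count `n_iter`) are illustrative assumptions; with `n_iter=1` the update reduces to the ordinary, non-iterated regularized solution.

```python
import numpy as np

def gauss_kernel(X, Y, sigma=1.0):
    """Gaussian kernel matrix k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def iterated_ulsif(X_nu, X_de, lam=0.1, sigma=1.0, n_iter=3):
    """Hypothetical sketch: iterated Tikhonov for a uLSIF-style objective.
    Models the ratio r(x) = p_nu(x) / p_de(x) as sum_j alpha_j k(x, x_de_j).
    """
    n_de = len(X_de)
    K_de = gauss_kernel(X_de, X_de, sigma)        # (n_de, n_de)
    K_nu = gauss_kernel(X_nu, X_de, sigma)        # (n_nu, n_de)
    H = K_de.T @ K_de / n_de                      # quadratic term of the loss
    h = K_nu.mean(axis=0)                         # linear term of the loss
    A = H + lam * np.eye(n_de)
    alpha = np.zeros(n_de)
    for _ in range(n_iter):
        # Iterated Tikhonov step: feed the previous solution back into
        # the regularizer; n_iter = 1 recovers plain Tikhonov (uLSIF).
        alpha = np.linalg.solve(A, h + lam * alpha)
    return lambda X: gauss_kernel(X, X_de, sigma) @ alpha
```

Each iteration reuses the same regularized system matrix `A`, so the extra cost over the one-shot solution is a single back-substitution per step, while the iteration reduces the regularization bias that causes saturation.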