Neural networks have recently produced state-of-the-art results in density-ratio estimation (DRE), a fundamental technique in machine learning. However, existing methods suffer from optimization issues that arise from their loss functions: the large sample requirement of the Kullback--Leibler (KL) divergence, vanishing training-loss gradients, and biased loss gradients. To address these issues, this paper proposes an $\alpha$-divergence loss function ($\alpha$-Div) that offers a concise implementation and stable optimization, and presents technical justifications for it. The stability of the proposed loss function is demonstrated empirically, and its estimation accuracy on DRE tasks is investigated. Additionally, this study derives a sample requirement for DRE with the proposed loss function in terms of an upper bound on the $L_1$ error, which connects to the curse of dimensionality, a common problem in high-dimensional DRE tasks.
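As a rough illustration of the kind of objective described above, the following sketch fits a density ratio by minimizing an $\alpha$-divergence-style loss on a toy problem. The specific loss form, $\frac{1}{\alpha} E_q[e^{\alpha T}] + \frac{1}{1-\alpha} E_p[e^{(\alpha-1) T}]$ with $0 < \alpha < 1$, is an assumption chosen because its pointwise minimizer is $T = \log(p/q)$; it is not taken from the text and may differ from the paper's exact formulation. The linear model `T(x) = a*x + b`, the Gaussian toy data, and all function names are likewise hypothetical stand-ins for a neural network and real data.

```python
import numpy as np


def alpha_div_loss(t_p, t_q, alpha):
    """Assumed alpha-divergence-style loss on log-ratio outputs T(x).

    t_p: T evaluated on samples from p (numerator density).
    t_q: T evaluated on samples from q (denominator density).
    """
    term_q = np.mean(np.exp(alpha * t_q)) / alpha
    term_p = np.mean(np.exp((alpha - 1.0) * t_p)) / (1.0 - alpha)
    return term_q + term_p


def fit_log_ratio(x_p, x_q, alpha=0.5, lr=0.5, steps=500):
    """Fit T(x) = a*x + b by plain gradient descent on the loss above."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        w_q = np.exp(alpha * (a * x_q + b))          # dLoss/dT on q samples
        w_p = np.exp((alpha - 1.0) * (a * x_p + b))  # -dLoss/dT on p samples
        grad_a = np.mean(w_q * x_q) - np.mean(w_p * x_p)
        grad_b = np.mean(w_q) - np.mean(w_p)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Toy data: p = N(1, 1), q = N(0, 1), so the true log ratio is x - 0.5.
rng = np.random.default_rng(0)
x_p = rng.normal(1.0, 1.0, size=20000)
x_q = rng.normal(0.0, 1.0, size=20000)

a_hat, b_hat = fit_log_ratio(x_p, x_q)
ratio_at_zero = np.exp(a_hat * 0.0 + b_hat)  # estimate of p(0)/q(0)
print(a_hat, b_hat, ratio_at_zero)
```

For this toy setup the fitted `a_hat` and `b_hat` should land near the true values 1 and -0.5, up to sampling noise. Both expectations in this loss are plain sample means, so mini-batch gradients of it are unbiased, which is one reason a loss of this shape is convenient for stochastic optimization.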