Recently, neural networks have produced state-of-the-art results for density-ratio estimation (DRE), a fundamental technique in machine learning. However, existing methods suffer from optimization issues that arise from the loss functions of DRE: the large sample requirement of the Kullback--Leibler (KL) divergence, vanishing training-loss gradients, and biased gradients of the loss functions. To address these issues, this paper proposes an $\alpha$-divergence loss function ($\alpha$-Div) that offers concise implementation and stable optimization. Technical justifications for the proposed loss function are presented. The stability of the proposed loss function is demonstrated empirically, and the estimation accuracy of DRE tasks is investigated. Additionally, this study presents a sample requirement for DRE using the proposed loss function in terms of the upper bound of the $L_1$ error, which connects to the curse of dimensionality, a common problem in high-dimensional DRE tasks.