We consider the nonparametric logistic regression problem. In logistic regression, one usually considers the maximum likelihood estimator, and the excess risk is the expectation of the Kullback-Leibler (KL) divergence between the true and estimated conditional class probabilities. In the nonparametric setting, however, the KL divergence can easily diverge, so convergence of the excess risk is difficult to prove or may fail to hold. Several existing studies establish convergence of the KL divergence under strong assumptions. In most cases, the goal is to estimate the true conditional class probabilities. Thus, instead of analyzing the excess risk itself, it suffices to show the consistency of the maximum likelihood estimator in a suitable metric. In this paper, using a simple unified approach for analyzing the nonparametric maximum likelihood estimator (NPMLE), we directly derive convergence rates of the NPMLE in the Hellinger distance under mild assumptions. Although our results are similar to those of some existing studies, we provide simpler and more direct proofs. As an important application, we derive convergence rates of the NPMLE with deep neural networks and show that the derived rate nearly achieves the minimax optimal rate.
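To illustrate why the KL divergence can diverge while the Hellinger distance cannot, the following minimal sketch compares the two at a single covariate value, treating the true and estimated class probabilities as Bernoulli parameters. The function names, the example values, and the normalization of the Hellinger distance to [0, 1] are assumptions made for illustration; they are not taken from the paper.

```python
import math


def bernoulli_kl(p, q):
    """KL(Bern(p) || Bern(q)), using the convention 0 * log(0 / q) = 0."""
    total = 0.0
    for a, b in ((p, q), (1.0 - p, 1.0 - q)):
        if a == 0.0:
            continue  # a zero-probability outcome contributes nothing
        if b == 0.0:
            return math.inf  # the estimate rules out an outcome that can occur
        total += a * math.log(a / b)
    return total


def bernoulli_hellinger(p, q):
    """Hellinger distance between Bern(p) and Bern(q), scaled to lie in [0, 1]."""
    sq = (math.sqrt(p) - math.sqrt(q)) ** 2
    sq += (math.sqrt(1.0 - p) - math.sqrt(1.0 - q)) ** 2
    return math.sqrt(0.5 * sq)


# Hypothetical true class probability and increasingly extreme estimates.
p_true = 0.5
for q_hat in (0.1, 1e-3, 1e-6, 0.0):
    kl = bernoulli_kl(p_true, q_hat)
    h = bernoulli_hellinger(p_true, q_hat)
    print(f"q_hat={q_hat:g}  KL={kl:.4f}  Hellinger={h:.4f}")
```

As the estimated probability approaches 0, the KL divergence grows without bound and is infinite at exactly 0, whereas the Hellinger distance stays below 1. This is the pointwise behavior that motivates measuring convergence in the Hellinger distance.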