Consider the nonparametric logistic regression problem. In logistic regression, we usually consider the maximum likelihood estimator, and the excess risk is the expectation of the Kullback-Leibler (KL) divergence between the true and estimated conditional class probabilities. However, in nonparametric logistic regression, the KL divergence can easily diverge, and thus the convergence of the excess risk is difficult to prove or does not hold. Several existing studies show convergence of the KL divergence under strong assumptions. In most cases, our goal is to estimate the true conditional class probabilities; thus, instead of analyzing the excess risk itself, it suffices to show the consistency of the maximum likelihood estimator in some suitable metric. In this paper, using a simple unified approach for analyzing the nonparametric maximum likelihood estimator (NPMLE), we directly derive convergence rates of the NPMLE in the Hellinger distance under mild assumptions. Although our results are similar to those in some existing studies, we provide simple and more direct proofs. As an important application, we derive convergence rates of the NPMLE with fully connected deep neural networks and show that the derived rate nearly achieves the minimax optimal rate.
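To make the two metrics concrete, here is a sketch of their standard forms in this setting; the symbols $p_0$ (the true conditional class probability $P(Y=1 \mid X=x)$), $\hat p$ (its estimator), $\mathcal{E}$, and $d_H$ are notation introduced here for illustration and do not appear in the original text. The excess risk and the squared Hellinger distance (with the $1/2$ convention for Bernoulli conditionals) read
\[
\mathcal{E}(\hat p) \;=\; \mathbb{E}_X\!\left[\, p_0(X)\log\frac{p_0(X)}{\hat p(X)} \;+\; \bigl(1-p_0(X)\bigr)\log\frac{1-p_0(X)}{1-\hat p(X)} \,\right],
\]
\[
d_H^2(p_0,\hat p) \;=\; \frac{1}{2}\,\mathbb{E}_X\!\left[\, \Bigl(\sqrt{p_0(X)}-\sqrt{\hat p(X)}\Bigr)^2 \;+\; \Bigl(\sqrt{1-p_0(X)}-\sqrt{1-\hat p(X)}\Bigr)^2 \,\right].
\]
The contrast is visible directly from these expressions: the Hellinger term is uniformly bounded (by $1$ under this convention), whereas the KL integrand blows up as $\hat p(X)$ approaches $0$ or $1$ at points where $p_0(X)$ stays bounded away from both, which is why the excess risk can diverge while Hellinger consistency remains provable.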