We solve the problem of estimating the distribution of presumed i.i.d. observations under the total variation loss. Our approach is based on density models and is versatile enough to handle many of them, including some for which the Maximum Likelihood Estimator (MLE for short) does not exist. We mainly illustrate the properties of our estimator on models of densities on the line that satisfy a shape constraint. We show that, with regard to global rates of convergence, it enjoys optimality properties similar to those of the MLE when the latter exists. It also adapts to some specific target densities in the model, for which we prove that it converges at a parametric rate. More importantly, our estimator is robust, not only to model misspecification, but also to contamination, to the presence of outliers in the dataset, and to departures from the equidistribution assumption. This means that the estimator performs almost as well as if the data were i.i.d. with density $p$ when the data are merely independent and most of their marginals are close enough in total variation to a distribution with density $p$. We also show that our estimator converges to the average density of the data when this density belongs to the model, even if none of the marginal densities does. Our main result on the risk of the estimator takes the form of an exponential deviation inequality that is non-asymptotic and involves explicit numerical constants. From it we deduce several global rates of convergence, including bounds on the minimax $\mathbb{L}_{1}$-risks over the sets of concave and log-concave densities. These bounds derive from specific results on the approximation of densities that are monotone, convex, concave, or log-concave. Such results may be of independent interest.
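To make the contamination setting concrete, the following minimal sketch checks numerically a standard fact behind the robustness claim: if a fraction $\varepsilon$ of the mass comes from an outlier density $q$, the mixture $(1-\varepsilon)p+\varepsilon q$ lies within total variation distance $\varepsilon$ of $p$. It is not the estimator studied here. The Gaussian choices for $p$ and $q$, the value of `eps`, the grid, and the helper names `tv_distance` and `normal_pdf` are all illustrative assumptions, and the total variation distance is taken as half the $\mathbb{L}_{1}$-distance between densities.

```python
import numpy as np

def tv_distance(p, q, dx):
    """Total variation distance between two densities sampled on a uniform
    grid, approximated by a Riemann sum of half the L1 distance."""
    return 0.5 * np.sum(np.abs(p - q)) * dx

def normal_pdf(x, mean, sd):
    """Gaussian density (hypothetical helper used only for this illustration)."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

# Uniform grid wide enough to carry essentially all the mass of both densities.
x = np.linspace(-20.0, 20.0, 40001)
dx = x[1] - x[0]

p = normal_pdf(x, 0.0, 1.0)   # assumed target density (log-concave)
q = normal_pdf(x, 8.0, 0.5)   # assumed outlier density, far from p
eps = 0.05                    # assumed contamination proportion

# Contaminated density: a fraction eps of the mass comes from q.
mix = (1.0 - eps) * p + eps * q

tv_pq = tv_distance(p, q, dx)     # distance between p and the outlier density
tv_mix = tv_distance(mix, p, dx)  # equals eps * tv_pq, hence at most eps

print(f"TV(p, q)   = {tv_pq:.6f}")
print(f"TV(mix, p) = {tv_mix:.6f}")
```

Under these assumptions the contaminated density stays within `eps` of `p` in total variation however far the outliers lie, which is why a loss based on total variation is a natural yardstick for this kind of robustness.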