Density power divergence (DPD) [Basu et al. (1998), Biometrika], which is designed to estimate the underlying distribution of the observations robustly against outliers, includes an integral of a power of the parametric density model being estimated. Although this integral has an explicit form for some specific densities (such as the normal and exponential densities), its computational intractability for other models has prevented DPD-based estimation from being applied to more general parametric densities for more than a quarter of a century since DPD was proposed. This study proposes a simple stochastic optimization approach to minimize DPD for general parametric density models and explains why it is adequate by referring to conventional theories of stochastic optimization. The proposed approach can also be applied to the minimization of another density power-based divergence, the $\gamma$-divergence, with the aid of unnormalized models.
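To make the idea concrete, the following is a minimal sketch of one way a stochastic optimization scheme for DPD could look. It is not necessarily the algorithm proposed in this study. It assumes the empirical DPD objective $L_\beta(\theta) = \int f_\theta(x)^{1+\beta}\,dx - (1 + 1/\beta)\,\frac{1}{n}\sum_{i} f_\theta(x_i)^{\beta}$ and uses the identity $\nabla_\theta \int f_\theta^{1+\beta}\,dx = (1+\beta)\,\mathbb{E}_{Y \sim f_\theta}[f_\theta(Y)^{\beta}\,\nabla_\theta \log f_\theta(Y)]$, so that the intractable integral is replaced by an average over samples drawn from the current model. All function names, the step-size schedule, and the hyperparameters are hypothetical. A normal model is used only because its behavior is easy to check; the same gradient estimator needs nothing beyond a sampler, a log-density, and its score.

```python
import numpy as np

def log_density(x, theta):
    # Normal log-density with theta = (mu, log_sigma); illustrative model choice.
    mu, log_sigma = theta
    z = (x - mu) / np.exp(log_sigma)
    return -0.5 * z ** 2 - log_sigma - 0.5 * np.log(2.0 * np.pi)

def score(x, theta):
    # Gradient of log_density with respect to (mu, log_sigma), shape (len(x), 2).
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    z = (x - mu) / sigma
    return np.stack([z / sigma, z ** 2 - 1.0], axis=1)

def sample_model(theta, size, rng):
    # Draw samples from the current model f_theta.
    mu, log_sigma = theta
    return mu + np.exp(log_sigma) * rng.standard_normal(size)

def stochastic_dpd_gradient(theta, x_batch, y_model, beta):
    # Unbiased estimate of the gradient of the assumed DPD objective:
    # (1 + beta) * ( E_{Y~f_theta}[f^beta * score] - mean_i[f(x_i)^beta * score] ).
    w_y = np.exp(beta * log_density(y_model, theta))[:, None]
    w_x = np.exp(beta * log_density(x_batch, theta))[:, None]
    model_term = (w_y * score(y_model, theta)).mean(axis=0)
    data_term = (w_x * score(x_batch, theta)).mean(axis=0)
    return (1.0 + beta) * (model_term - data_term)

def fit_dpd_sgd(x, beta=0.5, steps=5000, batch=64, n_model=64, lr0=0.5, seed=0):
    # Plain SGD with a decaying step size (hypothetical schedule).
    rng = np.random.default_rng(seed)
    theta = np.array([np.median(x), np.log(np.std(x))])
    for t in range(steps):
        x_batch = rng.choice(x, size=batch)          # minibatch of observations
        y_model = sample_model(theta, n_model, rng)  # fresh samples from f_theta
        lr = lr0 / (1.0 + t) ** 0.6
        theta = theta - lr * stochastic_dpd_gradient(theta, x_batch, y_model, beta)
    return theta

# Usage: 95% standard normal observations plus 5% outliers near 10.
rng = np.random.default_rng(123)
x = np.concatenate([rng.standard_normal(1900), 10.0 + 0.1 * rng.standard_normal(100)])
mu_hat, log_sigma_hat = fit_dpd_sgd(x)
print("sample mean / std:", x.mean(), x.std())
print("DPD estimate of mean / std:", mu_hat, np.exp(log_sigma_hat))
```

In this sketch the outliers pull the sample mean and standard deviation away from the bulk of the data, whereas the DPD fit is expected to be much less affected, because observations with a small model density receive a small weight $f_\theta(x_i)^{\beta}$ in the gradient.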