Semi- and non-parametric mixtures of regressions are a flexible class of mixtures of regressions in which some or all of the parameters are non-parametric functions of the covariates. These models, however, assume Gaussian component error distributions, so their estimation is sensitive to outliers and heavy-tailed errors. In this paper, we propose semi- and non-parametric contaminated Gaussian mixtures of regressions to robustly estimate the parametric and/or non-parametric terms of the models in the presence of mild outliers. The virtue of a contaminated Gaussian error distribution is that it allows model-based clustering of observations and model-based outlier detection to be performed simultaneously. We propose two algorithms, an expectation-maximization (EM)-type algorithm and an expectation-conditional-maximization (ECM)-type algorithm, to perform maximum likelihood and local-likelihood kernel estimation of the parametric and non-parametric terms of the proposed models, respectively. The robustness of the proposed models is examined in an extensive simulation study, and their practical utility is demonstrated on real data.
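
To illustrate how a contaminated Gaussian error distribution can support outlier detection, the following is a minimal sketch for a single component. It assumes the commonly used two-part parametrization, which the abstract itself does not spell out: a proportion `alpha` of "good" points with variance `var`, and the remaining points drawn from a Gaussian with the same mean and inflated variance `eta * var` (with `eta > 1`). The function names and parameter names are illustrative assumptions, not the authors' notation or implementation.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def contaminated_pdf(y, mean, var, alpha, eta):
    """Contaminated Gaussian density (assumed parametrization).

    alpha: proportion of good points, in (0, 1)
    eta:   variance inflation factor for the contaminating part, eta > 1
    """
    good = alpha * normal_pdf(y, mean, var)
    bad = (1.0 - alpha) * normal_pdf(y, mean, eta * var)
    return good + bad


def prob_good(y, mean, var, alpha, eta):
    """Posterior probability that y comes from the good (non-inflated) part."""
    good = alpha * normal_pdf(y, mean, var)
    return good / contaminated_pdf(y, mean, var, alpha, eta)


def is_outlier(y, mean, var, alpha, eta):
    """Flag y as an outlier when the good part is less likely than the bad part."""
    return prob_good(y, mean, var, alpha, eta) < 0.5
```

In a mixture of regressions, `mean` would be the fitted regression value for the component an observation is assigned to, so a rule of this kind could be applied after clustering. An observation near the fitted value is kept as a good point, while one far in the tail is flagged.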


