Zero-inflated continuous data arise in many fields: a large share of observations are exactly zero, while the remaining observations are continuously distributed. Because the distribution mixes discrete and continuous components, statistical analysis is challenging, especially in the multivariate case. In this paper, we propose two copula-based density estimation models that can capture multivariate correlation among zero-inflated continuous variables. To overcome the difficulty that the tied-data problem in zero-inflated data poses for the use of copulas, we propose a new type of copula, the rectified Gaussian copula, and present efficient methods for parameter estimation and likelihood computation. Numerical experiments demonstrate the superiority of our proposals over conventional density estimation methods.
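The tied-data problem mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model or estimator: it assumes, purely for illustration, that each zero-inflated variable is obtained by clipping a correlated latent Gaussian at zero. The function names and the parameters `rho` and `shift` are hypothetical. It shows that all zero observations receive the same empirical-CDF value, so the pseudo-observations usually fed to a copula carry a point mass instead of being approximately uniform on (0, 1).

```python
import bisect
import math
import random


def sample_rectified_pair(n, rho, shift=0.0, seed=0):
    """Draw n pairs from a correlated latent Gaussian and clip negatives to zero.

    Assumption for illustration only: zero inflation is produced by
    rectifying (clipping) a latent Gaussian. rho is the latent correlation
    and shift moves the latent mean, which controls the share of zeros.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((max(0.0, z1 + shift), max(0.0, z2 + shift)))
    return pairs


def zero_fraction(xs):
    """Share of observations that are exactly zero."""
    return sum(1 for x in xs if x == 0.0) / len(xs)


def empirical_cdf_values(xs):
    """Return F_n(x_i) = #{x_j <= x_i} / n for every observation x_i."""
    sorted_xs = sorted(xs)
    n = len(xs)
    return [bisect.bisect_right(sorted_xs, x) / n for x in xs]


pairs = sample_rectified_pair(n=5000, rho=0.6)
x1 = [p[0] for p in pairs]
u1 = empirical_cdf_values(x1)

# Every zero is tied, so all zeros map to the same pseudo-observation,
# which equals the zero fraction; nothing falls below it.
print("zero fraction:", zero_fraction(x1))
print("smallest pseudo-observation:", min(u1))
print("distinct pseudo-observations:", len(set(u1)), "of", len(u1))
```

With `shift=0.0` roughly half of the latent draws are negative, so about half of the observations are zero and the pseudo-observations cover only the upper part of the unit interval. This is the situation in which a standard rank-based copula fit is no longer well defined without further treatment of the ties.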