Many types of bounded data defined on the unit interval arise naturally as ratios of the form $X/(X + Y)$. In the existing literature, the main statistical models proposed for this type of bounded data are typically based on the assumption that the random variables $X$ and $Y$ are independent. However, this assumption is often unrealistic in practical applications, where $X$ and $Y$ tend to be correlated due to shared underlying mechanisms or common sources of variability. In this paper, we overcome this limitation and propose a model in which the marginal distributions of the two components are linked by a copula, leading to a more flexible and realistic representation of unit-interval data. In particular, in the proposed model, $X$ and $Y$ are dependent gamma random variables whose joint distribution is specified via Morgenstern's bivariate distribution, allowing for both positive and negative correlation between the components. The mathematical properties of the resulting distribution and its practical applications are rigorously investigated. The distribution exhibits a wide range of shapes, accommodating different degrees of skewness and, for some parameter configurations, more complex density structures. A Monte Carlo simulation study shows that the maximum likelihood estimator performs well across several parameter scenarios. The potential and limitations of efficient likelihood-based computations are also discussed. We evaluate the effectiveness of the new model and its estimates in modeling real-world datasets related to economics.
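To make the construction concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes that Morgenstern's bivariate distribution corresponds to the Farlie-Gumbel-Morgenstern copula $C(u, v) = uv[1 + \theta(1 - u)(1 - v)]$ with $\theta \in [-1, 1]$, and that both gamma marginals share a common scale; the abstract does not state the parametrization, so the function name `sample_fgm_gamma_ratio` and the arguments `shape_x`, `shape_y`, `theta`, and `scale` are hypothetical. The second uniform is drawn by inverting the conditional distribution $v + \theta(1 - 2u)\,v(1 - v)$ in closed form.

```python
import numpy as np
from scipy import stats


def sample_fgm_gamma_ratio(n, shape_x, shape_y, theta, scale=1.0, seed=None):
    """Draw (X, Y, Z) with Z = X / (X + Y).

    Assumptions (not taken from the paper): X and Y are gamma with shapes
    shape_x and shape_y and a common scale, coupled by an FGM copula with
    dependence parameter theta in [-1, 1].
    """
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    rng = np.random.default_rng(seed)

    # Step 1: independent uniforms; u is kept, w drives the conditional draw.
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)

    # Step 2: solve a*v^2 - (1 + a)*v + w = 0 for v, with a = theta*(1 - 2u).
    # The rationalized root below is also valid when a == 0 (then v == w).
    a = theta * (1.0 - 2.0 * u)
    disc = np.maximum((1.0 + a) ** 2 - 4.0 * a * w, 0.0)
    v = 2.0 * w / ((1.0 + a) + np.sqrt(disc))

    # Step 3: map the dependent uniforms to gamma marginals by inverse CDF.
    x = stats.gamma.ppf(u, a=shape_x, scale=scale)
    y = stats.gamma.ppf(v, a=shape_y, scale=scale)
    return x, y, x / (x + y)


# Usage: compare the ratio under negative, zero, and positive dependence.
for theta in (-1.0, 0.0, 1.0):
    x, y, z = sample_fgm_gamma_ratio(10_000, 2.0, 3.0, theta, seed=42)
    rho, _ = stats.spearmanr(x, y)
    print(f"theta={theta:+.1f}  Spearman rho={rho:+.3f}  mean(Z)={z.mean():.3f}")
```

Under these assumptions, $\theta = 0$ recovers the independent case, in which the ratio follows a beta distribution, and Spearman's rank correlation between $X$ and $Y$ equals $\theta/3$, so the dependence this copula can express is bounded in magnitude.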