We introduce an iterative discrete information production process in which ordered normalised vectors are extended with new elements via a simple affine transformation, while preserving a predefined level of inequality, G, as measured by the Gini index. We then derive the family of empirical Lorenz curves of the corresponding vectors and prove that it is stochastically ordered with respect to both the sample size and G, which plays the role of the uncertainty parameter. We prove that, asymptotically, we obtain all, and only, the Lorenz curves generated by a new, intuitive parametrisation of the finite-mean Pickands' Generalised Pareto Distribution (GPD), which unifies three other families: the Pareto Type II, exponential, and scaled beta distributions. The family is not only totally ordered with respect to the parameter G but also, thanks to our derivations, has a natural underlying interpretation. Our result may thus shed new light on the genesis of this family of distributions. Our model fits bibliometric, informetric, socioeconomic, and environmental data reasonably well. It is also easy to use, as it depends only on the sample size and the sample's Gini index.
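To make the two inputs of the model concrete, the following is a minimal Python sketch of how a sample's Gini index and empirical Lorenz curve might be computed. It is not taken from the paper: the function names (`gini`, `lorenz_points`, `gpd_shape_from_gini`) are hypothetical, the Gini index is assumed to be the normalised sample version (0 for equal values, 1 when a single element holds everything), and the Lorenz curve uses the conventional increasing ordering, which may differ from the orientation the authors use. The last helper relies on the standard closed form G = 1/(2 - ξ) for a GPD with shape ξ < 1; that it matches the paper's parametrisation is an assumption.

```python
def gini(x):
    """Normalised sample Gini index (assumed definition, see lead-in)."""
    xs = sorted(x, reverse=True)          # order values decreasingly
    n, total = len(xs), sum(xs)
    if n < 2 or total <= 0:
        raise ValueError("need at least two values with a positive sum")
    # Weighted sum with weights n - 2i + 1 for i = 1..n, scaled to [0, 1].
    weighted = sum((n - 2 * i + 1) * v for i, v in enumerate(xs, start=1))
    return weighted / ((n - 1) * total)


def lorenz_points(x):
    """Points (k/n, share held by the k smallest values) for k = 0..n."""
    xs = sorted(x)                        # conventional increasing order
    n, total = len(xs), sum(xs)
    points, running = [(0.0, 0.0)], 0.0
    for k, v in enumerate(xs, start=1):
        running += v
        points.append((k / n, running / total))
    return points


def gpd_shape_from_gini(g):
    """Shape xi of a finite-mean GPD whose Gini index is g, via g = 1/(2 - xi)."""
    if not 0.0 < g < 1.0:
        raise ValueError("g must lie strictly between 0 and 1")
    return 2.0 - 1.0 / g


sample = [3, 2, 1]
print(gini(sample))            # Gini index of the toy sample
print(lorenz_points(sample))   # Lorenz curve from (0, 0) to (1, 1)
print(gpd_shape_from_gini(0.5))  # G = 1/2 corresponds to the exponential case
```

Under this closed form, G above 1/2 gives a positive shape (Pareto Type II), G equal to 1/2 gives shape zero (exponential), and G below 1/2 gives a negative shape (bounded support, as for the scaled beta case), which is consistent with the three families named in the abstract.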