In practice, the use of rounding is ubiquitous. Although researchers have studied the implications of rounding continuous random variables, rounding may also be applied to functions of discrete random variables. For example, to draw inference on excess suicide deaths after a national emergency, authorities may provide a rounded average of deaths before and after the emergency started. Suicide rates tend to be relatively low around the world, and such rounding may seriously affect inference on the change in the suicide rate. In this paper, we study the scenario in which an average rounded to the nearest integer is used as a proxy for a non-negative discrete random variable. Specifically, our interest is in drawing inference on a parameter of the pmf of Y when we observe U = n[Y/n] as a proxy for Y, where [·] denotes rounding to the nearest integer. The probability generating function of U, together with E(U) and Var(U), captures the effect of the coarsening of the support of Y. Moments and estimators of distribution parameters are also explored for some special cases. We show that under certain conditions, rounding has little impact. However, we also find scenarios in which rounding can significantly affect statistical inference, as demonstrated in three examples. The simple methods we propose partially counter the effects of rounding error. Although for some probability distributions it may be difficult to derive maximum likelihood estimators as a function of U, we provide a framework for obtaining an estimator numerically.
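To make the coarsening U = n[Y/n] concrete, the following minimal sketch computes the pmf of U and compares E(U) and Var(U) with the moments of Y. It is an illustration under stated assumptions, not the method of the paper: Y is assumed to be Poisson (the abstract does not name a distribution), ties are assumed to round half up (the tie rule is not specified), the support is truncated at a hypothetical bound `y_max`, and all function names are invented for this example.

```python
import math
from collections import defaultdict


def poisson_pmf(y, lam):
    """P(Y = y) for Y ~ Poisson(lam); the Poisson choice is an assumption."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def round_to_multiple(y, n):
    """Return n * [y / n], with [.] rounding to the nearest integer.

    Ties are rounded half up here (an assumption). Integer arithmetic
    computes floor(y / n + 1/2) without floating-point error.
    """
    return n * ((2 * y + n) // (2 * n))


def rounded_pmf(lam, n, y_max=500):
    """pmf of U = n[Y/n], summing P(Y = y) over all y mapped to each u.

    The support of Y is truncated at y_max (an assumption); the omitted
    tail mass is negligible when lam is far below y_max.
    """
    pmf_u = defaultdict(float)
    for y in range(y_max + 1):
        pmf_u[round_to_multiple(y, n)] += poisson_pmf(y, lam)
    return dict(pmf_u)


def mean_var(pmf):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(u * p for u, p in pmf.items())
    var = sum((u - mean) ** 2 * p for u, p in pmf.items())
    return mean, var


if __name__ == "__main__":
    lam, n = 2.0, 5  # illustrative values only
    mean_u, var_u = mean_var(rounded_pmf(lam, n))
    print(f"E(Y) = Var(Y) = {lam}")
    print(f"E(U) = {mean_u:.4f}, Var(U) = {var_u:.4f}")
```

Running the script for a small rate relative to n shows how far the moments of U can drift from those of Y, while n = 1 reproduces the moments of Y because no information is lost.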