In some applied settings, access to the complete data is restricted, often because of privacy concerns, and only aggregated, robust, and inefficient statistics derived from the data are available. These robust statistics are not sufficient, but they are less sensitive to outliers and, owing to their higher breakdown point, offer enhanced data protection. In this article, working within a parametric framework, we propose a method to sample from the posterior distribution of the parameters conditional on different robust and inefficient statistics: specifically, the pair (median, MAD), the pair (median, IQR), or one or more quantiles, where MAD denotes the median absolute deviation and IQR the interquartile range. Using a Gibbs sampler together with the simulation of latent augmented data, our approach enables sampling from the posterior distribution of the parameters for specific families of distributions. We demonstrate its applicability on the Gaussian, Cauchy, and translated Weibull families.
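As a small illustration of the summary statistics named above, the following sketch computes the median, MAD, and IQR of a sample and compares them before and after contamination with gross outliers. It is not the proposed Gibbs sampler; the function name `robust_summaries`, the simulated Gaussian data, and the 5% contamination level are assumptions chosen only for this example.

```python
import random
import statistics


def robust_summaries(xs):
    """Return the median, MAD, and IQR of a sample (hypothetical helper)."""
    med = statistics.median(xs)
    # Median absolute deviation: median of absolute deviations from the median.
    mad = statistics.median([abs(x - med) for x in xs])
    # Quartiles via the standard library (default 'exclusive' method).
    q1, _, q3 = statistics.quantiles(xs, n=4)
    return {"median": med, "mad": mad, "iqr": q3 - q1}


random.seed(0)
# Assumed data: 1001 draws from a Gaussian with mean 2.0 and sd 1.5.
clean = [random.gauss(2.0, 1.5) for _ in range(1001)]
# Replace about 5% of the observations with gross outliers.
contaminated = [1e6] * 50 + clean[50:]

s_clean = robust_summaries(clean)
s_cont = robust_summaries(contaminated)

print("clean       :", s_clean, "mean =", statistics.fmean(clean))
print("contaminated:", s_cont, "mean =", statistics.fmean(contaminated))
```

Under these assumptions, the three robust summaries should change only slightly under contamination, whereas the sample mean is dominated by the outliers. This is the reduced sensitivity to outliers that the abstract refers to.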