Guided by the principles of differential privacy protection, the Australian Bureau of Statistics modifies the data summaries from the Australian Census that it provides through TableBuilder to researchers at approved institutions. The modification algorithm injects a small amount of artificial noise into every nonzero cell count and then suppresses very small cell counts to zero. Researchers working with small-area TableBuilder outputs that have a high suppression fraction have proposed various algorithmic solutions for reconciling these with less suppressed outputs from larger enclosing areas. Here we propose that a Bayesian, likelihood-based statistical approach, in which the perturbation algorithm itself is explicitly represented, is well suited to analyses of such randomly perturbed data. Using both real (TableBuilder) and mock datasets representing dwelling classifications in the Perth Greater Capital City Area, we demonstrate the feasibility and utility of multi-scale Bayesian reconstruction of modified cell counts in a spatial setting.
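To make the idea of explicitly representing the perturbation algorithm inside a likelihood concrete, the following is a minimal sketch in Python. It is a toy model only and not the Australian Bureau of Statistics' actual procedure, whose parameters are not given here. The noise range (`NOISE_MAX`), the suppression threshold (`SUPPRESS_BELOW`), the uniform noise distribution, and the Poisson prior are all illustrative assumptions, and the function names are hypothetical.

```python
import math

# Hypothetical parameters for a toy perturbation model (assumptions,
# not the values used by TableBuilder).
NOISE_MAX = 2        # noise assumed uniform on {-2, ..., 2}
SUPPRESS_BELOW = 3   # perturbed counts below 3 assumed published as 0


def perturbation_likelihood(observed, true):
    """P(observed | true) under the toy noise-then-suppress model."""
    if true == 0:
        # Zero cells receive no noise in this model, so they stay zero.
        return 1.0 if observed == 0 else 0.0
    p_each = 1.0 / (2 * NOISE_MAX + 1)
    if observed == 0:
        # Every noisy value that falls below the threshold collapses to 0.
        n_suppressed = sum(
            1 for e in range(-NOISE_MAX, NOISE_MAX + 1)
            if true + e < SUPPRESS_BELOW
        )
        return n_suppressed * p_each
    if observed < SUPPRESS_BELOW:
        # Nonzero values below the threshold are never published.
        return 0.0
    return p_each if abs(observed - true) <= NOISE_MAX else 0.0


def poisson_pmf(n, rate):
    """Poisson probability mass, computed on the log scale (rate > 0)."""
    return math.exp(-rate + n * math.log(rate) - math.lgamma(n + 1))


def posterior_true_count(observed, prior_rate, max_count=50):
    """Posterior over true counts 0..max_count given one published cell."""
    weights = [
        poisson_pmf(n, prior_rate) * perturbation_likelihood(observed, n)
        for n in range(max_count + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]
```

Under these assumptions, a published zero does not imply a true zero: `posterior_true_count(0, prior_rate=1.5)` spreads probability over true counts from 0 to 4. A multi-scale analysis would additionally tie the small-area true counts to the count published for the enclosing area, which this single-cell sketch does not attempt.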