The 2020 Census Disclosure Avoidance System (DAS) is a formally private mechanism that first adds independent noise to cross tabulations for a set of pre-specified hierarchical geographic units, known collectively as the geographic spine. After post-processing these noisy measurements, DAS outputs a formally private database whose fields indicate location in the standard census geographic spine, which consists of the United States as a whole, states, counties, census tracts, block groups, and census blocks. This paper describes how the geographic spine used internally within DAS to define the initial noisy measurements affects the accuracy of the output database. Specifically, tabulations tend to be most accurate for geographic areas that both 1) can be derived by aggregating geographic units above the block level of the internal spine, and 2) are closer to the geographic units of the internal spine. After describing the accuracy tradeoffs relevant to the choice of internal DAS geographic spine, we provide the settings used to define the 2020 Census production DAS runs.
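The accuracy claim in the abstract can be illustrated with a deliberately simplified sketch. It assumes each spine unit receives one independent, zero-mean Gaussian noise draw with the same scale, and it ignores the post-processing step entirely. The production DAS differs in both respects, so the names `noisy_total`, `empirical_error_variance`, and `SIGMA`, along with the counts used, are hypothetical and chosen only for illustration. Under these assumptions, an area that matches a single spine unit carries the error of one noise draw, while an area that must be assembled from ten lower-level units accumulates ten draws.

```python
import random
import statistics


def noisy_total(true_counts, sigma, rng):
    """Sum the counts after adding one independent Gaussian draw to each unit."""
    return sum(count + rng.gauss(0.0, sigma) for count in true_counts)


def empirical_error_variance(true_counts, sigma, trials=20000, seed=0):
    """Estimate the variance of (noisy total - true total) by simulation."""
    rng = random.Random(seed)
    truth = sum(true_counts)
    errors = [noisy_total(true_counts, sigma, rng) - truth for _ in range(trials)]
    return statistics.pvariance(errors)


SIGMA = 2.0  # hypothetical noise scale, identical for every unit

# An area that coincides with one unit of the internal spine:
# a single noisy measurement covers the whole area.
on_spine = [500]

# An area of the same size that is off the spine and must be built
# by summing ten separately noised lower-level units.
off_spine = [50] * 10

var_on = empirical_error_variance(on_spine, SIGMA)
var_off = empirical_error_variance(off_spine, SIGMA)

print(f"on-spine error variance:  {var_on:.2f}")
print(f"off-spine error variance: {var_off:.2f}")
```

Because the draws are independent, the error variance of the summed total grows roughly in proportion to the number of units aggregated (about `SIGMA**2` versus `10 * SIGMA**2` here), which is one intuition for why areas near the internal spine tend to be tabulated more accurately.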