It is well known that the relationship between variables at the individual level can differ from the relationship between those same variables aggregated over individuals. In this paper, I develop a methodology to partially identify linear combinations of conditional means of individual-level outcomes of interest, without imposing parametric assumptions, when the researcher has access only to aggregate data. I construct identified sets using an optimization program that allows researchers to impose additional shape and data restrictions. I also provide consistency results and construct an inference procedure that is valid with data that provide only marginal information about each variable. I apply the methodology to simulated and real-world data sets and find that the estimated identified sets are too wide to be useful, but narrow as more assumptions are imposed and as data aggregated at a finer level become available.
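To make the idea of partial identification from marginal information concrete, the following is a minimal sketch and not the optimization program developed in the paper. It assumes a binary outcome Y and a binary covariate X, and uses the classical Fréchet-Hoeffding bounds on the joint probability P(X=1, Y=1) to bound the conditional mean E[Y | X=1] when only the marginals P(X=1) and P(Y=1) are observed. The function name and inputs are hypothetical and chosen only for illustration.

```python
def frechet_bounds_conditional_mean(p_x, p_y):
    """Bounds on E[Y | X=1] = P(Y=1 | X=1) for binary X and Y,
    given only the marginals p_x = P(X=1) and p_y = P(Y=1).

    Illustrative assumption: no restrictions beyond the marginals,
    so the joint P(X=1, Y=1) can be anywhere in its Frechet-Hoeffding range.
    """
    if not (0.0 < p_x <= 1.0):
        raise ValueError("p_x must lie in (0, 1] so that conditioning on X=1 is defined")
    if not (0.0 <= p_y <= 1.0):
        raise ValueError("p_y must lie in [0, 1]")

    # Frechet-Hoeffding bounds on the joint probability P(X=1, Y=1).
    joint_lower = max(0.0, p_x + p_y - 1.0)
    joint_upper = min(p_x, p_y)

    # Dividing by P(X=1) turns bounds on the joint into bounds on the conditional mean.
    return joint_lower / p_x, joint_upper / p_x


if __name__ == "__main__":
    for p_x, p_y in [(0.5, 0.5), (0.8, 0.9)]:
        lower, upper = frechet_bounds_conditional_mean(p_x, p_y)
        print(f"P(X=1)={p_x}, P(Y=1)={p_y}: E[Y|X=1] in [{lower:.3f}, {upper:.3f}]")
```

With both marginals equal to 0.5, the bounds span the whole unit interval, which illustrates how uninformative marginal data can be without further restrictions. With more extreme marginals, the interval shrinks.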