Empirical models of multi-product demand rely on low-dimensional product representations to capture substitution patterns, increasingly using proxies built from unstructured data. When proxies are imperfect, standard workflows yield biased counterfactuals and invalid inference. We develop a practical toolkit to address these issues. Our methods apply to market-level and/or individual data, require minimal additional computation, provide simple standard-error formulas, and accommodate proxies from fine-tuned models. Further, we propose diagnostics to assess proxy quality. Our methods yield meaningful improvements in predicting substitution in empirically calibrated simulations and in an application where we assess counterfactual prediction performance against a ground truth.