We propose a demand estimation method that leverages unstructured text and image data to infer substitution patterns. Using pre-trained deep learning models, we extract embeddings from product images and textual descriptions and incorporate them into a random coefficients logit model. This approach enables researchers to estimate demand even when they lack data on product attributes or when consumers value hard-to-quantify attributes, such as visual design or functional benefits. Using data from a choice experiment, we show that our approach outperforms standard attribute-based models in counterfactual predictions of consumers' second choices. We also apply it across 40 product categories on Amazon.com and consistently find that text and image data help identify close substitutes within each category.
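The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' estimation code: the function names are hypothetical, the "embeddings" are synthetic random vectors standing in for the output of a pre-trained text or image model, the parameters `delta` and `sigma` are fixed by hand rather than estimated, and there is no outside option. The sketch only shows how principal components of embeddings, used as characteristics with random coefficients, generate second-choice (diversion) patterns that favor products close together in embedding space, whereas a plain logit spreads second choices evenly.

```python
import numpy as np

def principal_components(embeddings, n_components):
    """Project mean-centered embeddings onto their top principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # Rows of vt are the principal directions of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def choice_probs(delta, x, sigma, draws):
    """Logit choice probabilities for each simulated consumer, shape (R, J).

    delta: (J,) mean utilities; x: (J, K) characteristics (embedding PCs);
    sigma: (K,) std. devs. of random coefficients; draws: (R, K) N(0, 1) draws.
    """
    utility = delta[None, :] + (draws * sigma[None, :]) @ x.T
    utility -= utility.max(axis=1, keepdims=True)  # numerical stability
    expu = np.exp(utility)
    return expu / expu.sum(axis=1, keepdims=True)

def second_choice_matrix(delta, x, sigma, draws):
    """Entry (j, k): probability that k is the second choice given j is first."""
    p = choice_probs(delta, x, sigma, draws)
    # For a given consumer, removing j rescales the remaining logit
    # probabilities to p_k / (1 - p_j); weight by p_j and sum over consumers.
    joint = np.einsum("rj,rk->jk", p / (1.0 - p), p)
    np.fill_diagonal(joint, 0.0)
    return joint / p.sum(axis=0)[:, None]

# Synthetic stand-in for pre-trained embeddings (assumption): products 0 and 1
# are near one point in embedding space, products 2 and 3 near another.
rng = np.random.default_rng(0)
base_a, base_b = rng.normal(size=8), rng.normal(size=8)
embeddings = np.stack([base_a, base_a, base_b, base_b])
embeddings = embeddings + 0.05 * rng.normal(size=embeddings.shape)

x = principal_components(embeddings, n_components=1)
delta = np.zeros(4)                    # equal mean utilities, chosen by hand
draws = rng.normal(size=(2000, 1))     # simulated consumer taste draws

diversion = second_choice_matrix(delta, x, np.array([1.0]), draws)
diversion_iia = second_choice_matrix(delta, x, np.array([0.0]), draws)

print(np.round(diversion, 3))      # second choices concentrate within clusters
print(np.round(diversion_iia, 3))  # plain logit: equal shares off the diagonal
```

Setting `sigma` to zero collapses the model to a plain logit, so comparing the two matrices isolates what the embedding-based random coefficient contributes to substitution patterns. In an actual application, `delta` and `sigma` would be estimated from choice or market data rather than fixed.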