Consumer research costs companies billions of dollars annually, yet suffers from panel biases and limited scale. Large language models (LLMs) offer an alternative by simulating synthetic consumers, but they produce unrealistic response distributions when asked directly for numerical ratings. We present semantic similarity rating (SSR), a method that elicits textual responses from LLMs and maps them to Likert distributions via embedding similarity to reference statements. Tested on an extensive dataset of 57 personal care product surveys (9,300 human responses) conducted by a leading corporation in that category, SSR achieves 90% of human test-retest reliability while maintaining realistic response distributions (KS similarity > 0.85). These synthetic respondents additionally provide rich qualitative feedback explaining their ratings. The framework enables scalable consumer research simulations while preserving traditional survey metrics and interpretability.
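To make the SSR mapping concrete, below is a minimal Python sketch under stated assumptions: it uses an off-the-shelf sentence-embedding model, illustrative reference statements for a 5-point purchase-intent scale (the paper's actual anchor phrasings may differ), and a temperature-scaled softmax over cosine similarities as one plausible normalization scheme, not necessarily the paper's exact choice.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# Illustrative reference statements anchoring each point of a 5-point
# purchase-intent Likert scale; these are assumptions, not the paper's exact anchors.
REFERENCE_STATEMENTS = [
    "I would definitely not buy this product.",   # 1
    "I would probably not buy this product.",     # 2
    "I might or might not buy this product.",     # 3
    "I would probably buy this product.",         # 4
    "I would definitely buy this product.",       # 5
]

model = SentenceTransformer("all-MiniLM-L6-v2")

def ssr_distribution(llm_response: str, temperature: float = 0.05) -> np.ndarray:
    """Map a free-text LLM response to a probability distribution over Likert points."""
    vectors = model.encode([llm_response] + REFERENCE_STATEMENTS)
    response, refs = vectors[0], vectors[1:]
    # Cosine similarity between the response and each reference statement.
    sims = refs @ response / (np.linalg.norm(refs, axis=1) * np.linalg.norm(response))
    # Temperature-scaled softmax turns similarities into a Likert distribution
    # (one plausible normalization; the paper's exact scheme may differ).
    logits = sims / temperature
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

dist = ssr_distribution("It smells nice, but I'm not sure it's worth the price.")
print(dist.round(3))  # probability mass should concentrate mid-scale for hedged text
```

Because the output is a full distribution rather than a single forced rating, aggregating it across many synthetic respondents can preserve the spread of real survey data, which is what the KS-similarity comparison above measures.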