Eliciting information from surveys and other collective assessments to reduce uncertainty about latent group-level properties requires allocating limited questioning effort under real costs and missing data. Although large language models enable adaptive, multi-turn interactions in natural language, most existing elicitation methods optimize only what to ask of a fixed respondent pool; they neither adapt respondent selection nor exploit population structure when responses are incomplete. To address this gap, we study adaptive group elicitation, a multi-round setting in which an agent adaptively selects both questions and respondents under explicit query and participation budgets. We propose a theoretically grounded framework that combines (i) an LLM-based expected information gain objective for scoring candidate questions with (ii) heterogeneous graph neural network propagation that aggregates observed responses and participant attributes to impute missing responses and guide per-round respondent selection. This closed-loop procedure queries a small, informative subset of individuals and infers population-level responses via structured similarity. Across three real-world opinion datasets, our method consistently improves population-level response prediction under constrained budgets, including a relative gain of more than 12% on CES at a 10% respondent budget.
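As a rough illustration of component (i), the sketch below scores candidate questions by expected information gain over a small discrete set of latent hypotheses. It is not the framework's implementation: the abstract does not specify how the LLM produces answer distributions, so hand-written probability tables stand in for them here, and the names `expected_information_gain`, `select_question`, `q_informative`, and `q_uninformative` are hypothetical. The graph-based imputation and respondent selection of component (ii) are not covered.

```python
import math
from typing import Dict, List, Sequence


def entropy(p: Sequence[float]) -> float:
    """Shannon entropy in nats; zero-probability terms contribute 0."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def expected_information_gain(
    prior: Sequence[float], likelihoods: Sequence[Sequence[float]]
) -> float:
    """EIG = H(marginal answer distribution) - E_prior[H(answer | hypothesis)].

    prior[h]          : probability of latent hypothesis h
    likelihoods[h][a] : probability of answer a under hypothesis h
                        (assumption: in the paper's setting these would come
                        from an LLM; here they are supplied by hand)
    """
    n_hyp = len(prior)
    n_answers = len(likelihoods[0])
    # Marginal probability of each answer, averaging over hypotheses.
    marginal = [
        sum(prior[h] * likelihoods[h][a] for h in range(n_hyp))
        for a in range(n_answers)
    ]
    # Expected answer entropy once the hypothesis is known.
    conditional = sum(prior[h] * entropy(likelihoods[h]) for h in range(n_hyp))
    return entropy(marginal) - conditional


def select_question(
    prior: Sequence[float], candidates: Dict[str, List[List[float]]]
) -> str:
    """Return the name of the candidate question with the highest EIG."""
    return max(
        candidates,
        key=lambda name: expected_information_gain(prior, candidates[name]),
    )


# Two hypothetical questions over two equally likely latent hypotheses.
prior = [0.5, 0.5]
candidates = {
    "q_informative": [[0.9, 0.1], [0.1, 0.9]],    # answers depend on hypothesis
    "q_uninformative": [[0.5, 0.5], [0.5, 0.5]],  # answers carry no signal
}
best = select_question(prior, candidates)
```

Under these toy tables, the question whose answers depend on the hypothesis is selected, while the question with identical answer distributions under both hypotheses has zero expected information gain.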