Interactive multi-objective optimization systems face a budget allocation dilemma: resources can be spent either on expensive objective evaluations or on eliciting decision-maker preferences that identify the relevant region of the Pareto set. Preference elicitation itself spans modalities that differ in information content and cognitive burden, ranging from cheap but noisy pairwise preference statements (PS) to richer but costlier indifference adjustments (IA). We study cost-aware optimization under an unknown scalarization and introduce QUIVER (Query-Informed Value Estimation for Regret), a surrogate-assisted evolutionary multi-objective optimizer that adaptively chooses between objective evaluations and heterogeneous preference queries. At each step, QUIVER selects the next action by maximizing the expected decision-quality improvement per unit of total cost. Across DTLZ and WFG benchmarks under synthetic decision-maker models, QUIVER achieves the lowest final utility regret on the challenging WFG problems (2.14 on WFG4 and 2.82 on WFG9, a 25% improvement over baselines), outperforming all single-modality baselines. We also analyze how the mix of PS and IA queries adapts to problem difficulty: on easy problems (DTLZ2), QUIVER selects 80% PS queries; on hard problems (WFG9), it shifts to 35% IA queries. This shift shows QUIVER trading query cost against information content as problem difficulty changes.
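The selection rule described above (pick the action with the largest expected decision-quality improvement per unit of total cost) can be illustrated with a minimal sketch. This is not QUIVER's implementation: the `Action` class, the `select_action` helper, and all numbers are hypothetical, and the abstract does not say how the expected improvement is estimated, so here it is simply given as an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A candidate action; every field here is an illustrative assumption."""
    name: str                    # e.g. "evaluate", "PS", or "IA"
    expected_improvement: float  # assumed estimate of decision-quality gain
    cost: float                  # total cost of taking the action (must be > 0)


def select_action(actions):
    """Return the action with the highest expected improvement per unit cost."""
    if not actions:
        raise ValueError("need at least one candidate action")
    for a in actions:
        if a.cost <= 0:
            raise ValueError(f"cost must be positive: {a.name}")
    return max(actions, key=lambda a: a.expected_improvement / a.cost)


# Made-up numbers, for illustration only (not taken from the paper).
candidates = [
    Action("evaluate", expected_improvement=0.50, cost=10.0),  # ratio 0.05
    Action("PS", expected_improvement=0.08, cost=1.0),         # ratio 0.08
    Action("IA", expected_improvement=0.30, cost=5.0),         # ratio 0.06
]
best = select_action(candidates)
print(best.name)
```

With these assumed values the cheap PS query wins despite having the smallest absolute gain, because the criterion is a ratio. If the assumed gain of an IA query rose relative to its cost, the same rule would pick IA instead.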