In learning to defer, a predictor identifies risky decisions and defers them to a human expert. One key issue with this setup is that the expert may end up over-relying on the machine's decisions because of anchoring bias. At the same time, whenever the machine chooses the deferral option, the expert has to make the decision entirely unassisted. As a remedy, we propose learning to guide (LTG), an alternative framework in which -- rather than suggesting ready-made decisions -- the machine provides guidance that supports decision-making, and the human is entirely responsible for coming up with a decision. We also introduce SLOG, an LTG implementation that leverages (a small amount of) human supervision to convert a generic large language model into a module capable of generating textual guidance, and present preliminary but promising results on a medical diagnosis task.