When language models answer open-ended problems, they make implicit decisions that shape their outputs, leaving users with uncontextualized answers rather than a working map of the problem. Drawing on multiverse analysis from statistics, we build and evaluate the conceptual multiverse, an interactive system that represents conceptual decisions, such as how to frame a question or what to value, as a space that users can transparently inspect, intervene on and change, and check against principled domain reasoning. For this structure to be worth navigating rather than misleading, it must be rigorous and checkable against the reasoning norms of the domain. We therefore develop a general verification framework that enforces properties of good decision structures, such as unambiguity and completeness, calibrated by expert-level reasoning. Across three domains, the conceptual multiverse helped participants develop a working map of the problem: philosophy students rewrote essays with sharper framings and reversed theses, alignment annotators moved from surface preferences to reasoning about user intent and harm, and poets identified compositional patterns that clarified their taste.