LLM-generated interfaces are increasingly used in high-consequence workflows (e.g., healthcare communication), where how information is presented can affect downstream actions. These interfaces and their content support human interaction with AI-assisted decision-making and communication processes and should remain accessible and usable for people with disabilities. Accessible plain-language interfaces serve as an enabling infrastructure for meaningful human oversight. In these contexts, ethical and trustworthiness risks, including hallucinations, semantic distortion, bias, and accessibility barriers, can undermine reliability and limit users' ability to understand, monitor, and intervene in AI-supported processes. Yet, in practice, oversight is often treated as a downstream check, without clear rules for when human intervention is required or who is accountable. We propose oversight-by-design: embedding human judgment across the pipeline as an architectural commitment, implemented via escalation policies and explicit UI controls for risk signalling and intervention. Automated checks flag risk in generated UI communication that supports high-stakes workflows (e.g., readability, semantic fidelity, factual consistency, and standards-based accessibility constraints) and escalate to mandatory Human-in-the-Loop (HITL) review before release when thresholds are violated or uncertainty is high. Human-on-the-Loop (HOTL) supervision monitors system-level signals over time (alerts, escalation rates, and compliance evidence) to tune policies and detect drift. Structured review feedback is translated into governance actions (rule and prompt updates, threshold calibration, and traceable audit logs), enabling scalable intervention and verifiable oversight for generative UI systems that support high-stakes workflows.
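The escalation logic described above can be sketched minimally as a threshold-based policy. All names, score ranges, and threshold values below are illustrative assumptions, not the paper's implementation; they only show how per-output checks (readability, semantic fidelity, factual consistency, accessibility) could trigger mandatory HITL review, with thresholds left as tunable policy parameters for HOTL supervision.

```python
from dataclasses import dataclass

@dataclass
class CheckResult:
    """Hypothetical per-output check scores (names are illustrative)."""
    readability: float             # 0..1, higher = easier to read
    semantic_fidelity: float       # 0..1, agreement with source content
    factual_consistency: float     # 0..1, consistency with source facts
    accessibility_violations: int  # count of failed standards-based checks
    uncertainty: float             # 0..1, model-reported uncertainty

# Policy parameters: assumed values, tuned over time via HOTL
# supervision of escalation rates and drift signals.
THRESHOLDS = {
    "readability": 0.6,
    "semantic_fidelity": 0.8,
    "factual_consistency": 0.9,
    "max_accessibility_violations": 0,
    "max_uncertainty": 0.3,
}

def escalation_decision(r: CheckResult) -> tuple[str, list[str]]:
    """Return ('release' | 'hitl_review', reasons for the audit log)."""
    reasons = []
    if r.readability < THRESHOLDS["readability"]:
        reasons.append("readability below threshold")
    if r.semantic_fidelity < THRESHOLDS["semantic_fidelity"]:
        reasons.append("semantic fidelity below threshold")
    if r.factual_consistency < THRESHOLDS["factual_consistency"]:
        reasons.append("factual consistency below threshold")
    if r.accessibility_violations > THRESHOLDS["max_accessibility_violations"]:
        reasons.append("accessibility violations present")
    if r.uncertainty > THRESHOLDS["max_uncertainty"]:
        reasons.append("uncertainty above threshold")
    # Any violated threshold makes HITL review mandatory before release.
    return ("hitl_review" if reasons else "release"), reasons
```

The returned reasons list doubles as a traceable audit-log entry, so each escalation is attributable to a specific violated constraint rather than an opaque score.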