Explainable Artificial Intelligence (XAI) seeks to enhance the transparency and accountability of machine learning systems, yet most methods follow a one-size-fits-all paradigm that neglects user differences in expertise, goals, and cognitive needs. Although Large Language Models can translate technical explanations into natural language, they introduce challenges related to faithfulness and hallucinations. To address these challenges, we present PONTE (Personalized Orchestration for Natural language Trustworthy Explanations), a human-in-the-loop framework for adaptive and reliable XAI narratives. PONTE models personalization as a closed-loop validation and adaptation process rather than prompt engineering. It combines: (i) a low-dimensional preference model capturing stylistic requirements; (ii) a preference-conditioned generator grounded in structured XAI artifacts; and (iii) verification modules enforcing numerical faithfulness, informational completeness, and stylistic alignment, optionally supported by retrieval-grounded argumentation. User feedback iteratively updates the preference state, enabling quick personalization. Automatic and human evaluations across healthcare and finance domains show that the verification-refinement loop substantially improves completeness and stylistic alignment over validation-free generation. Human studies further confirm strong agreement between intended preference vectors and perceived style, robustness to generation stochasticity, and consistently positive quality assessments.
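The closed-loop design described above can be sketched in code. This is a minimal illustrative sketch, not the paper's implementation: all names (`generate_narrative`, `faithful`, `complete`, `explain`, `update_preference`) and the specific preference dimensions (`technicality`, `verbosity`) are assumptions chosen for illustration, and the generator is a template stub standing in for the preference-conditioned LLM.

```python
import re

# Hypothetical sketch of a PONTE-style verification-refinement loop.
# An "artifact" here is a dict of feature attributions (e.g. from SHAP);
# a preference state is a low-dimensional dict of values in [0, 1].

def generate_narrative(artifact, pref):
    """Render feature attributions as text, conditioned on a preference state
    (template stub standing in for the preference-conditioned generator)."""
    top = sorted(artifact.items(), key=lambda kv: abs(kv[1]), reverse=True)
    k = 3 if pref["verbosity"] < 0.5 else len(top)
    parts = []
    for name, weight in top[:k]:
        if pref["technicality"] >= 0.5:
            parts.append(f"{name} contributed {weight:+.2f}")
        else:
            direction = "raised" if weight > 0 else "lowered"
            parts.append(f"{name} {direction} the prediction")
    return "; ".join(parts) + "."

def faithful(narrative, artifact):
    """Numerical faithfulness: every value cited in the text must
    actually appear among the artifact's attribution values."""
    cited = {float(x) for x in re.findall(r"[+-]\d+\.\d+", narrative)}
    truth = {round(v, 2) for v in artifact.values()}
    return cited <= truth

def complete(narrative, artifact, k=3):
    """Informational completeness: the top-k features must be mentioned."""
    top = sorted(artifact, key=lambda f: abs(artifact[f]), reverse=True)[:k]
    return all(f in narrative for f in top)

def explain(artifact, pref, max_rounds=3):
    """Closed loop: generate, verify, refine until the checks pass."""
    text = generate_narrative(artifact, pref)
    for _ in range(max_rounds):
        if faithful(text, artifact) and complete(text, artifact):
            return text
        # Crude refinement step: ask for a fuller narrative and retry.
        pref = {**pref, "verbosity": min(1.0, pref["verbosity"] + 0.5)}
        text = generate_narrative(artifact, pref)
    return text

def update_preference(pref, feedback, rate=0.5):
    """Iteratively move the preference state toward user feedback,
    clipping each dimension to [0, 1]."""
    return {d: min(1.0, max(0.0, pref[d] + rate * feedback.get(d, 0.0)))
            for d in pref}
```

In this toy version the verifiers are simple string and set checks; the paper's modules additionally enforce stylistic alignment and can draw on retrieval-grounded argumentation, which a sketch this small does not capture.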