Lessons Learned from EXMOS User Studies: A Technical Report Summarizing Key Takeaways from User Studies Conducted to Evaluate The EXMOS Platform

In interactive machine-learning systems, explanations are a vital aid for debugging and improving prediction models. However, the extent to which different global model-centric and data-centric explanations can help domain experts detect and resolve potential data-related issues for model improvement has remained largely unexplored. In this technical report, we summarise the key findings of our two user studies. We examined the impact of global explanations rooted in both data-centric and model-centric perspectives within systems designed to support healthcare experts in optimising machine learning models through both automated and manual data configurations. The two studies comprised a quantitative analysis with 70 healthcare experts and a qualitative assessment with 30 healthcare experts. They examined the influence of different explanation types on three key dimensions: trust, understandability, and model improvement. Results show that global model-centric explanations alone are insufficient for guiding users through the intricate process of data configuration. In contrast, data-centric explanations showed potential by enhancing users' understanding of the system changes that occur post-configuration. However, a combination of both was the most effective for fostering trust, improving understandability, and facilitating model improvement among healthcare experts. We also present implications for developing explanation-driven interactive machine-learning systems. These insights can guide the creation of more effective systems that empower domain experts to harness the full potential of machine learning.
