In the last ten years, various automated machine learning (AutoML) systems have been proposed to build end-to-end machine learning (ML) pipelines with minimal human interaction. Even though such automatically synthesized ML pipelines can achieve competitive performance, recent studies have shown that users do not trust models constructed by AutoML, because AutoML systems lack transparency and the constructed ML pipelines come without explanations. In a requirements analysis study with 36 domain experts, data scientists, and AutoML researchers from different professions and with vastly different expertise in ML, we collect detailed informational needs for AutoML. We propose XAutoML, an interactive visual analytics tool for explaining arbitrary AutoML optimization procedures and the ML pipelines constructed by AutoML. XAutoML combines interactive visualizations with established techniques from explainable artificial intelligence (XAI) to make the complete AutoML procedure transparent and explainable. Because XAutoML is integrated with JupyterLab, experienced users can extend the visual analytics with ad-hoc visualizations based on information extracted from XAutoML. We validate our approach in a user study with the same diverse user group from the requirements analysis. All participants were able to extract useful information from XAutoML, leading to a significantly increased understanding of both the ML pipelines produced by AutoML and the AutoML optimization itself.