Eliciting Model Steering Interactions from Users via Data and Visual Design Probes

Domain experts increasingly use automated data science tools to incorporate machine learning (ML) models into their work but struggle to "debug" these models when they are incorrect. For these experts, semantic interactions can provide an accessible avenue to guide and refine ML models without having to dive programmatically into their technical details. In this research, we conduct an elicitation study using data and visual design probes to examine if and how experts with a spectrum of ML expertise use semantic interactions to update a simple classification model. We use our design probes to facilitate an interactive dialogue with 20 participants and codify their interactions as a set of target-interaction pairs. Interestingly, our findings reveal that many targets of semantic interactions do not map directly to ML model parameters, but instead aim to augment the data a model uses for training. We also identify reasons that participants would hesitate to interact with ML models, including the burden of cognitive load and concerns about injecting bias. Unexpectedly, participants also saw value in using semantic interactions to work collaboratively with members of their team. Participants with less ML expertise found this to be a useful mechanism for communicating their concerns to ML experts. This was an especially important observation, as our study also shows that needs differ across levels of ML expertise. Collectively, we demonstrate that design probes are effective tools for proactively gathering the affordances that should be offered in an interactive machine learning system.
