Understanding mobile user interfaces is essential for building intelligent systems such as automation tools, accessibility solutions, and UI-aware agents. However, progress in this area remains limited by the lack of high-quality datasets that reflect real-world mobile applications and carry reliable annotations. In this work, we introduce MUIAnno, a publicly available, expert-annotated dataset for mobile UI understanding, collected from a diverse set of applications across multiple categories on the iTunes platform. Each app was manually explored to capture representative UI screens, yielding a collection that covers a wide range of layouts and design patterns found in practice. To ensure annotation quality, we developed a custom web-based tool that allows UI/UX experts to label interface elements through a simple drag-and-drop process and export structured annotations in JSON format. MUIAnno provides detailed annotations of common UI components, including buttons, input fields, and navigation elements. Alongside the dataset, we report benchmark experiments for UI element detection with baseline results, offering a starting point for future research. We believe MUIAnno can support further work in mobile UI understanding and help improve systems that rely on accurate interpretation of interface elements.
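As an illustration of how the structured JSON annotations might be consumed, the following is a minimal sketch that counts annotated elements per component type for a single screen. The abstract does not specify the MUIAnno schema, so the field names (`screen`, `elements`, `label`, `bbox`), the sample record, and the helper `count_labels` are all assumptions made for this example.

```python
import json
from collections import Counter

# Hypothetical annotation record for one screen. The field names
# ("screen", "elements", "label", "bbox") and all values are assumptions
# for illustration; they are not the actual MUIAnno schema.
SAMPLE = """
{
  "screen": "example_screen.png",
  "elements": [
    {"label": "button", "bbox": [24, 600, 200, 48]},
    {"label": "input_field", "bbox": [24, 520, 327, 44]},
    {"label": "button", "bbox": [240, 600, 111, 48]}
  ]
}
"""


def count_labels(annotation_json: str) -> Counter:
    """Count annotated elements per label in one screen's JSON record."""
    record = json.loads(annotation_json)
    return Counter(element["label"] for element in record["elements"])


if __name__ == "__main__":
    # Print one line per component type found in the sample record.
    for label, count in sorted(count_labels(SAMPLE).items()):
        print(f"{label}: {count}")
```

A per-class count of this kind is a common first check on a detection dataset, since class imbalance affects how baseline results should be read.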