Machine learning (ML) sensors offer a new paradigm for sensing that enables intelligence at the edge while giving end-users greater control over their data. Because these ML sensors play a crucial role in the development of intelligent devices, clear documentation of their specifications, functionalities, and limitations is pivotal. This paper introduces a standard datasheet template for ML sensors and discusses its essential components, including the system's hardware, ML model and dataset attributes, end-to-end performance metrics, and environmental impact. We provide an example datasheet for our own ML sensor and discuss each section in detail. We highlight how these datasheets can facilitate better understanding and utilization of sensor data in ML applications, and we provide objective measures by which system performance can be evaluated and compared. Together, ML sensors and their datasheets provide greater privacy, security, transparency, explainability, auditability, and user-friendliness for ML-enabled embedded systems. We conclude by emphasizing the need for standardization of datasheets across the broader ML community to ensure the responsible and effective use of sensor data.
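As a rough illustration of the template's four components, the sketch below models a datasheet as a small Python record. The class name `MLSensorDatasheet`, the helper `missing_sections`, and the placeholder entry are hypothetical and are not taken from the paper; only the four section names follow the components listed in the abstract.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class MLSensorDatasheet:
    """Hypothetical container: one free-form mapping per datasheet component.

    The section names mirror the abstract; the keys stored inside each
    mapping are illustrative assumptions, not a published schema.
    """
    hardware: Dict[str, str] = field(default_factory=dict)
    model_and_dataset: Dict[str, str] = field(default_factory=dict)
    end_to_end_performance: Dict[str, str] = field(default_factory=dict)
    environmental_impact: Dict[str, str] = field(default_factory=dict)


def missing_sections(sheet: MLSensorDatasheet) -> List[str]:
    """Return the names of sections that have no entries yet."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name)]


# Usage: a partially filled datasheet with a placeholder hardware entry.
sheet = MLSensorDatasheet(hardware={"sensor": "placeholder description"})
print(missing_sections(sheet))
```

A completeness check of this kind is one possible way to flag a datasheet that omits a component; the actual template may organize or validate its sections differently.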