With the rapid development of industrial intelligence and unmanned inspection, reliable perception and safety assessment for AI systems at complex, dynamic industrial sites have become a key bottleneck in deploying predictive maintenance and autonomous inspection. Most public datasets are limited by simulated data sources, single-modality sensing, or the absence of fine-grained object-level annotations, which prevents robust scene understanding and multimodal safety reasoning in industrial foundation models. To address these limitations, InspecSafe-V1 is released as the first multimodal benchmark dataset for industrial inspection safety assessment collected from the routine operations of real inspection robots in real-world environments. InspecSafe-V1 covers five representative industrial scenarios: tunnels, power facilities, sintering equipment, oil and gas petrochemical plants, and coal conveyor trestles. The dataset is constructed from 41 wheeled and rail-mounted inspection robots operating at 2,239 valid inspection sites, yielding 5,013 inspection instances. For each instance, pixel-level segmentation annotations are provided for key objects in the visible-spectrum images, together with a semantic scene description and a corresponding safety level label defined according to practical inspection tasks. Seven synchronized sensing modalities are also provided (infrared video, audio, depth point clouds, radar point clouds, gas measurements, temperature, and humidity) to support multimodal anomaly recognition, cross-modal fusion, and comprehensive safety assessment in industrial environments.
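The per-instance contents described in the abstract can be pictured as a single record. The following is a minimal sketch only: the abstract does not specify a file layout, field names, or the set of safety levels, so every identifier here (`InspectionInstance`, `modality_paths`, the scenario and modality spellings, the example paths) is a hypothetical choice made for illustration, not the dataset's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The five scenarios and seven additional modalities are named in the abstract;
# the identifier spellings below are hypothetical.
SCENARIOS = (
    "tunnel",
    "power_facility",
    "sintering_equipment",
    "oil_gas_petrochemical_plant",
    "coal_conveyor_trestle",
)
MODALITIES = (
    "infrared_video",
    "audio",
    "depth_point_cloud",
    "radar_point_cloud",
    "gas",
    "temperature",
    "humidity",
)


@dataclass
class InspectionInstance:
    """One inspection instance (hypothetical layout, not the official schema)."""

    instance_id: str
    scenario: str
    rgb_image_path: str          # visible-spectrum image
    segmentation_mask_path: str  # pixel-level annotation of key objects
    scene_description: str       # semantic scene description
    safety_level: str            # label set is not given in the abstract
    modality_paths: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject scenario or modality names outside the lists above.
        if self.scenario not in SCENARIOS:
            raise ValueError(f"unknown scenario: {self.scenario!r}")
        unknown = sorted(set(self.modality_paths) - set(MODALITIES))
        if unknown:
            raise ValueError(f"unknown modalities: {unknown}")

    def missing_modalities(self) -> List[str]:
        """Return the modalities with no recorded path, in MODALITIES order."""
        return [m for m in MODALITIES if m not in self.modality_paths]


# Example with made-up values and paths.
example = InspectionInstance(
    instance_id="demo-0001",
    scenario="tunnel",
    rgb_image_path="demo/rgb.png",
    segmentation_mask_path="demo/mask.png",
    scene_description="Placeholder description of the scene.",
    safety_level="placeholder",
    modality_paths={"infrared_video": "demo/ir.mp4", "audio": "demo/audio.wav"},
)
```

Keeping the extra modalities in a dictionary rather than as seven fixed fields lets a loader report gaps through `missing_modalities()` instead of failing, which may be convenient if some sensors are absent for a given instance; whether that ever happens in InspecSafe-V1 is not stated in the abstract.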