Long-term monitoring and exploration of extreme environments, such as underwater storage facilities, are costly, labor-intensive, and hazardous. Automating this process with low-cost, collaborative robots can greatly improve efficiency. These robots capture images from different positions, which must be processed jointly to create a spatio-temporal model of the facility. In this paper, we propose a novel approach that integrates data simulation, a multi-modal deep learning network for coordinate prediction, and image reassembly to address the challenges posed by environmental disturbances that cause drift and rotation in the robots' positions and orientations. Our approach enhances alignment precision in noisy environments by fusing visual information from snapshots, global positional context from masks, and noisy coordinates. We validate our method through extensive experiments on synthetic data that simulate real-world robotic operations in underwater settings. The results demonstrate high coordinate prediction accuracy and plausible image assembly, indicating the real-world applicability of our approach. The assembled images provide clear and coherent views of the underwater environment for effective monitoring and inspection, showcasing the potential for broader use in extreme settings and contributing to improved safety, efficiency, and cost reduction in hazardous field monitoring. Code is available at https://github.com/ChrisChen1023/Micro-Robot-Swarm.
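To make the multi-modal fusion concrete, the sketch below shows one plausible way to combine the three inputs named above (a visual snapshot, a global-position mask, and a noisy coordinate pair) into a single coordinate regression. This is a minimal illustration, not the authors' actual network: the input sizes, hidden width, single-layer encoders, and randomly initialised weights are all hypothetical stand-ins for a trained deep model.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w, b):
    return x @ w + b

# Hypothetical dimensions; the real network and its trained weights differ.
PATCH = 16 * 16   # flattened grayscale snapshot
MASK = 8 * 8      # flattened global-position mask
HID = 32          # per-modality embedding width

# Random weights stand in for trained parameters.
W_img = rng.normal(0, 0.1, (PATCH, HID)); b_img = np.zeros(HID)
W_msk = rng.normal(0, 0.1, (MASK, HID));  b_msk = np.zeros(HID)
W_xy  = rng.normal(0, 0.1, (2, HID));     b_xy  = np.zeros(HID)
W_out = rng.normal(0, 0.1, (3 * HID, 2)); b_out = np.zeros(2)

def predict_coords(snapshot, mask, noisy_xy):
    """Encode each modality, concatenate, and regress a denoised (x, y)."""
    h = np.concatenate([
        np.tanh(linear(snapshot.ravel(), W_img, b_img)),  # visual cue
        np.tanh(linear(mask.ravel(), W_msk, b_msk)),      # global context
        np.tanh(linear(noisy_xy, W_xy, b_xy)),            # noisy coordinate
    ])
    return linear(h, W_out, b_out)

xy = predict_coords(rng.random((16, 16)), rng.random((8, 8)),
                    np.array([0.4, 0.7]))
print(xy.shape)  # a 2-vector: the predicted (x, y)
```

The design choice illustrated here is late fusion: each modality gets its own encoder so that, for example, a heavily corrupted coordinate input cannot swamp the visual evidence before the modalities are combined.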