The advancement of deep learning in object detection has predominantly focused on megapixel images, leaving a critical gap in the efficient processing of gigapixel images. These super high-resolution images present unique challenges due to their immense size and computational demands. To address this, we introduce 'SaccadeDet', an innovative architecture for gigapixel-level object detection, inspired by the human eye saccadic movement. The cornerstone of SaccadeDet is its ability to strategically select and process image regions, dramatically reducing computational load. This is achieved through a two-stage process: the 'saccade' stage, which identifies regions of probable interest, and the 'gaze' stage, which refines detection in these targeted areas. Our approach, evaluated on the PANDA dataset, not only achieves an 8x speed increase over the state-of-the-art methods but also demonstrates significant potential in gigapixel-level pathology analysis through its application to Whole Slide Imaging.
翻译:深度学习在目标检测领域的进展主要集中于百万像素图像,对于千兆像素图像的高效处理仍存在关键空白。这些超高分辨率图像因其巨大尺寸和计算需求带来独特挑战。为此,我们受人类眼球扫视运动启发,提出"SaccadeDet"——一种用于千兆像素级目标检测的创新架构。该架构的核心在于能够策略性地选择并处理图像区域,从而显著降低计算负荷。这通过两阶段流程实现:"扫视"阶段识别潜在关注区域,"凝视"阶段在这些目标区域进行精细化检测。我们在PANDA数据集上的评估表明,该方法不仅较现有最优方法实现8倍速度提升,且通过在全切片成像中的应用,展现出在千兆像素级病理分析领域的巨大潜力。