In computer vision, it has long been taken for granted that high-quality images captured through well-designed camera lenses lead to superior results. However, we find that this assumption is not a "one-size-fits-all" rule for diverse computer vision tasks. We demonstrate that task-driven, deep-learned simple optics can actually deliver better visual task performance. The task-driven lens design approach, which relies solely on a well-trained network model for supervision, is shown to be capable of designing lenses from scratch. Experimental results demonstrate that the designed image classification lens (``TaskLens'') achieves higher accuracy than conventional imaging-driven lenses, even with fewer lens elements. Furthermore, we show that TaskLens is compatible with various network models while maintaining its enhanced classification accuracy. We propose that TaskLens holds significant potential, particularly when physical dimensions and cost are severely constrained.