Hybrid opto-electronic neural networks combine optical front-ends with electronic back-ends to perform vision tasks, but joint end-to-end (E2E) optimization of the optical and electronic components is computationally expensive because of the large parameter space and the repeated optical convolutions it requires. We propose Direct Kernel Optimization (DKO), a two-stage training framework that first trains a conventional electronic CNN and then synthesizes optical kernels that replicate its first-layer convolutional filters. This reduces the dimensionality of the optimization and avoids costly simulated optical convolutions during training. We evaluate DKO in simulation on a monocular depth estimation model and show that it achieves twice the accuracy of E2E training under an equal computational budget while also reducing training time. Given the substantial computational challenges of optimizing hybrid opto-electronic systems, our results position DKO as a scalable approach for training and realizing these systems.
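As a rough illustration of the second stage, the sketch below fits the parameters of a toy optical element so that the kernel it produces matches a target filter directly, without convolving any images. Everything in it is an assumption made for illustration rather than the setup evaluated here: the forward model (`toy_optical_kernel`, the intensity of the Fourier transform of a real-valued mask), the flattened 9-element kernel, the synthetic nonnegative target standing in for one trained first-layer filter, and the backtracking gradient descent.

```python
import numpy as np


def toy_optical_kernel(theta, dft):
    """Hypothetical forward model: intensity of the Fourier transform of a real mask."""
    field = dft @ theta
    return np.abs(field) ** 2, field


def kernel_loss_and_grad(theta, target, dft):
    """Squared error between the produced kernel and the target, with its gradient."""
    kernel, field = toy_optical_kernel(theta, dft)
    residual = kernel - target
    loss = float(np.sum(residual ** 2))
    # d|u_j|^2 / d(theta_m) = 2 Re(conj(u_j) F_jm), hence the factor of 4 overall.
    grad = 4.0 * np.real(dft.T @ (np.conj(field) * residual))
    return loss, grad


def fit_kernel(target, dft, theta0, steps=200):
    """Gradient descent with backtracking, so the loss never increases."""
    theta = theta0.copy()
    loss, grad = kernel_loss_and_grad(theta, target, dft)
    history = [loss]
    for _ in range(steps):
        step = 1.0
        while step > 1e-12:
            candidate = theta - step * grad
            cand_loss, cand_grad = kernel_loss_and_grad(candidate, target, dft)
            if cand_loss < loss:
                theta, loss, grad = candidate, cand_loss, cand_grad
                break
            step *= 0.5  # shrink the step until the loss decreases
        history.append(loss)
    return theta, history


rng = np.random.default_rng(0)
n = 9  # a flattened 3x3 kernel (assumed size)
dft = np.fft.fft(np.eye(n)) / np.sqrt(n)  # unitary DFT matrix

# Synthetic stand-in for one first-layer filter of an already trained CNN.
# It is generated by the same toy model, so it is reachable by construction.
target = toy_optical_kernel(rng.normal(size=n), dft)[0]

theta0 = rng.normal(size=n)
theta, history = fit_kernel(target, dft, theta0)
print(f"kernel-matching loss: {history[0]:.4f} -> {history[-1]:.6f}")
```

The loss in this sketch depends only on the kernel itself, so each iteration costs one small matrix-vector product rather than a simulated optical convolution over a batch of images. A real optical simulator and real trained filters (which can take negative values) would need a more careful forward model than this toy one.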