Transfer learning-assisted inverse modeling in nanophotonics based on mixture density networks

The simulation of nanophotonic structures relies on electromagnetic solvers, which play a crucial role in understanding their behavior. However, these solvers often come with a significant computational cost, making their application in design tasks, such as optimization, impractical. To address this challenge, machine learning techniques have been explored for accurate and efficient modeling and design of photonic devices. Deep neural networks, in particular, have gained considerable attention in this field. They can be used to create both forward and inverse models. An inverse modeling approach avoids the need to couple a forward model with an optimizer and directly predicts the optimal design parameter values. In this paper, we propose an inverse modeling method for nanophotonic structures, based on a mixture density network model enhanced by transfer learning. Mixture density networks can predict multiple possible solutions at once, expressed as Gaussian distributions together with their respective importance. However, mixture density network models pose several challenges. One is that an upper bound on the number of possible simultaneous solutions needs to be specified in advance. Another is that the model parameters must be jointly optimized, which can be computationally expensive. Moreover, optimizing all parameters simultaneously can be numerically unstable and can lead to degenerate predictions. The proposed approach overcomes these limitations using transfer learning-based techniques, while preserving high accuracy in predicting the design solutions given an optical response as input. A dimensionality reduction step is also explored. Numerical results validate the proposed method.
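To make the mixture density idea concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes a single scalar design parameter and a network head that outputs, for a fixed number K of components, unnormalized weights (logits), means, and log standard deviations. All function and variable names are hypothetical and chosen for illustration only.

```python
import math


def log_softmax(logits):
    """Turn unnormalized scores into log mixture weights (stable log-sum-exp)."""
    top = max(logits)
    lse = top + math.log(sum(math.exp(v - top) for v in logits))
    return [v - lse for v in logits]


def gaussian_log_pdf(x, mean, sigma):
    """Log density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def mdn_log_likelihood(x, logits, means, log_sigmas):
    """Log likelihood of design value x under a K-component Gaussian mixture.

    The three lists stand in for the outputs of a (hypothetical) network head
    evaluated on one optical response; K is fixed in advance by their length.
    """
    log_weights = log_softmax(logits)
    terms = [
        lw + gaussian_log_pdf(x, m, math.exp(ls))
        for lw, m, ls in zip(log_weights, means, log_sigmas)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def ranked_solutions(logits, means, log_sigmas):
    """Candidate solutions as (weight, mean, sigma), most important first."""
    weights = [math.exp(lw) for lw in log_softmax(logits)]
    sigmas = [math.exp(ls) for ls in log_sigmas]
    return sorted(zip(weights, means, sigmas), reverse=True)


# Assumed toy case: two equally weighted candidate designs at -1.0 and +1.0.
logits = [0.0, 0.0]
means = [-1.0, 1.0]
log_sigmas = [math.log(0.1), math.log(0.1)]

# Training would minimize the negative of this quantity over a data set.
loss_at_solution = -mdn_log_likelihood(1.0, logits, means, log_sigmas)
loss_between = -mdn_log_likelihood(0.0, logits, means, log_sigmas)
```

In this toy case the loss is low at either candidate design and high halfway between them, which is the behavior that lets a mixture density model represent several valid designs for one optical response instead of averaging them. The length of the three lists plays the role of the upper bound on simultaneous solutions mentioned above, and because weights, means, and standard deviations all enter one likelihood, they are optimized jointly.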
