Exploring End-to-end Differentiable Neural Charged Particle Tracking -- A Loss Landscape Perspective

Measuring and analyzing high-energy particles for scientific, medical, or industrial applications is a complex procedure that requires the design of sophisticated detector and data processing systems. Adaptive, differentiable software pipelines that combine conventional and machine learning algorithms are therefore becoming increasingly important for optimizing and operating such systems efficiently while maintaining end-to-end (E2E) differentiability. For charged particle tracking, we propose an E2E differentiable decision-focused learning scheme that uses graph neural networks with combinatorial components, solving a linear assignment problem for each detector layer. We demonstrate empirically that including differentiable variants of the discrete assignment operations allows for efficient network optimization, performing better than or on par with approaches that lack E2E differentiability. In additional studies, we examine the optimization process more closely and provide further insights from a loss landscape perspective. We demonstrate that, while both methods converge into similarly performing, globally well-connected regions, they suffer from substantial predictive instability across initializations and optimization methods, which can have unpredictable consequences for the performance of downstream tasks such as image reconstruction. We also point out a dependency between the interpolation factor of the gradient estimator and the prediction stability of the model, which suggests choosing sufficiently small values. Given the strong global connectivity of the learned solutions and the excellent training performance, we argue that E2E differentiability provides, beyond the general availability of gradient information, an important tool for robust particle tracking: it can mitigate prediction instabilities by favoring solutions that perform well on downstream tasks.
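To make the phrase "interpolation factor of the gradient estimator" concrete, the following is a minimal sketch of one way a discrete linear assignment step can be given a usable gradient. The abstract does not specify the estimator, so the sketch assumes a blackbox-differentiation-style scheme: the solver is run a second time on a cost matrix perturbed by the incoming gradient, scaled by an interpolation factor `lam`, and the difference between the two solutions serves as the gradient estimate. The function names, the toy cost matrix, the Hamming-style loss, and the brute-force solver (standing in for a proper solver such as the Hungarian algorithm) are all illustrative assumptions, not the authors' implementation.

```python
from itertools import permutations


def solve_assignment(cost):
    """Brute-force minimum-cost assignment (illustrative; tiny matrices only).

    Returns a 0/1 permutation matrix as a list of lists.
    """
    n = len(cost)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    return [[1.0 if best[i] == j else 0.0 for j in range(n)] for i in range(n)]


def assignment_grad(cost, grad_output, lam):
    """Estimate dLoss/dCost from dLoss/dAssignment (assumed estimator).

    The solver is re-run on cost + lam * grad_output; the scaled difference
    between the two solutions is returned as the gradient estimate.
    """
    n = len(cost)
    y = solve_assignment(cost)
    perturbed = [
        [cost[i][j] + lam * grad_output[i][j] for j in range(n)]
        for i in range(n)
    ]
    y_lam = solve_assignment(perturbed)
    return [[-(y[i][j] - y_lam[i][j]) / lam for j in range(n)] for i in range(n)]


# Toy 2x2 example (hypothetical numbers): the solver prefers the diagonal,
# while the assumed ground truth is the anti-diagonal.
cost = [[0.9, 1.0], [1.0, 0.9]]
target = [[0.0, 1.0], [1.0, 0.0]]
y = solve_assignment(cost)

# Gradient of a Hamming-style loss sum(y*(1-t) + (1-y)*t) w.r.t. y is 1 - 2t.
grad_output = [[1.0 - 2.0 * target[i][j] for j in range(2)] for i in range(2)]

# A perturbation too small to change the solution gives a zero estimate;
# a larger one flips the assignment and gives a non-zero estimate.
grad_small = assignment_grad(cost, grad_output, lam=0.01)
grad_large = assignment_grad(cost, grad_output, lam=1.0)
```

In this toy case, a gradient descent step with `grad_large` raises the cost of the diagonal entries and lowers the cost of the anti-diagonal entries, moving the solver toward the target assignment. The example only shows where the interpolation factor enters such an estimator; it says nothing about the stability behavior reported in the abstract.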
