Model-independent techniques for constructing background data templates using generative models have shown great promise for use in searches for new physics processes at the LHC. We introduce CURTAINsF4F, a major improvement to the CURTAINs method in which the conditional normalizing flow between two side-band regions is trained using maximum likelihood estimation instead of an optimal transport loss. The new training objective improves the robustness and fidelity of the transformed data and makes training much faster and easier. We compare the performance against the previous approach and the current state of the art using the LHC Olympics anomaly detection dataset, where we see a significant improvement in sensitivity over the original CURTAINs method. Furthermore, CURTAINsF4F requires substantially fewer computational resources to cover a large number of signal regions than other fully data-driven approaches. When using an efficient configuration, an order of magnitude more models can be trained in the time otherwise required for ten signal regions, without a significant drop in performance.