Inferring flavor mixtures in multijet events

arXiv comment: 23 pages, 13 figures. Revised in response to referee reports at SciPost (https://scipost.org/submission/2404.01387v2/). The reasoning and conclusions are unchanged, but a new section comparing the new method with the usual methods, including a ROC curve comparison, has been added. A more realistic case with 1% signal is also shown.

Multijet events with heavy flavors are of central importance at the LHC, since many relevant processes (such as $t\bar t$, $hh$, $t\bar t h$ and others) decay preferentially to this final state. Current techniques for tackling these processes use hard-assignment selections through $b$-tagging working points, and suffer from systematic uncertainties because of the difficulties in Monte Carlo simulations. We develop a flexible Bayesian mixture model approach to simultaneously infer the $b$-tagging score distributions and the flavor mixture composition of the dataset. We model multidimensional jet events and, to enhance estimation efficiency, design structured priors that leverage the continuity and unimodality of the $b$-tagging score distributions. Remarkably, our method eliminates the need for a parametric assumption and is robust against model misspecification: it works for arbitrarily flexible continuous curves and performs better when they are unimodal. We have run a toy inferential process with signal $bbbb$ and backgrounds $bbcc$ and $cccc$, and we find that with a few hundred events we can recover the true mixture fractions of the signal and backgrounds, as well as the true $b$-tagging score distribution curves, despite their arbitrary, nonparametric shapes. We discuss prospects for taking these findings into a realistic scenario in a physics analysis. The presented results could be a starting point for a novel kind of analysis in multijet events, with a scope competitive with current state-of-the-art analyses. We also discuss the possibility of using these results in general cases of signals and backgrounds with approximately known continuous distributions and/or expected unimodality.
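To make the mixture structure described in the abstract concrete, here is a minimal sketch of a much simpler problem than the one the paper solves. It assumes the per-flavor $b$-tagging score shapes are already known and estimates only the fractions of $bbbb$, $bbcc$ and $cccc$ events by maximum-likelihood EM. It is not the Bayesian model of the paper, which also infers the score distributions under continuity and unimodality priors. The Beta shapes, the sample size, the true fractions and all function names are hypothetical choices made for illustration.

```python
import math
from itertools import combinations

import numpy as np

# Hypothetical per-flavor score shapes (assumptions, not taken from the paper).
B_SHAPE = (5.0, 2.0)  # b-jets: score peaks near 1
C_SHAPE = (2.0, 5.0)  # c-jets: score peaks near 0

# Number of b-jets among the four jets of each event class.
CLASSES = {"bbbb": 4, "bbcc": 2, "cccc": 0}


def beta_pdf(x, a, b):
    """Density of Beta(a, b), used as a stand-in for a tagging-score curve."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def simulate(n_events, fractions, rng):
    """Draw four-jet events; the fit below does not use the jet ordering."""
    n_b_per_class = list(CLASSES.values())
    labels = rng.choice(len(n_b_per_class), size=n_events, p=fractions)
    scores = np.empty((n_events, 4))
    for i, k in enumerate(labels):
        n_b = n_b_per_class[k]
        scores[i, :n_b] = rng.beta(*B_SHAPE, size=n_b)
        scores[i, n_b:] = rng.beta(*C_SHAPE, size=4 - n_b)
    return scores


def class_likelihoods(scores):
    """Per-event likelihood of each class, averaged over which jets are b."""
    fb = beta_pdf(scores, *B_SHAPE)
    fc = beta_pdf(scores, *C_SHAPE)
    out = np.empty((scores.shape[0], len(CLASSES)))
    for k, n_b in enumerate(CLASSES.values()):
        subsets = list(combinations(range(4), n_b))
        total = np.zeros(scores.shape[0])
        for subset in subsets:
            is_b = np.array([j in subset for j in range(4)])
            total += np.prod(np.where(is_b, fb, fc), axis=1)
        out[:, k] = total / len(subsets)
    return out


def fit_fractions(scores, n_iter=200):
    """EM updates for the mixture fractions with the score shapes held fixed."""
    like = class_likelihoods(scores)
    pi = np.full(like.shape[1], 1.0 / like.shape[1])
    for _ in range(n_iter):
        resp = pi * like                          # E-step: unnormalized
        resp /= resp.sum(axis=1, keepdims=True)   # responsibilities per event
        pi = resp.mean(axis=0)                    # M-step: updated fractions
    return pi


rng = np.random.default_rng(0)
true_fractions = np.array([0.2, 0.3, 0.5])
scores = simulate(2000, true_fractions, rng)
estimated = fit_fractions(scores)
print(dict(zip(CLASSES, estimated.round(3))))
```

Averaging the likelihood over jet assignments keeps the fit independent of jet ordering, which matters because the flavor of each individual jet is not observed. Replacing the fixed Beta curves with flexible curves that are inferred jointly with the fractions is the part this sketch deliberately leaves out.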

