Functional renormalization group for signal detection and stochastic ergodicity breaking

Signal detection is one of the main challenges of data science. As often happens in data analysis, the signal in the data may be corrupted by noise. A wide range of techniques aims at extracting the relevant degrees of freedom from data, but some problems remain difficult. This is notably the case for signal detection in nearly continuous spectra when the signal-to-noise ratio is small. This paper follows a recent line of work that tackles the issue with field-theoretical methods. Previous analyses focused on equilibrium Boltzmann distributions for an effective field representing the degrees of freedom of the data, and established a relation between signal detection and $\mathbb{Z}_2$-symmetry breaking. In this paper, we consider a stochastic field framework inspired by the so-called "Model A", and show that whether or not an equilibrium state can be reached is correlated with the shape of the dataset. In particular, by studying the renormalization group of the model, we show that the weak ergodicity prescription is always broken for sufficiently small signals when the data distribution is close to the Marchenko-Pastur (MP) law. This enables the definition of a detection threshold in the regime where the signal-to-noise ratio is small.
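To make the setting concrete, the following is a minimal numerical sketch of what a nearly continuous spectrum close to the MP law looks like, and of how a weak signal can stay hidden inside it. It is not the field-theoretical method of the paper. The Gaussian noise model, the rank-one form of the signal, and the helper names `mp_edges`, `mp_density`, and `covariance_spectrum` are all assumptions made for illustration.

```python
import numpy as np


def mp_edges(ratio, sigma2=1.0):
    """Edges of the Marchenko-Pastur bulk for aspect ratio p/n <= 1."""
    lo = sigma2 * (1.0 - np.sqrt(ratio)) ** 2
    hi = sigma2 * (1.0 + np.sqrt(ratio)) ** 2
    return lo, hi


def mp_density(x, ratio, sigma2=1.0):
    """Marchenko-Pastur density evaluated at x (zero outside the bulk)."""
    lo, hi = mp_edges(ratio, sigma2)
    x = np.asarray(x, dtype=float)
    rho = np.zeros_like(x)
    inside = (x > lo) & (x < hi)
    xi = x[inside]
    rho[inside] = np.sqrt((hi - xi) * (xi - lo)) / (2.0 * np.pi * sigma2 * ratio * xi)
    return rho


def covariance_spectrum(n_samples, n_features, signal_strength=0.0, seed=0):
    """Eigenvalues of the empirical covariance of noise plus a rank-one signal.

    Assumed toy model: i.i.d. unit-variance Gaussian noise, plus one random
    direction whose population variance is raised by `signal_strength`.
    """
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n_samples, n_features))
    if signal_strength > 0.0:
        direction = rng.standard_normal(n_features)
        direction /= np.linalg.norm(direction)
        amplitudes = rng.standard_normal(n_samples)
        data = data + np.sqrt(signal_strength) * np.outer(amplitudes, direction)
    covariance = data.T @ data / n_samples
    return np.linalg.eigvalsh(covariance)


if __name__ == "__main__":
    n_samples, n_features = 2000, 500
    lo, hi = mp_edges(n_features / n_samples)
    print("MP bulk edges:", lo, hi)
    for strength in (0.0, 0.1, 5.0):
        eigenvalues = covariance_spectrum(n_samples, n_features, strength)
        print("signal strength", strength, "largest eigenvalue", eigenvalues.max())
```

In this toy model a strong signal pushes one eigenvalue well outside the MP bulk, where it is easy to spot, while a weak one leaves the largest eigenvalue near the bulk edge. The latter case is the low signal-to-noise regime that the abstract describes as difficult.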

