Universal Outlier Hypothesis Testing via Mean- and Median-Based Tests



Bernhard C. Geiger, Tobias Koch, Josipa Mihaljević, Maximilian Toller

From arXiv. 8 pages, 3 figures; accepted for publication at the International Zurich Seminar on Information and Communication.

Universal outlier hypothesis testing refers to a hypothesis testing problem in which one observes a large number of length-$n$ sequences, the majority of which are distributed according to the typical distribution $\pi$ while a small number are distributed according to the outlier distribution $\mu$, and one wishes to decide which of these sequences are outliers without knowledge of $\pi$ and $\mu$. In contrast to previous works, this paper assumes that both the number of observation sequences and the number of outlier sequences grow with the sequence length. In this case, the typical distribution $\pi$ can be estimated by computing the mean over all observation sequences, provided that the number of outlier sequences is sublinear in the total number of sequences. It is demonstrated that this mean-based test achieves the error exponent of the maximum likelihood test that has access to both $\pi$ and $\mu$. However, the mean-based test performs poorly when the number of outlier sequences is proportional to the total number of sequences. For this case, a median-based test is proposed that estimates $\pi$ as the median of all observation sequences. It is demonstrated that the median-based test again achieves the error exponent of the maximum likelihood test that has access to both $\pi$ and $\mu$, but only with probability approaching one. To formalize this case, the typical error exponent, similar to the typical random coding exponent introduced in the context of random coding for channel coding, is proposed.
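The contrast between the mean-based and median-based estimates of the typical distribution can be illustrated with a small simulation. The sketch below is not the authors' test. It assumes a finite alphabet, reads "mean" and "median over all observation sequences" as the coordinate-wise mean and median of the per-sequence empirical distributions, and scores each sequence by its KL divergence from the estimate. The function names, the distributions, the threshold of 0.5, and all other parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def empirical_types(seqs, alphabet_size):
    """Row i holds the empirical distribution (type) of sequence i."""
    counts = np.stack([np.bincount(s, minlength=alphabet_size) for s in seqs])
    return counts / counts.sum(axis=1, keepdims=True)

def estimate_pi(types, how="mean"):
    """Estimate the typical distribution from all sequence types.

    Assumption: "mean" and "median" are taken coordinate-wise over the
    types; the median is renormalized so that it sums to one.
    """
    if how == "mean":
        return types.mean(axis=0)
    med = np.median(types, axis=0)
    return med / med.sum()

def kl_divergence(p, q, eps=1e-12):
    """D(p || q) in nats, with q clipped away from zero."""
    q = np.clip(q, eps, None)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def flag_outliers(types, pi_hat, threshold):
    """Indices whose type is farther than `threshold` from pi_hat.

    The fixed threshold is a hypothetical decision rule, not the paper's.
    """
    scores = np.array([kl_divergence(t, pi_hat) for t in types])
    return [int(i) for i in np.flatnonzero(scores > threshold)]

def simulate(M=200, n=500, num_outliers=60, seed=0):
    """Draw M sequences of length n; the first num_outliers follow mu."""
    rng = np.random.default_rng(seed)
    pi = np.array([0.7, 0.2, 0.1])  # hypothetical typical distribution
    mu = np.array([0.1, 0.2, 0.7])  # hypothetical outlier distribution
    seqs = [rng.choice(3, size=n, p=(mu if i < num_outliers else pi))
            for i in range(M)]
    return empirical_types(seqs, 3), pi

if __name__ == "__main__":
    types, pi = simulate()
    for how in ("mean", "median"):
        pi_hat = estimate_pi(types, how)
        l1_error = np.abs(pi_hat - pi).sum()
        flagged = flag_outliers(types, pi_hat, threshold=0.5)
        print(f"{how:>6}: L1 error {l1_error:.3f}, flagged {len(flagged)}")
```

With 30% of the sequences drawn from the outlier distribution, the mean of the types is pulled toward a mixture of the two distributions, whereas the coordinate-wise median stays close to the typical distribution because the typical sequences are still the majority. This mirrors the regime the abstract describes, in which the number of outliers is proportional to the total number of sequences.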

