Relative Error Streaming Quantiles

May 28, 2021

Graham Cormode,Zohar Karnin,Edo Liberty,Justin Thaler,Pavel Veselý

From arXiv. Full version of the paper to appear in PODS 2021. 46 pages, 2 figures.

Approximating ranks, quantiles, and distributions over streaming data is a central task in data analysis and monitoring. Given a stream of $n$ items from a data universe $\mathcal{U}$ equipped with a total order, the task is to compute a sketch (data structure) of size $\mathrm{poly}(\log(n), 1/\varepsilon)$. Given the sketch and a query item $y \in \mathcal{U}$, one should be able to approximate its rank in the stream, i.e., the number of stream elements smaller than or equal to $y$. Most works to date focused on additive $\varepsilon n$ error approximation, culminating in the KLL sketch that achieved optimal asymptotic behavior. This paper investigates multiplicative $(1\pm\varepsilon)$-error approximations to the rank. Practical motivation for multiplicative error stems from demands to understand the tails of distributions, and hence for sketches to be more accurate near extreme values. The most space-efficient algorithms due to prior work store either $O(\log(\varepsilon^2 n)/\varepsilon^2)$ or $O(\log^3(\varepsilon n)/\varepsilon)$ universe items. This paper presents a randomized algorithm storing $O(\log^{1.5}(\varepsilon n)/\varepsilon)$ items, which is within an $O(\sqrt{\log(\varepsilon n)})$ factor of optimal. The algorithm does not require prior knowledge of the stream length and is fully mergeable, rendering it suitable for parallel and distributed computing environments.
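The difference between the two error guarantees named in the abstract can be made concrete with a small sketch. The code below is not the paper's algorithm: it computes exact ranks over a fully stored list and only checks whether a given estimate would satisfy an additive $\varepsilon n$ bound or a multiplicative $(1\pm\varepsilon)$ bound. The function names, the toy stream, and the sample estimate are all assumptions made for illustration.

```python
from bisect import bisect_right


def exact_rank(sorted_items, y):
    """Rank of y: the number of stream items smaller than or equal to y."""
    return bisect_right(sorted_items, y)


def within_additive(estimate, true_rank, n, eps):
    """Additive guarantee: |estimate - rank| <= eps * n."""
    return abs(estimate - true_rank) <= eps * n


def within_multiplicative(estimate, true_rank, eps):
    """Multiplicative guarantee: |estimate - rank| <= eps * rank,
    i.e. the estimate lies within a (1 +/- eps) factor of the true rank."""
    return abs(estimate - true_rank) <= eps * true_rank


# Hypothetical stream of 1000 distinct items (illustration only).
items = sorted(range(1, 1001))
n = len(items)
eps = 0.01

# Query near the low tail: the true rank of 5 is 5.
true_rank = exact_rank(items, 5)
estimate = true_rank + 5  # a made-up estimate that is off by 5

print(within_additive(estimate, true_rank, n, eps))     # allowed error is eps * n = 10
print(within_multiplicative(estimate, true_rank, eps))  # allowed error is eps * rank = 0.05
```

An estimate that is off by 5 passes the additive check because the allowed error scales with the stream length, but it fails the multiplicative check because the allowed error scales with the rank itself. This is why a multiplicative guarantee forces more accuracy where ranks are small.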

