Flock: Accurate network fault localization at scale - 专知论文


Networking · Model evaluation · Inference · Probabilistic graphical models · Scaling

May 5, 2023

Flock: Accurate network fault localization at scale


Vipul Harsh, Tong Meng, Kapil Agrawal, P. Brighten Godfrey

From arXiv. To appear in ACM PACMNET, Vol. 1, June 2023.

Inferring the root cause of failures among thousands of components in a data center network is challenging, especially for "gray" failures that are not reported directly by switches. Faults can be localized through end-to-end measurements, but past localization schemes are either too slow for large-scale networks or sacrifice accuracy. We describe Flock, a network fault localization algorithm and system that achieves both high accuracy and speed at datacenter scale. Flock uses a probabilistic graphical model (PGM) to achieve high accuracy, coupled with new techniques to dramatically accelerate inference in discrete-valued Bayesian PGMs. Large-scale simulations and experiments in a hardware testbed show that Flock speeds up inference by >10,000x compared to past PGM methods, and improves accuracy over the best previous datacenter fault localization approaches, reducing inference error by 1.19-11x on the same input telemetry, and by 1.2-55x after incorporating passive telemetry. We also prove that Flock's inference is optimal in restricted settings.
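To make the idea of PGM-based localization from end-to-end measurements concrete, here is a minimal toy sketch. It is not Flock's algorithm. It assumes a simple model in which each link independently fails with a small prior probability and a flow drops packets at an elevated rate if its path crosses any failed link. The function names, the parameters `p_good`, `p_bad`, `prior_fail`, and `max_failures`, and the sample topology are all hypothetical. The sketch finds the most probable failure hypothesis by brute-force enumeration, which is the kind of exhaustive inference that becomes too slow at scale.

```python
import itertools
import math


def log_likelihood(hypothesis, flows, p_good=0.001, p_bad=0.05):
    """Log-likelihood of the observed drops if exactly `hypothesis` links are faulty.

    p_good / p_bad are assumed per-packet drop rates on healthy / faulty paths.
    """
    total = 0.0
    for path, sent, dropped in flows:
        # A flow sees the elevated drop rate if its path crosses any failed link.
        p = p_bad if hypothesis & set(path) else p_good
        total += dropped * math.log(p) + (sent - dropped) * math.log(1.0 - p)
    return total


def localize(links, flows, prior_fail=0.01, max_failures=2):
    """Brute-force MAP search over hypotheses with at most `max_failures` failed links."""
    best, best_score = frozenset(), -math.inf
    for k in range(max_failures + 1):
        for combo in itertools.combinations(links, k):
            h = frozenset(combo)
            # Independent Bernoulli prior on each link's failure state.
            log_prior = (k * math.log(prior_fail)
                         + (len(links) - k) * math.log(1.0 - prior_fail))
            score = log_prior + log_likelihood(h, flows)
            if score > best_score:
                best, best_score = h, score
    return best


# Hypothetical telemetry: (path as list of links, packets sent, packets dropped).
links = ["A-B", "B-C", "B-D", "A-E"]
flows = [
    (["A-B", "B-C"], 1000, 48),
    (["A-B", "B-D"], 1000, 1),
    (["A-E"], 1000, 0),
    (["B-C"], 1000, 52),
]
result = localize(links, flows)
print(sorted(result))
```

In this toy data only the flows that cross `B-C` show heavy loss, so the search selects `B-C` as the single failed link. The number of hypotheses grows combinatorially with the number of links and `max_failures`, which illustrates why exhaustive inference of this kind does not scale to thousands of components.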

