How does SSD Cluster Perform for Distributed File Systems: An Empirical Study - 专知论文

会员服务 ·

0

簇 · Performer · SSD · 可约的 · Extensibility ·

2023 年 3 月 22 日

How does SSD Cluster Perform for Distributed File Systems: An Empirical Study

翻译：SSD集群如何为分布式文件系统提供性能：一项实证研究

Jiashu Wu,Yang Wang,Jinpeng Wang,Hekang Wang,Taorui Lin

from arxiv, Accepted by Concurrency and Computation: Practice and Experience

As the capacity of Solid-State Drives (SSDs) is constantly being optimised and boosted with gradually reduced cost, the SSD cluster is now widely deployed as part of the hybrid storage system in various scenarios such as cloud computing and big data processing. However, despite its rapid developments, the performance of the SSD cluster remains largely under-investigated, leaving its sub-optimal applications in reality. To address this issue, in this paper we conduct extensive empirical studies for a comprehensive understanding of the SSD cluster in diverse settings. To this end, we configure a real SSD cluster and gather the generated trace data based on some often-used benchmarks, then adopt analytical methods to analyse the performance of the SSD cluster with different configurations. In particular, regression models are built to provide better performance predictability under broader configurations, and the correlations between influential factors and performance metrics with respect to different numbers of nodes are investigated, which reveal the high scalability of the SSD cluster. Additionally, the cluster's network bandwidth is inspected to explain the performance bottleneck. Finally, the knowledge gained is summarised to benefit the SSD cluster deployment in practice.

翻译：随着固态硬盘（SSD）容量持续优化提升且成本逐渐降低，SSD集群已广泛部署为混合存储系统的一部分，应用于云计算和大数据处理等多种场景。然而，尽管其发展迅速，SSD集群的性能仍缺乏深入研究，导致其在实际应用中存在次优部署的问题。针对这一现状，本文通过广泛的实证研究，全面理解不同配置下SSD集群的性能表现。为此，我们构建了一个真实的SSD集群，基于常用基准测试收集生成的追踪数据，并采用分析方法评估不同配置下SSD集群的性能。具体而言，我们构建回归模型以在更广泛的配置下提供更优的性能可预测性，同时探究不同节点数量下影响因素与性能指标之间的相关性，揭示了SSD集群的高可扩展性。此外，我们检查了集群的网络带宽以解释性能瓶颈。最后，总结所获得的经验知识以指导SSD集群的实际部署。

0

相关内容

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

专知会员服务

27+阅读 · 2020年7月24日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

254+阅读 · 2020年4月19日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

专知会员服务

15+阅读 · 2020年3月7日

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

专知会员服务

65+阅读 · 2020年3月5日

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

专知会员服务

124+阅读 · 2019年12月23日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

模糊认知集群优化的聚类算法

国家自然科学基金

9+阅读 · 2015年12月31日

锰对认知功能的影响及其分子作用机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

云计算中虚拟资源性能度量的指标和方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于要素逐步添加GERT网络的非常规突发事件"情景-应对"策略作用机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

分布式系统预测控制的协调策略与系统综合

国家自然科学基金

0+阅读 · 2013年12月31日

基于GPU性能模型的异构系统优化技术研究

国家自然科学基金

0+阅读 · 2011年12月31日

超多核处理器片上网络性能模型研究

国家自然科学基金

1+阅读 · 2010年12月31日

支持QoS的主动无线传感器网络中间件研究

国家自然科学基金

0+阅读 · 2009年12月31日

交通行为非均衡演化与干预对策研究

国家自然科学基金

0+阅读 · 2008年12月31日

Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts

Arxiv

0+阅读 · 2023年5月12日

Efficient Discovery of Heterogeneous Quantile Treatment Effects in Randomized Experiments via Anomalous Pattern Detection

Arxiv

0+阅读 · 2023年5月10日

An Empirical Study on How the Developers Discussed about Pandas Topics

Arxiv

0+阅读 · 2023年5月10日

Cross-Study Replicability in Cluster Analysis

Arxiv

0+阅读 · 2023年5月9日

A Survey of Learning on Small Data

Arxiv

19+阅读 · 2022年7月29日

Prompt Distribution Learning

Arxiv

14+阅读 · 2022年5月6日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

A Survey on Distributed Machine Learning

Arxiv

45+阅读 · 2019年12月20日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

Arxiv

88+阅读 · 2019年3月27日

VIP会员

文章信息

相关主题

最新内容

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

2+阅读 · 今天11:43

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

2+阅读 · 今天11:41

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

5+阅读 · 今天6:30

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

5+阅读 · 今天6:18

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

6+阅读 · 今天6:08

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

6+阅读 · 今天5:54

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

7+阅读 · 今天5:22

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

7+阅读 · 今天5:15

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

7+阅读 · 今天3:42

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

5+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

7+阅读 · 6月24日

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

10+阅读 · 6月24日

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

9+阅读 · 6月24日

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

7+阅读 · 6月24日

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

9+阅读 · 6月24日

相关VIP内容

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

复杂的序列数据分析：现有算法的系统文献综述，Complex Sequential Data Analysis: A Systematic Literature Review of Existing Algorithms

专知会员服务

27+阅读 · 2020年7月24日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

254+阅读 · 2020年4月19日

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

【医学图像处理中的因果性】52页ppt，Causality Matters in Medical Imaging

专知会员服务

60+阅读 · 2020年3月14日

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

【SIGMOD2020-CMU】在内存中搜索树的顺序保持键压缩，Order-Preserving Key Compression for In-Memory Search Trees

专知会员服务

15+阅读 · 2020年3月7日

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

网络流量监测与分析大数据综述，A Survey on Big Data for Network Traffic Monitoring and Analysis

专知会员服务

65+阅读 · 2020年3月5日

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

专知会员服务

124+阅读 · 2019年12月23日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

网状网络及其在军事领域的运用

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts

Arxiv

0+阅读 · 2023年5月12日

Efficient Discovery of Heterogeneous Quantile Treatment Effects in Randomized Experiments via Anomalous Pattern Detection

Arxiv

0+阅读 · 2023年5月10日

An Empirical Study on How the Developers Discussed about Pandas Topics

Arxiv

0+阅读 · 2023年5月10日

Cross-Study Replicability in Cluster Analysis

Arxiv

0+阅读 · 2023年5月9日

A Survey of Learning on Small Data

Arxiv

19+阅读 · 2022年7月29日

Prompt Distribution Learning

Arxiv

14+阅读 · 2022年5月6日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Meta-Transfer Learning for Zero-Shot Super-Resolution

Arxiv

43+阅读 · 2020年2月27日

A Survey on Distributed Machine Learning

Arxiv

45+阅读 · 2019年12月20日

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Arxiv

14+阅读 · 2019年8月8日

Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

Arxiv

88+阅读 · 2019年3月27日

相关基金

模糊认知集群优化的聚类算法

国家自然科学基金

9+阅读 · 2015年12月31日

锰对认知功能的影响及其分子作用机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

云计算中虚拟资源性能度量的指标和方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于要素逐步添加GERT网络的非常规突发事件"情景-应对"策略作用机理研究

国家自然科学基金

0+阅读 · 2013年12月31日

分布式系统预测控制的协调策略与系统综合

国家自然科学基金

0+阅读 · 2013年12月31日

基于GPU性能模型的异构系统优化技术研究

国家自然科学基金

0+阅读 · 2011年12月31日

超多核处理器片上网络性能模型研究

国家自然科学基金

1+阅读 · 2010年12月31日

支持QoS的主动无线传感器网络中间件研究

国家自然科学基金

0+阅读 · 2009年12月31日

交通行为非均衡演化与干预对策研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员