Large-scale data storage is susceptible to failure. As disks are damaged and replaced, traditional machine learning models, which rely on historical data to make predictions, struggle to predict disk failures accurately. This paper presents a novel method for predicting disk failures that leverages multi-layer domain adaptive learning. First, disk data with many failures is selected as the source domain, and disk data with few failures is selected as the target domain. The feature extraction network is then trained on the selected source and target domains. The contrast between the two domains facilitates the transfer of diagnostic knowledge from the source domain to the target domain. Experimental results show that the proposed technique produces a reliable prediction model and improves failure prediction on disk data with few failure samples.
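To make the source-to-target setup above concrete, the following is a minimal sketch of one possible multi-layer adaptation term. The abstract does not state which discrepancy measure or network is used, so everything here is an assumption for illustration: a two-layer ReLU feature extractor with random weights, a linear-kernel maximum mean discrepancy (the squared distance between batch feature means) computed at every layer, and synthetic data standing in for disk attributes. All function names are hypothetical.

```python
import random


def layer_forward(batch, weight):
    """Apply one dense ReLU layer; weight has shape (inputs, outputs)."""
    return [
        [max(0.0, sum(x * w for x, w in zip(row, col))) for col in zip(*weight)]
        for row in batch
    ]


def extract_features(batch, weights):
    """Return the activations of every layer of the feature extractor."""
    activations = []
    for weight in weights:
        batch = layer_forward(batch, weight)
        activations.append(batch)
    return activations


def mean_vector(batch):
    """Per-feature mean over a batch of rows."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def linear_mmd(a, b):
    """Assumed discrepancy: squared distance between batch feature means."""
    return sum((m - n) ** 2 for m, n in zip(mean_vector(a), mean_vector(b)))


def multi_layer_discrepancy(source, target, weights):
    """Sum the per-layer discrepancy between source and target features."""
    src = extract_features(source, weights)
    tgt = extract_features(target, weights)
    return sum(linear_mmd(s, t) for s, t in zip(src, tgt))


def random_matrix(rng, rows, cols, shift=0.0):
    """Synthetic Gaussian data; a stand-in for real disk attributes."""
    return [[rng.gauss(shift, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
weights = [random_matrix(rng, 8, 16), random_matrix(rng, 16, 4)]
source = random_matrix(rng, 64, 8)             # many samples (source domain)
target = random_matrix(rng, 32, 8, shift=0.5)  # fewer, shifted samples (target domain)
loss = multi_layer_discrepancy(source, target, weights)
print(loss)
```

In a full training loop, a term of this kind would typically be added to the failure-classification loss on the labeled source data, so that minimizing it pulls the source and target feature distributions together at each layer. Whether the proposed method combines the terms in this way is not specified in the abstract.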