Solid-state drives (SSDs) are critical to datacenters, consumer platforms, and mission-critical systems. Yet diagnosing their performance and reliability is difficult: the data are fragmented and time-disjoint, and existing methods demand large datasets and expert input while offering only limited insights. Degradation arises not only from shifting workloads and evolving architectures but also from environmental factors such as temperature, humidity, and vibration. We present KORAL, a knowledge-driven reasoning framework that integrates Large Language Models (LLMs) with a structured Knowledge Graph (KG) to generate insights into SSD operations. Rather than relying on extensive expert input and large datasets, KORAL generates a Data KG from fragmented telemetry and integrates a Literature KG that organizes knowledge from literature, reports, and traces. This turns unstructured sources into a queryable graph and telemetry into structured knowledge, and both graphs guide the LLM to deliver evidence-based, explainable analysis aligned with the domain vocabulary and constraints. Evaluation on real production traces shows that KORAL delivers expert-level diagnosis and recommendations, supported by grounded explanations that improve reasoning transparency, guide operator decisions, reduce manual effort, and provide actionable insights to improve service quality. To our knowledge, this is the first end-to-end system that combines LLMs and KGs for full-spectrum SSD reasoning, including descriptive, predictive, prescriptive, and what-if analysis. We release the generated SSD-specific KG to advance reproducible research in knowledge-based storage system analysis. GitHub repository: https://github.com/Damrl-lab/KORAL
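To make the two-graph idea concrete, the following is a minimal sketch of how fragmented telemetry might be turned into triples and combined with literature triples to form grounded context for an LLM. It is not KORAL's implementation: the `Triple` class, the functions `telemetry_to_triples` and `build_context`, the field names (`drive_id`, `timestamp`, `temperature_c`, `read_latency_us`, `humidity_pct`), and the sample values are all hypothetical and chosen only for illustration. The actual schema and pipeline are in the repository linked above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) edge in a toy knowledge graph."""
    subject: str
    predicate: str
    obj: str


def telemetry_to_triples(records):
    """Turn fragmented telemetry records into Data KG triples.

    Missing readings (None) are skipped, so records with different
    fields and timestamps can still be merged into one graph.
    """
    triples = []
    for rec in records:
        for key, value in rec.items():
            if key in ("drive_id", "timestamp") or value is None:
                continue
            # Keep the timestamp with the value so time-disjoint samples stay distinguishable.
            triples.append(Triple(rec["drive_id"], key, f"{value}@{rec['timestamp']}"))
    return triples


def build_context(drive_id, data_kg, literature_kg):
    """Collect a drive's facts plus literature triples about the observed metrics."""
    facts = [t for t in data_kg if t.subject == drive_id]
    observed = {t.predicate for t in facts}
    background = [t for t in literature_kg if t.subject in observed]
    return "\n".join(f"{t.subject} {t.predicate} {t.obj}" for t in facts + background)


# Hypothetical, hand-written sample data (not from the paper's traces).
records = [
    {"drive_id": "ssd-0", "timestamp": 100, "temperature_c": 41, "read_latency_us": None},
    {"drive_id": "ssd-0", "timestamp": 160, "read_latency_us": 95},
    {"drive_id": "ssd-1", "timestamp": 100, "temperature_c": 38},
]
literature_kg = [
    Triple("temperature_c", "contributes_to", "degradation"),
    Triple("humidity_pct", "contributes_to", "degradation"),
]

data_kg = telemetry_to_triples(records)
context = build_context("ssd-0", data_kg, literature_kg)
print(context)  # text that could be placed in an LLM prompt as evidence
```

In this sketch the literature triples are filtered by the metrics actually observed for the drive, so the context passed to the model contains only background knowledge that the telemetry can support.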