Cluster workload allocation often requires complex configurations, creating a usability gap. This paper introduces a semantic, intent-driven scheduling paradigm for cluster systems using Natural Language Processing. The system employs a Large Language Model (LLM), integrated via a Kubernetes scheduler extender, to interpret natural-language allocation hint annotations expressing soft affinity preferences. A prototype featuring a cluster state cache and an intent analyzer (using AWS Bedrock) was developed. Empirical evaluation demonstrated high LLM parsing accuracy (>95% Subset Accuracy on a ground-truth evaluation dataset) for top-tier models such as Amazon Nova Pro/Premier and Mistral Pixtral Large, significantly outperforming a baseline engine. Scheduling quality tests across six scenarios showed the prototype achieved superior or equivalent placement compared to standard Kubernetes configurations, excelling particularly in complex and quantitative scenarios and in handling conflicting soft preferences. The results validate the use of LLMs for accessible scheduling but highlight limitations such as synchronous LLM latency, suggesting asynchronous processing for production readiness. This work confirms the viability of semantic soft affinity for simplifying workload orchestration.
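To illustrate the idea of semantic soft affinity described above, the following is a minimal sketch of how a scheduler extender might score nodes once an LLM has parsed a natural-language hint into structured soft preferences. All names, the preference schema, and the scoring rule are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: scoring nodes against LLM-parsed soft preferences.
# The preference schema ("kind", "weight", "labels") is an assumption for
# illustration; the paper's intent analyzer output format may differ.

def score_nodes(preferences, nodes):
    """Score each node by summing the weights of satisfied soft preferences.

    "prefer" preferences add their weight when the node's labels match;
    "avoid" preferences subtract it. Unmatched preferences contribute 0,
    so conflicting soft preferences degrade a score rather than veto a node.
    """
    scores = {}
    for node in nodes:
        total = 0
        for pref in preferences:
            matched = all(
                node["labels"].get(key) == value
                for key, value in pref["labels"].items()
            )
            if matched:
                total += pref["weight"] if pref["kind"] == "prefer" else -pref["weight"]
        scores[node["name"]] = total
    return scores


# Structured intent an LLM might emit for the annotation
# "prefer GPU nodes in zone us-east-1a" (hypothetical example):
preferences = [
    {"kind": "prefer", "weight": 10, "labels": {"accelerator": "gpu"}},
    {"kind": "prefer", "weight": 5, "labels": {"zone": "us-east-1a"}},
]

nodes = [
    {"name": "node-a", "labels": {"accelerator": "gpu", "zone": "us-east-1a"}},
    {"name": "node-b", "labels": {"accelerator": "none", "zone": "us-east-1a"}},
]

print(score_nodes(preferences, nodes))  # {'node-a': 15, 'node-b': 5}
```

Because the preferences are soft, every node still receives a score and remains schedulable; higher-scoring nodes are merely ranked ahead, mirroring Kubernetes' preferred (rather than required) affinity semantics.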