Assistants, Not Architects: The Role of LLMs in Networked Systems Design

Designing the architecture of modern networked systems requires navigating a large, combinatorial space of hardware, systems, and configuration choices with complex cross-layer interactions. Architects must balance competing objectives such as performance, cost, and deployability while satisfying compatibility and resource constraints, often relying on scattered rules of thumb drawn from benchmarks, papers, documentation, and expert experience. This raises a natural question: can large language models (LLMs) reliably perform this kind of architectural reasoning? We find that they cannot. While LLMs produce plausible configurations, they frequently miss critical constraints, encode incorrect assumptions, and exhibit "stickiness" to familiar patterns. A natural workaround, iterative validation via simulation or experimentation, is often prohibitively expensive at scale and in many cases infeasible, particularly when comparing hardware-dependent alternatives. Motivated by this gap, we present Kepler, a lightweight reasoning framework for architecture design that combines structured, expert-driven specifications with SMT-based optimization. Kepler encodes architecturally significant properties of systems, hardware, and workloads (requirements, incompatibilities, and qualitative trade-offs) as constraints, and synthesizes feasible designs that optimize user-defined objectives. It operates at an abstract level, capturing "rules of thumb" rather than detailed system behavior, which keeps reasoning tractable while preserving key interactions, and it provides explanations for its decisions. Through experiments and case studies, we show that Kepler uncovers interactions missed by LLMs and supports systematic, explainable design exploration.
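
The constraint-based synthesis described in the abstract can be illustrated with a small sketch. This is not Kepler's implementation: it swaps the SMT solver for exhaustive enumeration over a tiny design space so that it runs with the Python standard library alone. Every component name, cost, performance score, and rule below (for example "rdma requires smart_nic") is a hypothetical placeholder chosen for illustration, not taken from the paper.

```python
from itertools import product

# Hypothetical design space: all names and numbers are illustrative assumptions.
CHOICES = {
    "nic": ["basic_nic", "smart_nic"],
    "transport": ["tcp", "rdma"],
    "balancer": ["software_lb", "hardware_lb"],
}
COST = {"basic_nic": 1, "smart_nic": 3, "tcp": 0, "rdma": 1,
        "software_lb": 1, "hardware_lb": 4}
# Coarse, qualitative performance scores (higher is better).
PERF = {"basic_nic": 1, "smart_nic": 2, "tcp": 1, "rdma": 3,
        "software_lb": 1, "hardware_lb": 2}

# Rules of thumb as (description, predicate) pairs. A design is feasible
# only if every predicate holds. Both rules are made up for this sketch.
RULES = [
    ("rdma requires smart_nic",
     lambda d: d["transport"] != "rdma" or d["nic"] == "smart_nic"),
    ("hardware_lb is incompatible with rdma",
     lambda d: not (d["balancer"] == "hardware_lb" and d["transport"] == "rdma")),
]


def synthesize(budget):
    """Return (best, rejected) for the given cost budget.

    best is (design, perf, cost) for the feasible design with the highest
    performance score (ties broken by lower cost), or None if no design is
    feasible. rejected lists each infeasible design with the reasons it was
    ruled out, which serves as a simple form of explanation.
    """
    best, rejected = None, []
    keys = list(CHOICES)
    for combo in product(*(CHOICES[k] for k in keys)):
        design = dict(zip(keys, combo))
        violated = [name for name, holds in RULES if not holds(design)]
        cost = sum(COST[v] for v in combo)
        if cost > budget:
            violated.append(f"cost {cost} exceeds budget {budget}")
        if violated:
            rejected.append((design, violated))
            continue
        perf = sum(PERF[v] for v in combo)
        if best is None or (perf, -cost) > (best[1], -best[2]):
            best = (design, perf, cost)
    return best, rejected


if __name__ == "__main__":
    best, rejected = synthesize(budget=5)
    print("chosen:", best)
    for design, reasons in rejected:
        print("rejected:", design, "because", "; ".join(reasons))
```

Enumeration only works here because the space has eight candidates. The abstract's point is that real design spaces are combinatorial, which is why a solver-backed encoding is needed; the sketch only shows the shape of the inputs (choices, rules, objective) and outputs (a design plus reasons for rejected alternatives).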
