Strengthening Human-Centric Chain-of-Thought Reasoning Integrity in LLMs via a Structured Prompt Framework

Chain-of-Thought (CoT) prompting has been used to enhance the reasoning capabilities of large language models (LLMs). However, its reliability in security-sensitive analytical tasks remains insufficiently examined, particularly under structured human evaluation. Alternative approaches, such as model scaling and fine-tuning, can improve performance, but they are often costly, computationally intensive, or difficult to audit. In contrast, prompt engineering provides a lightweight, transparent, and controllable mechanism for guiding LLM reasoning. This study proposes a structured prompt engineering framework designed to strengthen CoT reasoning integrity while improving the reliability of security threat and attack detection in local LLM deployments. The framework comprises 16 factors grouped into four core dimensions: (1) Context and Scope Control, (2) Evidence Grounding and Traceability, (3) Reasoning Structure and Cognitive Control, and (4) Security-Specific Analytical Constraints. Rather than heuristically optimizing the wording of the prompt, the framework introduces explicit reasoning controls to mitigate hallucination, prevent reasoning drift, and strengthen interpretability in security-sensitive contexts. Using DDoS attack detection in SDN traffic as a case study, multiple model families were evaluated under structured and unstructured prompting conditions. Pareto frontier analysis and ablation experiments demonstrate consistent reasoning improvements (up to 40% in smaller models) and stable accuracy gains across model scales. Human evaluation with strong inter-rater agreement (Cohen's κ > 0.80) confirms the robustness of these findings. The results establish structured prompting as an effective and practical approach for reliable and explainable AI-driven cybersecurity analysis.
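
For readers unfamiliar with the agreement statistic cited above, the following is a minimal sketch of how Cohen's kappa can be computed for two raters. It uses the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohens_kappa`, the labels "sound" and "flawed", and the ratings themselves are hypothetical and chosen only for illustration; they are not data from this study, and the study's own evaluation procedure may differ.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined (0/0), so this sketch reports perfect agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten reasoning traces (illustrative only).
rater_a = ["sound", "sound", "flawed", "sound", "flawed",
           "sound", "flawed", "sound", "sound", "flawed"]
rater_b = ["sound", "sound", "flawed", "flawed", "flawed",
           "sound", "flawed", "sound", "sound", "flawed"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f}")
```

In this toy example the raters agree on nine of ten items (p_o = 0.9) and the chance agreement is p_e = 0.5, so kappa is approximately 0.80, which sits right at the threshold the abstract reports exceeding.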
