Large Language Models (LLMs) are highly sensitive to prompt design, and developing optimized prompting techniques is crucial for generating consistent, high-quality outputs. In this study, we introduce COSTAR-A, a novel prompt engineering framework that extends the existing COSTAR method, which stands for Context, Objective, Style, Tone, Audience, and Response, by appending an 'Answer' component at the end. We demonstrate that while the original COSTAR framework improves prompt clarity and aligns the outputs of larger LLMs, its performance is less consistent with smaller, locally optimized models, particularly in tasks that require more directive or constrained outputs. Through a series of controlled prompt-output assessments on smaller (at most 8 billion parameters), fine-tuned models, we found that COSTAR-A can enhance the output structure and decisiveness of localized LLMs for certain tasks, although its effectiveness varies across models and use cases. Notably, the Llama 3.1-8B model exhibited performance improvements when prompted with COSTAR-A rather than with COSTAR alone. These findings underscore the adaptability and scalability of COSTAR-A as a prompting framework, particularly for computationally efficient AI deployments on resource-constrained hardware.
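To make the framework's structure concrete, the minimal Python sketch below assembles a COSTAR-A prompt as described above: the six COSTAR sections in order, followed by the trailing 'Answer' section. The helper name build_costar_a_prompt and all field contents are hypothetical illustrations, not taken from the study; only the section names and their ordering come from the abstract.

```python
# Minimal sketch of a COSTAR-A prompt template. The function name and the
# example field contents are hypothetical; only the seven section names and
# their order (six COSTAR fields, then 'Answer') follow the framework.

def build_costar_a_prompt(context: str, objective: str, style: str,
                          tone: str, audience: str, response: str,
                          answer: str) -> str:
    """Assemble the six COSTAR sections in order, then append the
    'Answer' section that COSTAR-A adds to constrain the final output."""
    sections = {
        "Context": context,
        "Objective": objective,
        "Style": style,
        "Tone": tone,
        "Audience": audience,
        "Response": response,
        "Answer": answer,  # the component COSTAR-A appends at the end
    }
    return "\n".join(f"# {name}\n{body}" for name, body in sections.items())

# Hypothetical usage for a directive, constrained classification task:
prompt = build_costar_a_prompt(
    context="You are reviewing short customer feedback messages.",
    objective="Classify the sentiment of the message.",
    style="Concise and factual.",
    tone="Neutral.",
    audience="An automated analytics pipeline.",
    response="A single word.",
    answer="Reply with exactly one of: Positive, Negative, Neutral.",
)
print(prompt)
```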