Personas are widely used in software engineering to support requirements elicitation, design, and validation, but creating them manually is costly, time-consuming, and hard to scale. Recent LLM-based approaches automate persona generation from textual data; however, they typically rely on single-shot generation and subjective evaluation, which limits their practical reliability. We present PerGent, an industry-grade method for persona generation built around an iterative critique-refinement loop. Specifically, PerGent uses a generator LLM agent and a critic LLM agent, coordinated by an orchestrator, to refine personas over a user-defined maximum number of rounds, drawing on external resources such as interviews, surveys, and job postings. We deploy and evaluate PerGent in an industrial setting at Kinaxis, comparing it with three baselines, including one-shot methods. In an expert in-situ evaluation, PerGent achieved the highest expert approval rate (96.9%), exceeding all baselines. We further compare PerGent-generated personas with best-practice personas manually created by domain experts before the adoption of LLMs. Compared to the baselines, PerGent reproduces a larger proportion of the expert content while also contributing substantial new content beyond the pre-LLM personas. We conclude with lessons learned from deploying and evaluating PerGent at Kinaxis.
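The critique-refinement loop described in the abstract can be illustrated with a minimal sketch. This is not PerGent's implementation: the names `Critique`, `refine_persona`, `generate`, and `critique` are hypothetical, the two agents are modeled as plain callables that would wrap LLM calls in practice, and stopping early once the critic approves is an assumption (the abstract only states that the number of rounds is capped by the user).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Critique:
    """Hypothetical critic output; the real critic's format is not specified."""
    approved: bool       # True when the critic has no further objections
    feedback: str = ""   # free-text guidance for the next revision


# Hypothetical signatures: in practice each callable would wrap an LLM call.
Generator = Callable[[List[str], Optional[str], Optional[str]], str]
Critic = Callable[[str, List[str]], Critique]


def refine_persona(
    generate: Generator,
    critique: Critic,
    sources: List[str],
    max_rounds: int,
) -> Tuple[str, List[str]]:
    """Orchestrate draft -> critique -> revise for at most max_rounds rounds.

    Returns the final persona and the list of all drafts produced.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    persona = generate(sources, None, None)  # initial draft from the sources
    history = [persona]

    for _ in range(max_rounds):
        verdict = critique(persona, sources)
        if verdict.approved:  # assumed early exit; not stated in the abstract
            break
        # Revise the current draft using the critic's feedback.
        persona = generate(sources, persona, verdict.feedback)
        history.append(persona)

    return persona, history


# Toy stand-ins for the two agents, used only to exercise the loop.
def toy_generate(sources, draft, feedback):
    if draft is None:
        return "Persona: supply planner."
    return draft + " Goals: (to be drawn from the sources)."


def toy_critique(persona, sources):
    if "Goals:" in persona:
        return Critique(approved=True)
    return Critique(approved=False, feedback="add goals grounded in the sources")
```

Calling `refine_persona(toy_generate, toy_critique, ["interview notes"], max_rounds=3)` runs one revision and then stops when the toy critic approves. Keeping the orchestrator separate from the two agents means the round cap and the stopping rule can be changed without touching the prompts behind `generate` and `critique`.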