Personalized text generation requires models not only to produce coherent text but also to align with a target user's style, tone, and topical focus. Existing retrieval-augmented approaches such as LaMP and PGraphRAG enrich user profiles with the user's own and neighboring users' histories, but they stop at generation and often yield outputs that drift in tone, topic, or style. We present PerFine, a unified, training-free critique-refine framework that enhances personalization through iterative, profile-grounded feedback. In each iteration, an LLM generator produces a draft conditioned on the retrieved profile, and a critic LLM, conditioned on the same profile, provides structured feedback on tone, vocabulary, sentence structure, and topicality. The generator then revises, while a novel knockout strategy retains the stronger draft across iterations. We further study additional inference-time strategies, such as Best-of-N and Topic Extraction, to balance quality and efficiency. Across the Yelp, Goodreads, and Amazon datasets, PerFine consistently improves personalization over PGraphRAG, with GEval gains of +7-13%, steady improvements over 3-5 refinement iterations, and further gains as critic model size increases. These results highlight post-hoc, profile-aware feedback as a powerful paradigm for personalized LLM generation that is both training-free and model-agnostic.
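The loop described above can be summarized in a short sketch. This is a hypothetical illustration, not the paper's implementation: `llm` stands in for any chat-model call, the prompts are simplified assumptions, and `score` is a placeholder for a GEval-style judge of profile alignment.

```python
def llm(prompt: str) -> str:
    # Placeholder for an actual LLM API call (plug in your model client).
    raise NotImplementedError

def score(profile: str, draft: str) -> float:
    # Placeholder for a GEval-style quality judge (an assumption here).
    raise NotImplementedError

def perfine(profile: str, task: str, iterations: int = 5) -> str:
    # Generator drafts conditioned on the retrieved user profile.
    best = llm(f"Profile:\n{profile}\n\nTask: {task}\nWrite a draft.")
    for _ in range(iterations):
        # Critic, conditioned on the same profile, gives structured feedback.
        feedback = llm(
            f"Profile:\n{profile}\n\nDraft:\n{best}\n\n"
            "Critique tone, vocabulary, sentence structure, and topicality."
        )
        # Generator revises the draft in light of the critique.
        candidate = llm(
            f"Profile:\n{profile}\n\nDraft:\n{best}\n\n"
            f"Feedback:\n{feedback}\n\nRevise the draft accordingly."
        )
        # Knockout: carry only the stronger draft into the next iteration.
        if score(profile, candidate) > score(profile, best):
            best = candidate
    return best
```

Because the loop touches the models only through prompts, the same sketch applies to any generator-critic pair, which is what makes the approach training-free and model-agnostic.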