Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts

Recent advances in large language models have created new opportunities for stylometry, the study of writing styles and authorship. Two challenges, however, remain central: training generative models when no paired data exist, and evaluating stylistic text without relying only on human judgment. In this work, we present a framework for both generating and evaluating sentences in the style of 19th-century novelists. Large language models are fine-tuned with minimal, single-token prompts to produce text in the voices of authors such as Dickens, Austen, Twain, Alcott, and Melville. To assess these generative models, we employ a transformer-based detector trained on authentic sentences, using it both as a classifier and as a tool for stylistic explanation. We complement this with syntactic comparisons and explainable AI methods, including attention-based and gradient-based analyses, to identify the linguistic cues that drive stylistic imitation. Our findings show that the generated text reflects the authors' distinctive patterns and that AI-based evaluation offers a reliable alternative to human assessment. All artifacts of this work are published online.
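To make the single-token prompting idea concrete, the following is a minimal sketch of how unpaired training data might be prepared and how detector output might be scored. It is an illustration under stated assumptions, not the pipeline used in this work: the control tokens (`<dickens>`, `<austen>`, and so on), the function names, and the agreement metric are all hypothetical. The only point it shows is that, with no paired data, each authentic sentence can serve as its own target once it is prefixed by one author token.

```python
from typing import Dict, List

# Hypothetical control tokens, one per author. The actual token strings
# are not given in the abstract; these are assumptions for illustration.
AUTHOR_TOKENS: Dict[str, str] = {
    "Dickens": "<dickens>",
    "Austen": "<austen>",
    "Twain": "<twain>",
    "Alcott": "<alcott>",
    "Melville": "<melville>",
}


def build_training_examples(corpus: Dict[str, List[str]]) -> List[str]:
    """Prefix every authentic sentence with its author's single token.

    No paired (source, target) data is needed: the token is the whole
    prompt and the sentence is the continuation to be learned.
    """
    examples: List[str] = []
    for author, sentences in corpus.items():
        token = AUTHOR_TOKENS[author]
        for sentence in sentences:
            examples.append(f"{token} {sentence.strip()}")
    return examples


def generation_prompt(author: str) -> str:
    """At inference time the prompt is just the author's token."""
    return AUTHOR_TOKENS[author]


def detector_agreement(predicted: List[str], intended: List[str]) -> float:
    """Fraction of generated sentences that a detector attributes to the
    intended author (a simple, assumed stand-in for AI-based evaluation)."""
    if len(predicted) != len(intended):
        raise ValueError("predicted and intended must have the same length")
    if not intended:
        return 0.0
    hits = sum(p == i for p, i in zip(predicted, intended))
    return hits / len(intended)


if __name__ == "__main__":
    # Placeholder sentences, not quotations from the novelists.
    corpus = {"Austen": ["  First placeholder sentence. "],
              "Twain": ["Second placeholder sentence."]}
    for line in build_training_examples(corpus):
        print(line)
    print(generation_prompt("Melville"))
    print(detector_agreement(["Austen", "Twain"], ["Austen", "Dickens"]))
```

In a real setup the tokens would presumably be registered as special tokens in the tokenizer so that each maps to a single vocabulary entry, and the `predicted` labels would come from the transformer-based detector trained on authentic sentences.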
