Effective linguistic choices that attract potential customers play a crucial role in advertising success. This study explores the linguistic features of ad texts that influence human preferences. Although generating attractive ad texts is an active area of research, progress in understanding which specific linguistic features affect attractiveness is hindered by two obstacles. First, human preferences are complex and depend on multiple factors, including an ad's content (such as brand names) and its linguistic style, which makes analysis challenging. Second, there is a lack of publicly available ad text datasets that include human preference signals, such as ad performance metrics and human feedback, which reflect people's interests. To address these problems, we present AdParaphrase, a paraphrase dataset that contains human preferences for pairs of ad texts that are semantically equivalent but differ in wording and style. This dataset enables preference analysis that focuses on differences in linguistic features. Our analysis revealed that the ad texts preferred by human judges are more fluent and longer, contain more nouns, and use bracket symbols. Furthermore, we demonstrate that an ad text generation model that takes these findings into account significantly improves the attractiveness of a given text. The dataset is publicly available at: https://github.com/CyberAgentAILab/AdParaphrase.