Regulatory bodies worldwide are intensifying their efforts to ensure transparency in influencer marketing on social media through instruments such as the Unfair Commercial Practices Directive (UCPD) in the European Union and Section 5 of the Federal Trade Commission Act in the United States. Yet enforcing these obligations has proven highly problematic because of the sheer scale of the influencer market. Automatic detection of sponsored content aims to enable monitoring and enforcement of such regulations at scale. Current research in this field primarily frames the problem as a machine learning task, focusing on models that achieve high classification performance in detecting ads. Such models rely on human annotation for ground-truth labels. However, agreement between annotators is often low, producing inconsistent labels that undermine model reliability. To improve annotation accuracy and, in turn, the detection of sponsored content, we propose using ChatGPT to augment the annotation process with phrases identified as relevant features and with brief explanations. Our experiments show that this approach consistently improves inter-annotator agreement and annotation accuracy. Additionally, our survey of user experience in the annotation task indicates that the explanations increase annotators' confidence and streamline the process. Our proposed methods can ultimately lead to more transparency and closer alignment with regulatory requirements in sponsored content detection.