Internet-based economies and societies are drowning in deceptive attacks. These attacks take many forms, such as fake news, phishing, and job scams, which we call ``domains of deception.'' Machine-learning and natural-language-processing researchers have been attempting to ameliorate this precarious situation by designing domain-specific detectors. Only a few recent works have considered domain-independent deception. We collect these disparate threads of research and investigate domain-independent deception. First, we provide a new computational definition of deception and break down deception into a new taxonomy. Then, we analyze the debate on linguistic cues for deception and supply guidelines for systematic reviews. Finally, we investigate common linguistic features and give evidence for knowledge transfer across different forms of deception.