Social media platforms such as Instagram and Twitter have emerged as critical channels for drug marketing and illegal sale. Detecting and labeling online illicit drug trafficking activities is therefore important for addressing this issue. However, conventional supervised learning methods for detecting drug trafficking rely heavily on substantial amounts of labeled data, and data annotation is time-consuming and resource-intensive. Furthermore, these models often struggle to identify trafficking activities accurately when drug dealers use deceptive language and euphemisms to avoid detection. To overcome these limitations, we conduct the first systematic study on leveraging large language models (LLMs), such as ChatGPT, to detect illicit drug trafficking activities on social media. We propose an analytical framework for composing \emph{knowledge-informed prompts}, which serve as the interface through which humans interact with LLMs to perform the detection task. Additionally, we design a Monte Carlo dropout based prompt optimization method to further improve performance and interpretability. Our experiments show that the proposed framework outperforms baseline language models in drug trafficking detection accuracy, with an improvement of nearly 12\%. By integrating prior knowledge and the proposed prompts, ChatGPT can effectively identify and label drug trafficking activities on social networks, even when drug dealers use deceptive language and euphemisms to evade detection. These findings underscore the importance of incorporating prior knowledge and scenario-based prompts into analytical tools for social networks to improve online security and public safety.