With the advancement of Large Language Models (LLMs), increasingly sophisticated and powerful GPTs are entering the market. Despite their popularity, the LLM ecosystem remains largely unexplored. Moreover, LLMs' susceptibility to attacks raises concerns over safety and plagiarism. In this work, we therefore conduct a pioneering exploration of GPT stores, aiming to study vulnerabilities and plagiarism within GPT applications. To begin with, we perform, to our knowledge, the first large-scale monitoring and analysis of two stores: the unofficial GPTStore.AI and the official OpenAI GPT Store. We then propose a TriLevel GPT Reversing (T-GR) strategy for extracting GPT internals. To complete these two tasks efficiently, we develop two automated tools: one for web scraping and another for programmatically interacting with GPTs. Our findings reveal significant enthusiasm among users and developers for GPT interaction and creation, as evidenced by the rapid growth in the number of GPTs and their creators. However, we also uncover a widespread failure to protect GPT internals: nearly 90% of system prompts are easily accessible, leading to considerable plagiarism and duplication among GPTs.