Our study demonstrates that Large Language Models (LLMs) can effectively automate the classification of complex datasets. We target proposals of Decentralized Autonomous Organizations (DAOs) because classifying them requires an understanding of context and therefore depends on human expertise, which makes the task costly. We apply an iterative approach: we specify categories, then refine both the categories and the prompt in each iteration, reaching an accuracy of 95% on a set of 100 proposals. These results demonstrate the potential of LLMs to effectively automate data labeling tasks that depend on textual context.
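As an illustration of the evaluation step in such an iterative loop, the following minimal sketch scores a classifier against human-assigned labels and collects the misclassified proposals that would guide the next refinement of categories and prompt. Everything in it is an assumption made for illustration: the category names, the sample proposals, and `keyword_classifier`, which is a simple stand-in for an LLM call and not the method used in the study.

```python
from typing import Callable, List, Tuple

Labeled = Tuple[str, str]       # (proposal text, human-assigned category)
Error = Tuple[str, str, str]    # (proposal text, expected, predicted)


def evaluate(classify: Callable[[str], str],
             labeled: List[Labeled]) -> Tuple[float, List[Error]]:
    """Return the accuracy and the list of misclassified proposals."""
    errors: List[Error] = []
    for text, expected in labeled:
        predicted = classify(text)
        if predicted != expected:
            errors.append((text, expected, predicted))
    accuracy = (len(labeled) - len(errors)) / len(labeled)
    return accuracy, errors


def keyword_classifier(text: str) -> str:
    # Hypothetical stand-in for an LLM call with the current prompt.
    lowered = text.lower()
    if "grant" in lowered or "fund" in lowered:
        return "treasury"
    if "upgrade" in lowered or "parameter" in lowered:
        return "protocol"
    return "other"


# Hypothetical labeled sample; the categories are illustrative only.
sample: List[Labeled] = [
    ("Fund a developer grant program", "treasury"),
    ("Upgrade the staking contract", "protocol"),
    ("Elect a new council member", "governance"),
    ("Adjust the fee parameter", "protocol"),
]

accuracy, errors = evaluate(keyword_classifier, sample)
print(f"accuracy: {accuracy:.2f}")
for text, expected, predicted in errors:
    print(f"missed: {text!r} expected={expected} predicted={predicted}")
```

In this sketch the misclassified item points to a missing category, which is the kind of signal one iteration could use to revise the category list or the prompt before re-evaluating.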