Our study demonstrates that Large Language Models (LLMs) can effectively automate the classification of complex datasets. We specifically target proposals of Decentralized Autonomous Organizations (DAOs), because classifying this data requires an understanding of context and therefore depends on human expertise, which makes the task costly. The study applies an iterative approach in which the categories are first specified and then refined, together with the prompt, in each iteration. This led to an accuracy of 95% in classifying a set of 100 proposals. With this, we demonstrate the potential of LLMs to effectively automate data labeling tasks that depend on textual context.
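To make the reported metric concrete, the following is a minimal sketch of how an accuracy figure such as 95% on 100 proposals could be computed by comparing model-assigned labels with human reference labels. The function name `accuracy`, the category names, and the label lists are hypothetical placeholders chosen for illustration; they are not the categories, data, or code used in the study.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Return the fraction of items whose predicted label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not reference:
        raise ValueError("cannot compute accuracy on an empty set")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


# Hypothetical placeholder labels for 100 proposals (NOT data from the study).
reference = ["treasury"] * 60 + ["governance"] * 40

# Simulate a classifier that disagrees with the reference on 5 proposals.
predicted = list(reference)
for i in range(5):
    predicted[i] = "governance"

score = accuracy(predicted, reference)
print(f"accuracy: {score:.0%}")
```

In an iterative setup like the one described, a score of this kind would be recomputed after each revision of the categories and the prompt, so that changes can be compared against the same reference labels.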