The advent of large language models (LLMs) has transformed the development of tailored machine learning models and sparked debate over how data requirements should be redefined. The automation enabled by training and deploying LLMs has prompted discussion, and in some quarters the hope, that human labeling no longer matters as much as it did in the era of supervised learning. This paper presents arguments for the continued relevance of human-labeled data in the era of LLMs.