This paper describes the work of the UniBuc Archaeology team for CLPsych's 2024 Shared Task, which involved finding evidence within the text supporting the assigned suicide risk level. Two types of evidence were required: highlights (relevant spans extracted from the text) and summaries (evidence aggregated into a synthesis). Our work compares Large Language Models (LLMs) with an alternative method that is much more memory and resource efficient. The first approach employs a good old-fashioned machine learning (GOML) pipeline consisting of a tf-idf vectorizer and a logistic regression classifier, whose most representative features are used to extract relevant highlights. The second, more resource-intensive approach uses an LLM to generate the summaries and is guided by chain-of-thought prompting to provide sequences of text indicating clinical markers.
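The highlight-extraction step of the first approach can be illustrated with a minimal sketch. It is not the team's implementation: it assumes a tf-idf and logistic regression model has already been fitted, and it stands in for that model's coefficients with a hypothetical `TERM_WEIGHTS` mapping whose values are made up for illustration. The sentence splitter, the scoring rule (sum of term weights per sentence), and the `top_k` and `min_score` parameters are likewise assumptions.

```python
import re

# Hypothetical term weights, standing in for the positive-class coefficients
# of a fitted tf-idf + logistic regression model (illustrative values only).
TERM_WEIGHTS = {
    "hopeless": 1.8,
    "burden": 1.5,
    "alone": 0.9,
    "sleep": 0.4,
    "weekend": -0.6,
}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, weights):
    """Sum the weights of the distinct known terms that occur in the sentence."""
    tokens = set(re.findall(r"[a-z']+", sentence.lower()))
    return sum(weights.get(tok, 0.0) for tok in tokens)


def extract_highlights(text, weights, top_k=2, min_score=0.5):
    """Return up to top_k sentences scoring at least min_score, in document order."""
    sentences = split_sentences(text)
    scored = [(score_sentence(s, weights), i, s) for i, s in enumerate(sentences)]
    # Keep the highest-scoring candidates, then restore their original order.
    kept = sorted((t for t in scored if t[0] >= min_score), key=lambda t: -t[0])[:top_k]
    return [s for _, _, s in sorted(kept, key=lambda t: t[1])]


post = (
    "I had a quiet weekend. I feel hopeless and like a burden to everyone. "
    "I can't sleep. Most days I am alone."
)
highlights = extract_highlights(post, TERM_WEIGHTS)
print(highlights)
```

In a full pipeline the weights would presumably come from the fitted classifier rather than a hand-written dictionary, and the span granularity (sentence, clause, or n-gram window) is a design choice this sketch does not settle.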