We introduce JFinTEB, the first comprehensive benchmark designed specifically for evaluating Japanese financial text embeddings. Existing embedding benchmarks provide limited coverage of the language-specific and domain-specific aspects of Japanese financial texts. Our benchmark spans retrieval and classification tasks that reflect realistic, well-defined financial text processing scenarios. The retrieval tasks draw on instruction-following datasets and financial text generation queries, while the classification tasks cover sentiment analysis, document categorization, and domain-specific classification challenges derived from economic survey data. We conduct extensive evaluations across a wide range of embedding models, including Japanese-specific models of various sizes, multilingual models, and commercial embedding services. We publicly release the JFinTEB datasets and evaluation framework at https://github.com/retarfi/JFinTEB to facilitate future research and to provide a standardized evaluation protocol for the Japanese financial text mining community. This work addresses a critical gap in Japanese financial text processing resources and establishes a foundation for advancing domain-specific embedding research.