A wide range of transformer-based language models has been proposed for information retrieval tasks. However, fine-tuning these models and running inference with them is often complex and requires substantial engineering effort. This paper introduces Lightning IR, a PyTorch Lightning-based framework for fine-tuning and inference of transformer-based language models for information retrieval. Lightning IR provides a modular and extensible architecture that supports all stages of an information retrieval pipeline, from fine-tuning and indexing to searching and re-ranking. It is designed to be straightforward to use, scalable, and reproducible. Lightning IR is available as open source at https://github.com/webis-de/lightning-ir.