NLP Workbench is a web-based text mining platform that allows non-expert users to obtain a semantic understanding of large-scale corpora using state-of-the-art text mining models. The platform is built upon the latest pre-trained models and open-source systems from academia that provide semantic analysis functionalities, including but not limited to entity linking, sentiment analysis, semantic parsing, and relation extraction. Its extensible design enables researchers and developers to smoothly replace an existing model or integrate a new one. To improve efficiency, we employ a microservice architecture that facilitates the allocation of acceleration hardware and the parallelization of computation. This paper presents the architecture of NLP Workbench and discusses the challenges we faced in designing it. We also discuss diverse use cases of NLP Workbench and the benefits of using it over other approaches. The platform is under active development, with its source code released under the MIT license. A website and a short video demonstrating our platform are also available.