Understanding current research trends, problems, and their innovative solutions remains a bottleneck due to the ever-increasing volume of scientific articles. In this paper, we propose NLPExplorer, a completely automatic portal for indexing, searching, and visualizing the Natural Language Processing (NLP) research literature. NLPExplorer presents interesting insights from papers, authors, venues, and topics. In contrast to previous topic-modelling-based approaches, we manually curate five coarse-grained, non-exclusive topical categories, namely Linguistic Target (Syntax, Discourse, etc.), Tasks (Tagging, Summarization, etc.), Approaches (unsupervised, supervised, etc.), Languages (English, Chinese, etc.), and Dataset types (news, clinical notes, etc.). Novel features include lists of young popular authors, popular URLs and datasets, topically diverse papers, and recent popular papers. It also provides temporal statistics such as the year-wise popularity of topics, datasets, and seminal papers. To facilitate future research and system development, we make all the processed datasets accessible through API calls. The current system is available at http://lingo.iitgn.ac.in:5001/