We analysed a sample of NLP research papers archived in the ACL Anthology in an attempt to quantify the degree of openness in the NLP community and the benefits of such an open culture. We observe that papers published in different NLP venues show different patterns of artefact reuse, where artefacts include resources such as code and data. We also note that more than 30% of the papers we analysed do not release their artefacts publicly, despite promising to do so. Further, we observe a wide disparity across languages in publicly available NLP-related artefacts.