Version incompatibility issues are rampant when reusing or reproducing deep learning (DL) models and applications. Existing detection techniques rely solely on the library dependency specifications declared in PyPI. As a result, they cannot detect version issues caused by undocumented version constraints, nor issues involving hardware drivers or the operating system. To address this limitation, we propose leveraging the abundant discussions of DL version issues on Stack Overflow to facilitate version incompatibility detection. We reformulate knowledge extraction as a Question-Answering (QA) problem and use a pre-trained QA model to extract version compatibility knowledge from online discussions. The extracted knowledge is then consolidated into a weighted knowledge graph, which is used to detect potential version incompatibilities when reusing a DL project. Our evaluation shows that (1) our approach extracts version knowledge with 84% accuracy, and (2) it identifies 65% of known version issues in 10 popular DL projects with high precision (92%), whereas two state-of-the-art approaches detect only 29% and 6% of these issues, with 33% and 17% precision, respectively.
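To make the idea of a weighted knowledge graph concrete, the following is a minimal sketch, not the authors' implementation. It assumes that nodes are (component, version) pairs, that each edge counts how many extracted statements say the pair is compatible or incompatible, and that an issue is reported when incompatibility evidence outweighs compatibility evidence. The class name `VersionKnowledgeGraph`, its methods, the `min_weight` threshold, and the component names `framework` and `gpu_runtime` are all hypothetical placeholders; the abstract does not specify the graph schema or the weighting scheme.

```python
from collections import defaultdict
from itertools import combinations


class VersionKnowledgeGraph:
    """Toy weighted graph (illustrative assumption, not the paper's schema).

    Nodes are (component, version) tuples. Each undirected edge stores how
    many pieces of evidence say the two nodes are compatible or incompatible.
    """

    RELATIONS = ("compatible", "incompatible")

    def __init__(self):
        # Edge key: frozenset of two nodes -> evidence counts per relation.
        self._edges = defaultdict(lambda: {"compatible": 0, "incompatible": 0})

    def add_evidence(self, node_a, node_b, relation):
        """Record one extracted statement about a pair of versioned components."""
        if relation not in self.RELATIONS:
            raise ValueError(f"unknown relation: {relation!r}")
        self._edges[frozenset((node_a, node_b))][relation] += 1

    def check(self, environment, min_weight=1):
        """Return suspected incompatibilities in a {component: version} mapping.

        A pair is flagged when its incompatibility evidence reaches
        min_weight and exceeds its compatibility evidence.
        """
        nodes = sorted(environment.items())
        issues = []
        for node_a, node_b in combinations(nodes, 2):
            # Use .get so that lookups do not create empty edges.
            votes = self._edges.get(frozenset((node_a, node_b)))
            if votes is None:
                continue
            bad, good = votes["incompatible"], votes["compatible"]
            if bad >= min_weight and bad > good:
                issues.append((node_a, node_b, bad))
        return issues


# Hypothetical evidence, standing in for facts extracted from discussions.
graph = VersionKnowledgeGraph()
graph.add_evidence(("framework", "2.0"), ("gpu_runtime", "9.0"), "incompatible")
graph.add_evidence(("framework", "2.0"), ("gpu_runtime", "9.0"), "incompatible")
graph.add_evidence(("framework", "2.0"), ("gpu_runtime", "10.0"), "compatible")

issues = graph.check({"framework": "2.0", "gpu_runtime": "9.0"})
print(issues)
```

Raising `min_weight` in this sketch trades recall for precision: a pair is only flagged once enough independent statements agree that it is incompatible.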