Insights into Deep Learning Refactoring: Bridging the Gap Between Practices and Expectations

With the rapid development of deep learning, implementing intricate algorithms and processing large volumes of data have become standard elements of deep learning projects. As a result, the code grows progressively more complex as the software evolves, making it difficult to maintain and understand. Existing studies have investigated the impact of refactoring on software quality in traditional software. However, code refactoring in the context of deep learning remains poorly understood. This study aims to fill this knowledge gap by empirically examining the current state of code refactoring in the deep learning realm and practitioners' views on refactoring. We first manually analyzed the commit history of five popular and well-maintained deep learning projects (e.g., PyTorch). We mined 4,921 refactoring practices from historical commits, measured how different types and elements of refactoring operations are distributed, and found that the distribution of refactoring operation types in deep learning projects differs from that in traditional Java software. We then surveyed 159 practitioners about their views on code refactoring in deep learning projects and their expectations of current refactoring tools. The survey results showed that refactoring research and the development of related tools in the field of deep learning are crucial for improving project maintainability and code quality, and that current refactoring tools do not adequately meet the needs of practitioners. Lastly, we provided our perspective on the future advancement of refactoring tools and offered suggestions for developers' development practices.
