Programming is ubiquitous in applied biostatistics, and adopting software engineering skills will help biostatisticians do a better job. To explain this, we start by highlighting key challenges for software development and application in biostatistics. Silos between different statistician roles, projects, departments, and organizations lead to duplicated and suboptimal code. Building on top of open-source software requires critical appraisal and risk-based assessment of the modules used. Code needs to be readable for the resulting software to be reliable. Software needs to be easy for users to understand, and it needs to be developed within testing frameworks so that long-term maintenance remains feasible. Finally, the reproducibility of research results is hindered by manual analysis workflows and uncontrolled code development. We next describe how awareness of the importance of good software engineering practices and strategies, and their application, can help address these challenges. The foundation is better education in basic software engineering skills at school, at university, and throughout working life. Dedicated software engineering teams within academic institutions and companies can be a key factor in establishing good software engineering practices and can catalyze improvements across research projects. Providing attractive career paths is important for retaining talent. Readily available tools can improve the reproducibility of statistical analyses, and their use can be practiced at community events. [...]