Newcomers to a software project must overcome many barriers before they can successfully make their first code contribution, and they often struggle to find information that is relevant to them. In this work, we argue that much of the information needed by newcomers already exists, albeit scattered among many different sources, and that many barriers can be addressed by automatically identifying, extracting, generating, summarizing, and presenting documentation that is specifically aimed at and customized for newcomers. To gain a detailed understanding of the processes followed by newcomers and their information needs before making their first code contribution, we conducted an empirical study. Based on a survey of about 100 practitioners, grounded theory analysis, and validation interviews, we contribute a 16-step model of the processes followed by newcomers to a software project, and we identify relevant information, along with individual and project characteristics that influence the relevance of information types and sources. Our findings form an essential step towards automated tool support that provides relevant information to project newcomers in each step of their contribution processes.