The predicted increase in demand for data-intensive solution development is driving the need for software, data, and domain experts to collaborate effectively in multi-disciplinary data-intensive software teams (MDSTs). We conducted a socio-technical grounded theory study, interviewing 24 practitioners in MDSTs, to better understand the challenges these teams face when delivering data-intensive software solutions. The interviews provided perspectives across different roles, including domain, data, and software experts, and covered organisational levels ranging from team members and team managers to executive leaders. We found that the key concern for these teams is dealing with data-related challenges. In this paper, we present the theory of dealing with data challenges, which explains the challenges faced by MDSTs, including gaining access to data, aligning data, understanding data, and resolving data quality issues; the context in which and conditions under which these challenges occur; the causes that lead to the challenges; and the related consequences, such as having to conduct remediation activities, inability to achieve expected outcomes, and lack of trust in the delivered solutions. We also identified contingencies, or strategies, applied to address the challenges, including high-level strategic approaches such as implementing data governance, new tools and techniques such as data quality visualisation and monitoring tools, and building stronger teams by focusing on people dynamics, communication skill development, and cross-skilling. Our findings have direct implications for practitioners and researchers seeking to better understand the landscape of data challenges and how to deal with them.