Approaches to biological species delimitation based on genetic and spatial dissimilarity

From arXiv. Paper of 26 pages with 6 figures; appendix of 19 pages with 17 figures. February 2024 update: tiny notation edit, results unchanged. April 2024 update: additional simulation results and plots; introduction and description of the methodologies edited; broader appendix with new charts. June 2024 update: minor edits in methods description.

The delimitation of biological species, i.e., deciding which individuals belong to the same species and whether, and how many, different species are represented in a data set, is key to the conservation of biodiversity. Much existing work uses only genetic data for species delimitation, often employing some kind of cluster analysis. This can be misleading, because geographically distant groups of individuals can be genetically quite different even if they belong to the same species. We investigate the problem of testing, based on genetic and spatial data, whether two potentially separated groups of individuals can belong to a single species. Existing methods such as the partial Mantel test and jackknife-based distance-distance regression are considered. New approaches, i.e., an adaptation of a mixed-effects model, a bootstrap approach, and a jackknife version of the partial Mantel test, are proposed. All these methods address the issue that distance data violate the independence assumption for standard inference regarding correlation and regression; a standard linear regression is also considered for comparison. The approaches are compared on simulated meta-populations generated with SLiM and GSpace, two software packages that can simulate spatially explicit genetic data at the individual level. Simulations show that the new jackknife version of the partial Mantel test provides a good compromise between power and respecting the nominal type I error rate. Mixed-effects models have larger power than jackknife-based methods, but tend to display type I error rates slightly above the significance level. An application to brassy ringlets concludes the paper.
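To make the testing problem concrete, the following is a minimal sketch of the classical permutation-based partial Mantel test mentioned above, not the jackknife variant the paper proposes. It asks whether genetic distance is associated with a between-group indicator after the linear effect of geographic distance has been removed. The function names (`partial_mantel`, `_upper`, `_residuals`), the 0/1 coding of the group matrix, and the one-sided p-value are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def _upper(m):
    """Return the strict upper triangle of a square matrix as a vector."""
    i, j = np.triu_indices(m.shape[0], k=1)
    return np.asarray(m, dtype=float)[i, j]


def _residuals(y, x):
    """Residuals of a simple least-squares regression of y on x (with intercept)."""
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def partial_mantel(gen, grp, geo, n_perm=999, seed=0):
    """Partial Mantel test of gen vs. grp, controlling for geo.

    gen, grp, geo: symmetric (n, n) matrices of genetic distances,
    group indicators (assumed 1 if two individuals are in different
    groups, else 0), and geographic distances.
    Returns the observed partial correlation and a one-sided
    permutation p-value.
    """
    rng = np.random.default_rng(seed)
    gen = np.asarray(gen, dtype=float)
    x = _upper(geo)
    res_grp = _residuals(_upper(grp), x)

    def stat(g):
        # Correlate the two sets of residuals after regressing on geo.
        return np.corrcoef(_residuals(_upper(g), x), res_grp)[0, 1]

    r_obs = stat(gen)
    n = gen.shape[0]
    exceed = 0
    for _ in range(n_perm):
        # Permute individuals: rows and columns of gen are moved together.
        p = rng.permutation(n)
        if stat(gen[np.ix_(p, p)]) >= r_obs:
            exceed += 1
    return r_obs, (exceed + 1) / (n_perm + 1)
```

Permuting rows and columns jointly keeps the distance-matrix structure intact, but it does not remove the dependence among pairwise distances that share an individual, which is the concern motivating the jackknife, bootstrap, and mixed-effects alternatives discussed in the abstract.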

