Biological species delimitation based on genetic and spatial dissimilarity: a comparative study

The delimitation of biological species, i.e., deciding which individuals belong to the same species and whether and how many different species are represented in a data set, is key to the conservation of biodiversity. Much existing work uses only genetic data for species delimitation, often employing some kind of cluster analysis. This can be misleading, because geographically distant groups of individuals can be genetically quite different even if they belong to the same species. This paper investigates the problem of testing, based on genetic and spatial data, whether two potentially separated groups of individuals can belong to a single species. Various approaches, some of which already exist in the literature, are compared on simulated metapopulations generated with SLiM and GSpace, two software packages that can simulate spatially explicit genetic data at the individual level. The approaches involve partial Mantel testing, maximum likelihood mixed-effects models with a population effect, and jackknife-based homogeneity tests. A key challenge is that most tests operate on genetic and geographical distance data, which violates standard independence assumptions. Simulations showed that partial Mantel tests and mixed-effects models have larger power than jackknife-based methods, but tend to display type-I error rates slightly above the significance level. Moreover, a multiple regression model neglecting the dependence in the dissimilarities did not show an inflated type-I error rate. An application to brassy ringlets concludes the paper.
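To make the partial Mantel idea concrete, here is a minimal sketch in Python of a generic permutation-based partial Mantel test. It is not the paper's implementation. The function names (`partial_mantel`, `_residuals`), the toy data, and the choice to permute the genetic matrix are all illustrative assumptions. The test asks whether genetic dissimilarity is associated with a between-group indicator once geographic distance has been regressed out.

```python
import numpy as np

def _residuals(v, z):
    # Residuals from a least-squares regression of v on z with an intercept.
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef

def partial_mantel(gen, grp, geo, n_perm=999, seed=0):
    """Hypothetical helper: partial Mantel test of gen vs. grp given geo.

    gen, grp, geo are symmetric (n x n) dissimilarity matrices.
    Returns the partial correlation and a one-sided permutation p-value.
    """
    rng = np.random.default_rng(seed)
    n = gen.shape[0]
    upper = np.triu_indices(n, k=1)          # each pair counted once
    z = geo[upper]
    rx = _residuals(grp[upper], z)           # group indicator, geography removed

    def stat(g):
        ry = _residuals(g[upper], z)         # genetic distance, geography removed
        return np.corrcoef(ry, rx)[0, 1]

    r_obs = stat(gen)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute individuals, i.e. rows and columns of gen together.
        if stat(gen[np.ix_(perm, perm)]) >= r_obs:
            exceed += 1
    return r_obs, (exceed + 1) / (n_perm + 1)

# Toy data (assumed, not from the paper): two groups of 10 individuals.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 10)
xy = rng.uniform(0, 10, size=(20, 2))
geo = np.linalg.norm(xy[:, None] - xy[None, :], axis=-1)
grp = (labels[:, None] != labels[None, :]).astype(float)
noise = rng.normal(0, 0.1, size=(20, 20))
noise = (noise + noise.T) / 2                # keep the matrix symmetric
gen = 0.05 * geo + 0.5 * grp + noise         # isolation by distance plus a group shift
np.fill_diagonal(gen, 0.0)

r, p = partial_mantel(gen, grp, geo)
print(f"partial Mantel r = {r:.3f}, p = {p:.3f}")
```

Because the pairwise dissimilarities share individuals, they are not independent, which is why the p-value here comes from permuting individuals rather than from a standard regression test. As the abstract notes, such tests can still show type-I error rates slightly above the nominal level.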

