Computer simulations are an important tool for studying the mechanics of biological evolution. In particular, in silico work with agent-based models provides an opportunity to collect high-quality records of ancestry relationships among simulated agents. Such phylogenies can provide insight into evolutionary dynamics within these simulations. Existing work generally tracks lineages directly, yielding an exact phylogenetic record of evolutionary history. However, direct tracking can be inefficient for large-scale, many-processor evolutionary simulations. An alternative approach that scales more favorably is post hoc estimation of phylogenetic information, akin to how bioinformaticians build phylogenies by assessing genetic similarities between organisms. Recently introduced ``hereditary stratigraphy'' algorithms provide a means for efficient inference of phylogenetic history from non-coding annotations on simulated organisms' genomes. A number of options exist for configuring hereditary stratigraphy methodology, but no work has yet tested how they impact reconstruction quality. To address this question, we surveyed reconstruction accuracy under alternate configurations across a matrix of evolutionary conditions varying in selection pressure, spatial structure, and ecological dynamics. We synthesize results from these experiments to suggest a prescriptive system of best practices for work with hereditary stratigraphy, ultimately guiding researchers in choosing appropriate instrumentation for large-scale simulation studies.