Over the past three decades, recordings of the two-dimensional track of an animal on a landscape have progressed from hourly to second-by-second locations. Track segmentation methods for analyzing the behavioral information in such relocation data have lagged somewhat behind, with scales of analysis currently at the sub-hourly to minute level. A new approach is needed to bring segmentation analysis down to a second-by-second level. Here, we present such an approach, which rests heavily on concepts from Shannon's Information Theory. We first briefly review and update concepts relating to movement path segmentation. We then discuss how cluster analysis can be used to organize the smallest viable statistical movement elements (StaMEs), which are $\mu$ steps long, and to code the next level of movement elements, called ``words'', which are $m \mu$ steps long. Centroids of these word clusters are identified as canonical activity modes (CAMs). Unlike current segmentation schemes, our approach allows us to provide entropy measures for movement paths, compute the coding efficiencies of derived StaMEs and CAMs, and assess error rates in the allocation of strings of $m$ StaMEs to CAM types. In addition, it allows us to employ the Jensen-Shannon divergence measure to assess and compare the best choices for the various parameters (number of steps in a StaME, number of StaME types, number of StaMEs in a word, number of CAM types), as well as the best clustering methods for generating segments that can then be used to interpret and predict sequences of higher-order segments. The theory presented here provides another tool for dealing with the effects of global change on the movement and redistribution of animals across altered landscapes.
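To make the information-theoretic quantities named above concrete, the following is a minimal sketch of how Shannon entropy, a coding efficiency, and the Jensen-Shannon divergence could be computed for tracks that have already been coded as sequences of StaME (or CAM) labels. The abstract does not give the paper's exact definitions, so everything here is an assumption for illustration: the function names, the toy label sequences, and in particular the choice of efficiency as entropy divided by its maximum, $\log_2 K$ for $K$ types.

```python
import math
from collections import Counter


def distribution(symbols, alphabet):
    """Relative frequency of each label in `alphabet` within `symbols`."""
    counts = Counter(symbols)
    n = len(symbols)
    return [counts[a] / n for a in alphabet]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def coding_efficiency(p):
    """Assumed definition: entropy relative to the maximum log2(K) for K types."""
    k = len(p)
    return shannon_entropy(p) / math.log2(k) if k > 1 else 1.0


def jensen_shannon_divergence(p, q):
    """JSD in bits: H(M) - (H(P) + H(Q)) / 2, where M is the midpoint of P and Q."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# Hypothetical tracks coded with four StaME types labelled A-D.
alphabet = ["A", "B", "C", "D"]
track_1 = list("AABBCCDD")  # all four types used equally often
track_2 = list("AAAAAABC")  # dominated by a single type

p = distribution(track_1, alphabet)
q = distribution(track_2, alphabet)

print("entropy of track 1 (bits):", shannon_entropy(p))
print("entropy of track 2 (bits):", shannon_entropy(q))
print("coding efficiency of track 2:", coding_efficiency(q))
print("JSD between the two tracks (bits):", jensen_shannon_divergence(p, q))
```

With base-2 logarithms the divergence lies between 0 and 1 bit, which makes it convenient for comparing label distributions produced by different parameter choices or clustering methods.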