Recordings of the two-dimensional track of an animal on a landscape have progressed over the past three decades from hourly to second-by-second locations. Track segmentation methods for analyzing the behavioral information in such relocation data have lagged somewhat behind, with scales of analysis currently at the sub-hourly to minute level. A new approach is needed to bring segmentation analysis down to a second-by-second level. Here, such an approach is presented that rests heavily on concepts from Shannon's Information Theory. In this paper, we first briefly review and update concepts relating to movement path segmentation. We then discuss how cluster analysis can be used to organize the smallest viable statistical movement elements (StaMEs), which are $\mu$ steps long, and to code the next level of movement elements called ``words'' that are $m \mu$ steps long. Centroids of these word clusters are identified as canonical activity modes (CAMs). Unlike current segmentation schemes, the approach presented here allows us to provide entropy measures for movement paths, compute the coding efficiencies of derived StaMEs and CAMs, and assess error rates in the allocation of strings of $m$ StaMEs to CAM types. In addition, our approach allows us to employ the Jensen-Shannon divergence measure to assess and compare the best choices for the various parameters (number of steps in a StaME, number of StaME types, number of StaMEs in a word, number of CAM types), as well as the best clustering methods for generating segments that can then be used to interpret and predict sequences of higher order segments. The theory presented here provides another tool in our toolbox for dealing with the effects of global change on the movement and redistribution of animals across altered landscapes.
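As a rough illustration of the information-theoretic quantities named above, the following minimal Python sketch computes the Shannon entropy of a sequence of StaME labels, a coding efficiency, and the Jensen-Shannon divergence between two label distributions. It assumes the track has already been coded into StaME type labels by some clustering step that is not shown. The function names, the toy label sequences, and the choice to define coding efficiency as entropy divided by its maximum $\log_2 k$ for $k$ types are assumptions made for illustration and may differ from the definitions used in the paper.

```python
import math
from collections import Counter


def distribution(symbols, alphabet):
    """Relative frequency of each alphabet symbol in a sequence."""
    counts = Counter(symbols)
    n = len(symbols)
    return [counts[a] / n for a in alphabet]


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def coding_efficiency(p):
    """Assumed definition: entropy relative to the maximum log2(k)."""
    k = len(p)
    return shannon_entropy(p) / math.log2(k) if k > 1 else 1.0


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits (between 0 and 1)."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mix) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))


def words(symbols, m):
    """Split a label sequence into non-overlapping words of m labels."""
    return [tuple(symbols[i:i + m]) for i in range(0, len(symbols) - m + 1, m)]


# Hypothetical StaME label sequences for two track segments.
alphabet = ["A", "B", "C", "D"]
track_1 = list("AABACABDAABB")
track_2 = list("CDDCDCCDBDDC")

p = distribution(track_1, alphabet)
q = distribution(track_2, alphabet)
print("entropy (bits):", shannon_entropy(p))
print("coding efficiency:", coding_efficiency(p))
print("JS divergence (bits):", jensen_shannon(p, q))
print("words of m = 3 StaMEs:", words(track_1, 3))
```

With base-2 logarithms the divergence is bounded by 1 bit, which makes it convenient for comparing label distributions obtained under different parameter choices or clustering methods.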