In transportation, weigh-in-motion (WIM) stations, Electronic Toll Collection (ETC) systems, and closed-circuit television (CCTV) cameras are widely deployed to collect data at different locations. Vehicle re-identification, which matches the same vehicle across locations, helps in understanding long-distance journey patterns. In this paper, the potential hazards of ignoring survivorship bias effects are first identified and analyzed using a truncated distribution over a 2-dimensional time-time domain. With journey time modeled as an Exponential or Weibull distribution, Maximum Likelihood Estimation (MLE), Fisher Information (F.I.), and Bootstrap methods are formulated to estimate the parameters of interest and their confidence intervals. Besides formulating journey time distributions, an automated framework for querying the observable time-time scope is proposed. For complex distributions (e.g., the three-parameter Weibull), the distributions are modeled in PyTorch so that first and second derivatives, and hence the estimates, are obtained automatically. Three experiments are designed to demonstrate the effectiveness of the proposed method. In conclusion, the paper describes a distinctive aspect of understanding and analyzing traffic status. Although survivorship bias effects have long gone unrecognized and ignored, by accurately describing travel time over the time-time domain, the proposed approach has potential in travel time reliability analysis, understanding logistics systems, modeling and predicting product lifespans, etc.
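To make the survivorship bias concrete, the following is a minimal sketch, not the paper's implementation. It assumes the simplest case: exponentially distributed journey times, a single fixed observation window, and a vehicle being matched only if it arrives before the window closes (right truncation at `window - departure`). The function names, the simulated data, and the use of bisection with closed-form derivatives (rather than PyTorch automatic differentiation) are all illustrative assumptions. The sketch estimates the rate by maximizing the truncated likelihood and builds a 95% interval from the observed Fisher information.

```python
import math
import random


def simulate(true_rate, window, n_departures, seed=0):
    """Simulated data (assumption): keep a journey only if it ends inside the window."""
    rng = random.Random(seed)
    times, bounds = [], []
    for _ in range(n_departures):
        departure = rng.uniform(0.0, window)
        journey = rng.expovariate(true_rate)
        if departure + journey <= window:
            times.append(journey)
            bounds.append(window - departure)  # longest journey still observable
    return times, bounds


def score(rate, total_time, bounds):
    """First derivative of the right-truncated exponential log-likelihood."""
    s = len(bounds) / rate - total_time
    for b in bounds:
        x = rate * b
        # d/d(rate) of log(1 - exp(-rate * b)), written to avoid overflow
        s -= b * math.exp(-x) / (-math.expm1(-x))
    return s


def observed_information(rate, bounds):
    """Negative second derivative of the log-likelihood at `rate`."""
    info = 0.0
    for b in bounds:
        x = rate * b
        info += 1.0 / rate**2 - b * b * math.exp(-x) / math.expm1(-x) ** 2
    return info


def fit_truncated_exponential(times, bounds, iterations=60):
    """Bisection on the score, which is decreasing in the rate."""
    if sum(b / 2.0 - t for t, b in zip(times, bounds)) <= 0:
        raise ValueError("likelihood has no interior maximum for these data")
    total_time = sum(times)
    lo, hi = 1e-8, len(times) / total_time  # the naive rate is an upper bound
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid, total_time, bounds) > 0:
            lo = mid
        else:
            hi = mid
    rate = 0.5 * (lo + hi)
    se = 1.0 / math.sqrt(observed_information(rate, bounds))
    return rate, (rate - 1.96 * se, rate + 1.96 * se)


true_rate = 0.5  # mean journey time of 2 hours (illustrative value)
times, bounds = simulate(true_rate, window=6.0, n_departures=20000)
naive_mean = sum(times) / len(times)
mle_rate, ci = fit_truncated_exponential(times, bounds)
print("naive mean journey time:", naive_mean)
print("truncation-aware mean:  ", 1.0 / mle_rate)
print("95% CI for the rate:    ", ci)
```

Under these assumptions the naive sample mean is biased downward, because long journeys that started late in the window are never matched, whereas the truncation-aware estimate accounts for the observable range of each journey.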