This paper addresses semi-supervised domain adaptation for time-series forecasting, a problem that is underexplored in the literature despite being common in practice. Existing methods for time-series domain adaptation mainly follow the paradigm designed for static data, which cannot handle the complex, domain-specific conditional dependencies arising from data offsets, time lags, and varying data distributions. To address these challenges, we analyze variational conditional dependencies in time-series data and find that causal structures are usually stable across domains, which leads us to propose the causal conditional shift assumption. Motivated by this assumption, we consider the causal generation process of time-series data and propose an end-to-end model for semi-supervised domain adaptation in time-series forecasting. Our method not only discovers the Granger-causal structures in cross-domain data but also solves the cross-domain forecasting problem with accurate and interpretable predictions. We further analyze the proposed method theoretically, showing that the generalization error on the target domain is bounded by the empirical risks and by the discrepancy between the causal structures of the different domains. Experimental results on both synthetic and real data demonstrate the effectiveness of our method for semi-supervised domain adaptation in time-series forecasting.
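To make the idea of a Granger-causal structure that stays stable across domains more concrete, the following is a minimal sketch and not the proposed model. It assumes a linear vector autoregression fitted by least squares and treats a large lagged coefficient as a proxy for a Granger-causal edge. The function names (`fit_var`, `granger_adjacency`, `simulate`), the toy data generator, and the threshold of 0.2 are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def fit_var(X, max_lag):
    """Least-squares fit of a linear VAR(max_lag) model with intercept.

    X has shape (T, d). Returns an array of shape (max_lag, d, d) where
    coef[k, i, j] is the weight of x_j at lag k + 1 in the equation for x_i.
    """
    T, d = X.shape
    # Row r of each block holds x_{t-k} for t = max_lag + r.
    lagged = [X[max_lag - k : T - k] for k in range(1, max_lag + 1)]
    design = np.hstack(lagged + [np.ones((T - max_lag, 1))])
    target = X[max_lag:]
    weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    # Drop the intercept row, then reorder axes from [lag, j, i] to [lag, i, j].
    return weights[:-1].reshape(max_lag, d, d).transpose(0, 2, 1)


def granger_adjacency(X, max_lag=2, threshold=0.2):
    """Boolean matrix A with A[i, j] = True if x_j helps predict x_i.

    Assumption: an edge is declared when any lagged coefficient exceeds
    `threshold` in magnitude. A statistical test would be more principled.
    """
    coef = fit_var(X, max_lag)
    return np.abs(coef).max(axis=0) > threshold


def simulate(strength, offset, seed, T=2000):
    """Toy 3-variable series with edges x0 -> x0, x0 -> x1, x1 -> x2.

    `strength` and `offset` differ per domain; the edge set does not.
    """
    rng = np.random.default_rng(seed)
    X = np.zeros((T, 3))
    for t in range(2, T):
        X[t, 0] = 0.5 * X[t - 1, 0] + 0.1 * rng.standard_normal()
        X[t, 1] = strength * X[t - 1, 0] + 0.1 * rng.standard_normal()
        X[t, 2] = strength * X[t - 2, 1] + 0.1 * rng.standard_normal()
    return X + offset


# Two hypothetical domains: different coefficients and offsets, same structure.
source_adj = granger_adjacency(simulate(strength=0.8, offset=0.0, seed=0))
target_adj = granger_adjacency(simulate(strength=0.5, offset=3.0, seed=1))
print(source_adj.astype(int))
print(target_adj.astype(int))
```

In this toy setting the two domains differ in their conditional distributions (edge strengths and a constant offset), yet the recovered adjacency matrices coincide, which is the kind of cross-domain stability the causal conditional shift assumption relies on. The paper's model learns such structures end to end; this sketch only illustrates the underlying notion with a linear stand-in.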