User behavior history is one of the most important features for predicting click-through rate (CTR), owing to its strong semantic and temporal correlation with the target item. While the literature has examined each of these correlations individually, it has yet to analyze them in combination, that is, the quadruple correlation of (behavior semantics, target semantics, behavior temporal, target temporal). The effect of this correlation on performance, and the extent to which existing methods learn it, remain unknown. To address this gap, we empirically measure the quadruple correlation and observe intuitive yet robust quadruple patterns. We then measure the correlation learned by several representative user behavior methods; to our surprise, none of them learns such a pattern, especially the temporal one. In this paper, we propose the Temporal Interest Network (TIN) to capture the quadruple semantic and temporal correlation between behaviors and the target. We achieve this by incorporating target-aware temporal encoding, in addition to semantic embedding, to represent behaviors and the target. Furthermore, we deploy target-aware attention, along with target-aware representation, to explicitly conduct the 4-way interaction. We perform comprehensive evaluations on the Amazon and Alibaba datasets. TIN outperforms the best-performing baselines by 0.43% and 0.29% on the two datasets, respectively. Comprehensive analysis and visualization show that TIN is indeed capable of learning the quadruple correlation effectively, while all existing methods fail to do so. We provide our implementation of TIN in TensorFlow.
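The abstract names three ingredients without giving formulas: target-aware temporal encoding, target-aware attention, and target-aware representation. The sketch below is one plausible reading of how they could combine, not the released implementation. It assumes that the temporal encoding is a learned vector indexed by a behavior's position relative to the target and is added to the semantic embedding, that attention is a scaled dot product followed by softmax, and that the target-aware representation is an element-wise product. All function and parameter names (`temporal_interest`, `pos_emb`, and so on) are hypothetical, and plain Python is used instead of TensorFlow to keep the example self-contained.

```python
import math


def add(u, v):
    """Element-wise sum of two equal-length vectors."""
    return [a + b for a, b in zip(u, v)]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_interest(behavior_emb, target_emb, pos_emb):
    """Pool a behavior sequence into one interest vector (illustrative only).

    behavior_emb: semantic embeddings of past behaviors, oldest first.
    target_emb:   semantic embedding of the target item.
    pos_emb:      assumed temporal encoding table; pos_emb[0] is used for the
                  target and pos_emb[k] for the behavior k steps before it.
    """
    n = len(behavior_emb)
    d = len(target_emb)

    # Represent the target and each behavior as semantic + temporal (assumed additive).
    target = add(target_emb, pos_emb[0])
    behaviors = [add(e, pos_emb[n - i]) for i, e in enumerate(behavior_emb)]

    # Target-aware attention: scaled dot product, normalized with softmax.
    scores = [dot(b, target) / math.sqrt(d) for b in behaviors]
    weights = softmax(scores)

    # Target-aware representation: element-wise product of behavior and target.
    # Both factors carry semantic and temporal terms, hence a 4-way interaction.
    interest = [0.0] * d
    for w, b in zip(weights, behaviors):
        for j in range(d):
            interest[j] += w * b[j] * target[j]
    return interest, weights
```

With an all-zero `pos_emb` table the sketch reduces to purely semantic attention. A non-zero table lets the attention weights depend on how far back a behavior occurred, which is the temporal part of the correlation the abstract describes.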