QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

Following the success of pre-trained language models (PLMs), continual pre-training of generic PLMs has become the paradigm for domain adaptation. In this paper, we propose QUERT, a Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search. QUERT is jointly trained on four pre-training tasks tailored to the characteristics of queries in travel domain search: Geography-aware Mask Prediction, Geohash Code Prediction, User Click Behavior Learning, and Phrase and Token Order Prediction. Performance improvements on downstream tasks and ablation experiments demonstrate the effectiveness of the proposed pre-training tasks. Specifically, the average performance on downstream tasks increases by 2.02% and 30.93% in the supervised and unsupervised settings, respectively. To assess QUERT's impact on online business, we deploy it and perform A/B testing on the Fliggy app. The results show that using QUERT as the encoder increases the Unique Click-Through Rate and Page Click-Through Rate by 0.89% and 1.03%, respectively. Our code and downstream task data will be released for future research.
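The abstract names Geohash Code Prediction as a pre-training task but does not describe how the codes are produced. As background only, the following is a minimal sketch of the standard public geohash encoding, which maps a latitude/longitude pair to a base-32 string by alternately bisecting the longitude and latitude ranges. The function name `geohash_encode` and its `precision` parameter are illustrative assumptions, not taken from the QUERT code, and the sketch says nothing about how QUERT builds its prediction targets.

```python
# Standard geohash base-32 alphabet (digits plus letters, omitting a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a coordinate as a geohash string of `precision` characters.

    Hypothetical helper for illustration; not part of QUERT.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates the current 5-bit group
    bit_count = 0     # number of bits collected in the current group
    use_lon = True    # bits alternate: longitude first, then latitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            # Every 5 bits select one character of the base-32 alphabet.
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)
```

A useful property of this encoding is that a shorter code is always a prefix of a longer code for the same point, so nearby locations tend to share leading characters. That makes the code a compact, text-like representation of location.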

