Industrial recommendation systems typically span multiple scenarios, yet existing cross-domain recommendation (CDR) and multi-scenario recommendation (MSR) methods often require prohibitive resources and strict input alignment, limiting their extensibility. We propose MTFM (Meituan Foundation Model for Recommendation), a transformer-based framework that addresses these challenges. Rather than pre-aligning inputs, MTFM transforms cross-domain data into heterogeneous tokens, capturing multi-scenario knowledge in an alignment-free manner. To improve efficiency, we first introduce multi-scenario user-level sample aggregation, which substantially raises training throughput by reducing the total number of training instances. We then integrate Grouped-Query Attention and a customized Hybrid Target Attention to reduce memory usage and computational complexity, and apply system-level optimizations, such as kernel fusion and the elimination of CPU-GPU blocking, to further increase both training and inference throughput. Offline and online experiments validate the effectiveness of MTFM, demonstrating that significant performance gains are achieved by scaling both model capacity and multi-scenario training data.
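The abstract names Grouped-Query Attention (GQA) as one of the memory-saving components. As background only, here is a minimal NumPy sketch of standard GQA, in which several query heads share each key/value head so the KV cache shrinks by the grouping factor. This is not MTFM's actual implementation; the head counts and shapes are illustrative assumptions.

```python
import numpy as np

def grouped_query_attention(q, k, v, n_query_heads, n_kv_heads):
    """GQA sketch: q has n_query_heads heads, k/v have fewer
    (n_kv_heads) heads; each KV head serves a group of query heads.

    Shapes: q (n_query_heads, seq, d); k, v (n_kv_heads, seq, d).
    """
    group = n_query_heads // n_kv_heads
    # Broadcast each KV head to its group of query heads.
    k = np.repeat(k, group, axis=0)   # -> (n_query_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    # Scaled dot-product attention, per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                # (n_query_heads, seq, d)
```

With `n_kv_heads == n_query_heads` this reduces to standard multi-head attention; with `n_kv_heads == 1` it becomes multi-query attention. The memory saving comes from storing only `n_kv_heads` KV projections at inference time.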