Tabular data drives most real-world machine learning applications, yet building general-purpose models for it remains difficult. Mixed numeric and categorical fields, weak feature structure, and limited labeled data make scaling and generalization challenging. To address this, we introduce Orion-BiX, a tabular foundation model that combines biaxial attention with meta-learned in-context reasoning for few-shot tabular learning. Its encoder alternates standard, grouped, hierarchical, and relational attention, fusing their outputs through multi-CLS summarization to capture both local and global dependencies efficiently. A label-aware ICL head adapts on the fly and scales to large label spaces via hierarchical decision routing. Meta-trained on synthetically generated, structurally diverse tables with causal priors, Orion-BiX learns transferable inductive biases across heterogeneous data. Delivered as a scikit-learn compatible foundation model, it outperforms gradient-boosting baselines and remains competitive with state-of-the-art tabular foundation models on public benchmarks, showing that biaxial attention with episodic meta-training enables robust, few-shot-ready tabular learning. The model is publicly available at https://github.com/Lexsi-Labs/Orion-BiX.
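To make the biaxial idea concrete, the sketch below alternates self-attention across the sample axis and the feature axis of an embedded table. This is a minimal NumPy illustration of the general pattern only, not the Orion-BiX implementation; the shapes, the single-head attention, and the `biaxial_block` helper are all assumptions for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Single-head scaled dot-product self-attention over a (n, d) matrix.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ x

def biaxial_block(table):
    # One biaxial step on a (rows, cols, d) tensor of cell embeddings:
    # first attend across rows within each column (sample axis),
    # then across columns within each row (feature axis).
    rows, cols, d = table.shape
    table = np.stack([self_attention(table[:, j]) for j in range(cols)], axis=1)
    table = np.stack([self_attention(table[i]) for i in range(rows)], axis=0)
    return table

rng = np.random.default_rng(0)
table = rng.normal(size=(4, 3, 8))  # 4 samples, 3 features, 8-dim embeddings
out = biaxial_block(table)
print(out.shape)  # (4, 3, 8)
```

Alternating the two axes lets information flow both between samples (as in in-context learning) and between features, which is the dependency structure the abstract attributes to the biaxial encoder.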