Table learning, which lies at the intersection of machine learning and modern database systems, has recently attracted growing attention. However, existing table learning frameworks typically require explicit data export and extensive feature engineering, creating a high barrier for database practitioners. We present TLSQL (Table Learning Structured Query Language), a system that enables table learning directly over relational databases via SQL-like declarative specifications. TLSQL is implemented as a lightweight Python library that translates these specifications into standard SQL queries and structured learning task descriptions. The generated SQL queries are executed natively by the database engine, while the task descriptions are consumed by downstream table learning frameworks. This design allows users to focus on modeling and analysis rather than low-level data preparation and pipeline orchestration. Experiments on real-world datasets demonstrate that TLSQL effectively lowers the barrier to integrating machine learning into database-centric workflows. Our code is available at https://github.com/rllm-project/tlsql/.
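To make the translation step concrete, the following is a minimal sketch of how one declarative statement could be split into a standard SQL query and a structured task description. It is not TLSQL's actual grammar or API: the `PREDICT ... FROM ... USING ... WHERE ...` statement shape, the `translate` function, the `Translation` class, and the `customers` table are all hypothetical names invented for illustration, and SQLite stands in for the relational database.

```python
import re
import sqlite3
from dataclasses import dataclass

# Hypothetical statement shape, invented for this sketch (NOT TLSQL's grammar):
#   PREDICT <target> FROM <table> USING <col>, <col>, ... [WHERE <condition>]
_PATTERN = re.compile(
    r"^\s*PREDICT\s+(?P<target>\w+)\s+FROM\s+(?P<table>\w+)"
    r"\s+USING\s+(?P<features>\w+(?:\s*,\s*\w+)*)"
    r"(?:\s+WHERE\s+(?P<where>.+?))?\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


@dataclass
class Translation:
    sql: str    # standard SQL, executed by the database engine
    task: dict  # structured description for a downstream learning framework


def translate(statement: str) -> Translation:
    """Split one hypothetical statement into a SQL query and a task description."""
    match = _PATTERN.match(statement)
    if match is None:
        raise ValueError(f"unsupported statement: {statement!r}")
    features = [col.strip() for col in match["features"].split(",")]
    target, table, where = match["target"], match["table"], match["where"]
    sql = f"SELECT {', '.join(features + [target])} FROM {table}"
    if where:
        # The condition is passed through verbatim, so this sketch is not
        # safe for untrusted input.
        sql += f" WHERE {where}"
    task = {"table": table, "target": target, "features": features}
    return Translation(sql=sql, task=task)


def demo():
    """Run the generated query against an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (age INTEGER, income REAL, churned INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(34, 52000.0, 0), (17, 0.0, 1), (58, 91000.0, 0)],
    )
    result = translate(
        "PREDICT churned FROM customers USING age, income WHERE age >= 18"
    )
    rows = conn.execute(result.sql).fetchall()  # filtering happens in the engine
    conn.close()
    return result, rows


if __name__ == "__main__":
    result, rows = demo()
    print(result.sql)
    print(result.task)
    print(rows)
```

In this sketch the row filtering is done by the database engine, and only the selected columns together with the task dictionary would be handed to a learning framework, which mirrors the separation between query execution and task description outlined above.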