Train, Retrieve, or Both? A Four-Arm Head-to-Head for Correct Statutory Citation on the Ontario Residential Tenancies Act


Ali Asaria, Tony Salomone, Deep Gandhi

Self-represented tenants, landlords, and help-desk staff need to be pointed to the provision of law that actually governs a question, with a correct statutory citation. We study this task on the Ontario Residential Tenancies Act, 2006 (RTA) and its core regulation, and ask the operator's question empirically: is fine-tuning enough, or is hybrid retrieval needed? We run a four-arm head-to-head on Qwen2.5-7B-Instruct (base zero-shot, LoRA SFT-only, RAG-only, and an SFT+RAG hybrid), scored on citation exact-match (section and subsection) over a small real eval set whose human verification is still pending. The base model cannot cite the RTA, and SFT-only mis-recalls sections. Retrieval is essential and drives hallucination to zero by construction, and the SFT+RAG hybrid scores highest at 0.481 exact-match with zero hallucinated citations. Its edge comes from SFT making provision selection more robust to the higher-recall candidate sets that hurt zero-shot RAG. Notably, this cheap bge-small hybrid matches or beats a pipeline built on bigger, specialized retrieval models (a larger embedder and a cross-encoder reranker), and a larger/improved training set does not help either: strong statutory-citation performance here does not require specialized retrieval models or more data. The artifact zeroes hallucination and clears the lift-over-base bar but does not reach the aspirational 0.70 exact-match target. Given the small, not-yet-verified eval set, all results are reported as preliminary.
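To make the scoring concrete, here is a minimal Python sketch of how citation exact-match (section plus subsection) and a hallucination rate could be computed. It is an illustration under stated assumptions, not the authors' evaluation code: the accepted citation forms, the `normalize` and `score` helpers, and the rule that a citation counts as hallucinated when it parses but does not appear in a known set of provisions are all hypothetical choices made for this example.

```python
import re
from typing import Dict, Optional, Sequence, Set, Tuple

# (section, subsection), e.g. ("48", "1"); subsection may be absent.
Provision = Tuple[str, Optional[str]]

# Assumed surface forms: "s. 48(1)", "Section 48 (1)", "48(1)", "48".
_CITATION = re.compile(
    r"(?:s\.|section)?\s*(\d+(?:\.\d+)?)\s*(?:\(\s*(\d+(?:\.\d+)?)\s*\))?",
    re.IGNORECASE,
)


def normalize(citation: str) -> Optional[Provision]:
    """Parse a bare citation string; return None if it does not parse."""
    match = _CITATION.fullmatch(citation.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


def score(
    predictions: Sequence[str],
    golds: Sequence[str],
    valid_provisions: Set[Provision],
) -> Dict[str, float]:
    """Exact-match and hallucination rate over paired predictions and golds."""
    if len(predictions) != len(golds) or not predictions:
        raise ValueError("need equally sized, non-empty prediction and gold lists")
    preds = [normalize(p) for p in predictions]
    refs = [normalize(g) for g in golds]
    # Exact match: section AND subsection must both agree.
    hits = sum(p is not None and p == g for p, g in zip(preds, refs))
    # Assumption: a parsed citation outside the known provisions is hallucinated.
    hallucinated = sum(p is not None and p not in valid_provisions for p in preds)
    n = len(preds)
    return {"exact_match": hits / n, "hallucination_rate": hallucinated / n}


# Toy data for illustration only; these are not results from the paper.
toy_valid = {("48", "1"), ("37", "1"), ("37", "2"), ("59", "1"), ("83", "1")}
toy_preds = ["s. 48(1)", "Section 37 (2)", "s. 999(1)", "s. 83(1)"]
toy_golds = ["48(1)", "s. 37(1)", "s. 59(1)", "s. 83(1)"]
print(score(toy_preds, toy_golds, toy_valid))
```

In this toy run the second prediction names the right section but the wrong subsection, so it counts as a miss under exact-match, while the third cites a provision outside the known set and counts as hallucinated.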
