Linear Regression with Unknown Truncation Beyond Gaussian Features

Alexandros Kouridakis, Anay Mehrotra, Alkis Kalavasis, Constantine Caramanis

In truncated linear regression, samples $(x,y)$ are observed only when the outcome $y$ falls inside a survival set $S^\star$, and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history in Statistics and Machine Learning, going back to the works of (Galton, 1897; Tobin, 1958) and continuing more recently in, e.g., (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024). Despite this long history, most prior work is limited to the special case where $S^\star$ is known exactly. The more practically relevant case, where $S^\star$ is unknown and must be learned from data, remains open: the only available algorithms require strong assumptions on the distribution of the feature vectors (e.g., Gaussianity) and, even then, run in $d^{\mathrm{poly}(1/\varepsilon)}$ time to achieve accuracy $\varepsilon$. In this work, we give the first algorithm for truncated linear regression with an unknown survival set that runs in $\mathrm{poly}(d/\varepsilon)$ time, requiring only that the feature vectors be sub-Gaussian. Our algorithm relies on a novel subroutine for efficiently learning unions of a bounded number of intervals from positive examples alone (without any negative examples) under a certain smoothness condition. This learning guarantee adds to the line of work on positive-only PAC learning and may be of independent interest.
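To make the sampling model concrete, the following is a minimal simulation sketch, not the paper's algorithm. It assumes a linear model $y = \langle w^\star, x \rangle + \text{noise}$ with standard Gaussian noise, features drawn uniformly from $[-1,1]^d$ (bounded, hence sub-Gaussian), and a survival set $S^\star$ given as a union of two intervals. The noise model, the feature distribution, the specific values of `w_star` and `S_star`, and the helper names `in_survival_set`, `sample_truncated`, and `least_squares` are all illustrative assumptions. The sketch only shows how truncated samples arise and compares ordinary least squares on truncated versus untruncated data.

```python
import numpy as np


def in_survival_set(y, intervals):
    """Boolean mask: True where y lies in the union of closed intervals."""
    mask = np.zeros(y.shape, dtype=bool)
    for lo, hi in intervals:
        mask |= (y >= lo) & (y <= hi)
    return mask


def sample_truncated(w_star, intervals, n, rng, noise_std=1.0):
    """Rejection-sample n pairs (x, y) with y = <w_star, x> + noise, kept only if y is in S*."""
    d = w_star.shape[0]
    X_kept, y_kept, count = [], [], 0
    while count < n:
        # Assumed feature distribution: uniform on [-1, 1]^d (bounded, so sub-Gaussian).
        X = rng.uniform(-1.0, 1.0, size=(4 * n, d))
        y = X @ w_star + noise_std * rng.standard_normal(4 * n)
        keep = in_survival_set(y, intervals)
        X_kept.append(X[keep])
        y_kept.append(y[keep])
        count += int(keep.sum())
    return np.concatenate(X_kept)[:n], np.concatenate(y_kept)[:n]


def least_squares(X, y):
    """Ordinary least squares without intercept; it ignores the truncation entirely."""
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_star = np.array([1.0, -2.0, 0.5])   # hypothetical regressor
    S_star = [(-0.5, 0.5), (1.5, 3.0)]    # hypothetical survival set
    n = 20000

    # Truncated data: only samples whose outcome lands in S* are seen.
    X_t, y_t = sample_truncated(w_star, S_star, n, rng)

    # Untruncated data from the same model, for comparison.
    X_f = rng.uniform(-1.0, 1.0, size=(n, 3))
    y_f = X_f @ w_star + rng.standard_normal(n)

    print("error on full data:     ", np.linalg.norm(least_squares(X_f, y_f) - w_star))
    print("error on truncated data:", np.linalg.norm(least_squares(X_t, y_t) - w_star))
```

Under these assumptions, the fit on truncated data should land noticeably further from `w_star` than the fit on the full data, because conditioning on $y \in S^\star$ correlates the noise with the features. This is the bias that truncation-aware estimators aim to remove.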

