Retire: Robust Expectile Regression in High Dimensions


Robustness · Heteroscedasticity · Estimation/Estimator · Oracle · Reducible

March 22, 2023


Rebeka Man, Kean Ming Tan, Zian Wang, Wen-Xin Zhou

High-dimensional data can often display heterogeneity due to heteroscedastic variance or inhomogeneous covariate effects. Penalized quantile and expectile regression methods offer useful tools to detect heteroscedasticity in high-dimensional data. The former is computationally challenging due to the non-smooth nature of the check loss, and the latter is sensitive to heavy-tailed error distributions. In this paper, we propose and study (penalized) robust expectile regression (retire), with a focus on iteratively reweighted $\ell_1$-penalization which reduces the estimation bias from $\ell_1$-penalization and leads to oracle properties. Theoretically, we establish the statistical properties of the retire estimator under two regimes: (i) low-dimensional regime in which $d \ll n$; (ii) high-dimensional regime in which $s\ll n\ll d$ with $s$ denoting the number of significant predictors. In the high-dimensional setting, we carefully characterize the solution path of the iteratively reweighted $\ell_1$-penalized retire estimation, adapted from the local linear approximation algorithm for folded-concave regularization. Under a mild minimum signal strength condition, we show that after as many as $\log(\log d)$ iterations the final iterate enjoys the oracle convergence rate. At each iteration, the weighted $\ell_1$-penalized convex program can be efficiently solved by a semismooth Newton coordinate descent algorithm. Numerical studies demonstrate the competitive performance of the proposed procedure compared with either non-robust or quantile regression based alternatives.
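The iteratively reweighted $\ell_1$ scheme described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the abstract: the robust expectile loss is taken to be an asymmetric Huber loss $|\tau - 1(u<0)|\,\ell_\gamma(u)$, the folded-concave penalty is taken to be SCAD with $a = 3.7$, the model has no intercept, and each weighted $\ell_1$ subproblem is solved by plain proximal gradient descent instead of the semismooth Newton coordinate descent algorithm the authors use. All function names and the default `gamma=1.345` are hypothetical choices made for illustration.

```python
import numpy as np

def retire_grad(u, tau=0.5, gamma=1.345):
    """Derivative of the assumed loss |tau - 1(u<0)| * huber_gamma(u)."""
    w = np.where(u < 0, 1.0 - tau, tau)       # asymmetric expectile weight
    return w * np.clip(u, -gamma, gamma)      # Huber score: linear, then capped

def scad_deriv(t, lam, a=3.7):
    """Derivative of the SCAD penalty at t >= 0, used as the next l1 weight."""
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def soft_threshold(z, thr):
    """Proximal operator of the (weighted) l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)

def weighted_l1_fit(X, y, weights, beta0, tau, gamma, n_steps=500):
    """Proximal gradient for mean loss(y - X beta) + sum_j weights_j |beta_j|."""
    n = X.shape[0]
    # The loss curvature is at most max(tau, 1 - tau), which bounds the
    # Lipschitz constant of the gradient and gives a safe step size.
    step = n / (max(tau, 1.0 - tau) * np.linalg.norm(X, 2) ** 2)
    beta = beta0.copy()
    for _ in range(n_steps):
        grad = -X.T @ retire_grad(y - X @ beta, tau, gamma) / n
        beta = soft_threshold(beta - step * grad, step * weights)
    return beta

def irw_retire(X, y, lam, tau=0.5, gamma=1.345, n_outer=3):
    """Iteratively reweighted l1-penalized fit (local linear approximation)."""
    d = X.shape[1]
    beta = np.zeros(d)
    weights = np.full(d, lam)                 # first pass: ordinary l1 penalty
    for _ in range(n_outer):
        beta = weighted_l1_fit(X, y, weights, beta, tau, gamma)
        # Large coefficients get weight 0 next round, which removes l1 bias.
        weights = scad_deriv(np.abs(beta), lam)
    return beta
```

Calling `irw_retire(X, y, lam=0.2, tau=0.5)` on an `n x d` design returns a length-`d` coefficient vector. After the first pass, coefficients larger than `a * lam` in magnitude are no longer penalized, which is the mechanism by which reweighting reduces the estimation bias of a single $\ell_1$ fit.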

