Revisiting Shadow Detection from a Vision-Language Perspective - 专知论文

会员服务 ·

0

相似度 · Vision · 稳健性 · MoDELS · 监督 ·

Revisiting Shadow Detection from a Vision-Language Perspective

翻译：暂无翻译

Yonghui Wang,Shaokai Liu,Wengang Zhou,Hao Feng,Houqiang Li

Shadow detection is commonly formulated as a vision-driven dense prediction problem, where models rely primarily on pixel-wise visual supervision to distinguish shadows from non-shadow regions. However, this formulation can become unreliable in visually ambiguous cases, where similar dark regions may correspond either to cast shadows or to intrinsically dark surfaces, making visual evidence alone insufficient for establishing a stable decision rule. In this work, we revisit shadow detection from a vision--language perspective and argue that robust prediction benefits from an explicit semantic reference beyond visual cues alone. We propose SVL, a Shadow Vision--Language framework that uses language as an explicit semantic reference to disambiguate shadows from visually similar dark regions. SVL aligns global image representations with shadow-related text embeddings through scene-level shadow ratio regression, and transfers this semantic guidance to dense prediction via global-to-local coupling and local patch-level constraints. Built on a frozen DINOv3 image encoder, SVL learns only lightweight projection and decoding modules, yielding a parameter-efficient design with less than $1\%$ trainable parameters. Extensive experiments on multiple shadow detection benchmarks, including dedicated hard-case evaluations, suggest strong overall performance and improved robustness under visually ambiguous conditions. Code is available at https://github.com/harrytea/SVL.

翻译：暂无翻译

0

相关内容

相似度

【CVPR 2022】深度安全多视图聚类:降低因视图增加而导致聚类性能下降的风险，Deep Safe Multi-view Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase

【CVPR 2022】深度安全多视图聚类:降低因视图增加而导致聚类性能下降的风险，Deep Safe Multi-view Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase

专知会员服务

10+阅读 · 2022年3月12日

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

专知会员服务

26+阅读 · 2020年9月29日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【SIGGRAPH 2020】人像阴影处理，Portrait Shadow Manipulation

【SIGGRAPH 2020】人像阴影处理，Portrait Shadow Manipulation

专知会员服务

29+阅读 · 2020年5月19日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

专知会员服务

49+阅读 · 2020年2月25日

【综述论文推荐】无人机计算机视觉：过去、现在与未来，Vision Meets Drones: Past, Present and Future

【综述论文推荐】无人机计算机视觉：过去、现在与未来，Vision Meets Drones: Past, Present and Future

专知会员服务

44+阅读 · 2020年1月20日

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

专知会员服务

28+阅读 · 2019年12月27日

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

专知会员服务

10+阅读 · 2019年10月30日

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

专知

12+阅读 · 2020年9月30日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知

26+阅读 · 2020年4月3日

learn to see in the dark-低照度图像增强算法

learn to see in the dark-低照度图像增强算法

计算机视觉life

16+阅读 · 2019年1月14日

【论文推荐】最新五篇视觉问答相关论文—深度学习评价、交互注意融合、VizWiz、引导注意力、

【论文推荐】最新五篇视觉问答相关论文—深度学习评价、交互注意融合、VizWiz、引导注意力、

专知

10+阅读 · 2018年6月8日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Relation Networks for Object Detection 论文笔记

Relation Networks for Object Detection 论文笔记

统计学习与视觉计算组

16+阅读 · 2018年4月18日

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

专知

19+阅读 · 2018年3月16日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

原创 | Attention Modeling for Targeted Sentiment

原创 | Attention Modeling for Targeted Sentiment

黑龙江大学自然语言处理实验室

25+阅读 · 2017年11月5日

图像认知中的遮挡影响分析及建模

国家自然科学基金

0+阅读 · 2017年12月31日

面向遮挡条件下的人脸识别方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于LED自适应照明优化的可见光通信网多域耦合传输技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

应用于自动驾驶车辆环境感知系统的去雾技术研究

国家自然科学基金

1+阅读 · 2015年12月31日

全景聚焦合成孔径成像及其遮挡目标提取研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于生物视觉启发特征和遮挡模型的复杂道路环境目标检测方法研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于深度学习的高分辨率PolSAR影像暗目标判别

国家自然科学基金

3+阅读 · 2015年12月31日

基于声光光谱成像的反激光窃听告警系统关键技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

深度学习框架下基于情境线索的视觉注意研究

国家自然科学基金

2+阅读 · 2015年12月31日

场景深度关系下的视频遮挡目标检测

国家自然科学基金

1+阅读 · 2015年12月31日

Improving Reasoning in Vision-Language Models via Perception Verified Self-Training

Arxiv

0+阅读 · 6月20日

Do Activation Monitors Survive Model Updates? Benchmarking, Predicting, and Repairing Activation-Monitor Staleness

Arxiv

0+阅读 · 6月18日

Relighting as a Probe of Visual Priors via Augmented Latent Intrinsics

Arxiv

0+阅读 · 6月18日

See-and-Reach: Precise Vision-Language Navigation for UAVs within the Field of View

Arxiv

0+阅读 · 6月18日

Occ-VLM: Occupancy Grounded Vision Language Model for Indoor Scene Understanding

Arxiv

0+阅读 · 6月18日

SAFE-Cascade: Cost-Adaptive Vision-Language Routing for Chart Question Answering

Arxiv

0+阅读 · 6月17日

Language-Instructed Vision Embeddings for Controllable and Generalizable Perception

Arxiv

0+阅读 · 6月17日

From Bounding Boxes to Visual Reasoning: An On-Policy Data Annotation Tool for Vision-Language Models

Arxiv

0+阅读 · 6月17日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Good Features to Correlate for Visual Tracking

Arxiv

10+阅读 · 2018年3月10日

VIP会员

文章信息

相关主题

最新内容

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

2+阅读 · 今天11:43

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

2+阅读 · 今天11:41

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

5+阅读 · 今天6:30

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

5+阅读 · 今天6:18

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

6+阅读 · 今天6:08

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

6+阅读 · 今天5:54

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

7+阅读 · 今天5:22

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

7+阅读 · 今天5:15

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

7+阅读 · 今天3:42

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

5+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

7+阅读 · 6月24日

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

10+阅读 · 6月24日

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

9+阅读 · 6月24日

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

7+阅读 · 6月24日

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

9+阅读 · 6月24日

相关VIP内容

【CVPR 2022】深度安全多视图聚类:降低因视图增加而导致聚类性能下降的风险，Deep Safe Multi-view Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase

【CVPR 2022】深度安全多视图聚类:降低因视图增加而导致聚类性能下降的风险，Deep Safe Multi-view Clustering: Reducing the Risk of Clustering Performance Degradation Caused by View Increase

专知会员服务

10+阅读 · 2022年3月12日

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

专知会员服务

26+阅读 · 2020年9月29日

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

【文献综述】Text Detection and Recognition in the Wild: A Review 自然文本检测与识别

专知会员服务

46+阅读 · 2020年6月11日

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

语言视觉预训练语言模型揭密，Behind the Scene: Revealing the Secrets of Pre-trained Vision-and-Language Models

专知会员服务

36+阅读 · 2020年5月20日

【SIGGRAPH 2020】人像阴影处理，Portrait Shadow Manipulation

【SIGGRAPH 2020】人像阴影处理，Portrait Shadow Manipulation

专知会员服务

29+阅读 · 2020年5月19日

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

【CVPR2020-Uber】物理上可实现的对抗性的例子，用于激光雷达的目标检测，Physically Realizable Adversarial Examples for LiDAR Object Detection

专知会员服务

22+阅读 · 2020年4月16日

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

【CVPR2020】强化特征点，Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

专知会员服务

49+阅读 · 2020年2月25日

【综述论文推荐】无人机计算机视觉：过去、现在与未来，Vision Meets Drones: Past, Present and Future

【综述论文推荐】无人机计算机视觉：过去、现在与未来，Vision Meets Drones: Past, Present and Future

专知会员服务

44+阅读 · 2020年1月20日

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

【论文推荐】不同图像域弱监督语义分割的综合分析，A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains

专知会员服务

28+阅读 · 2019年12月27日

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

Understanding Color and the In-Camera Image Processing Pipeline for Computer Vision 【Michael S. Brown IEEE】韩国 ICCV 2019

专知会员服务

10+阅读 · 2019年10月30日

热门VIP内容

开通专知VIP会员享更多权益服务

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

网状网络及其在军事领域的运用

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

相关资讯

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

[CVPR 2020]BEDSR-Net：单张文档图像的阴影去除深度网络

专知

12+阅读 · 2020年9月30日

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

【CVPR2020-台大】透视眼：学会透过障碍物看东西，Learning to See Through Obstructions

专知

26+阅读 · 2020年4月3日

learn to see in the dark-低照度图像增强算法

learn to see in the dark-低照度图像增强算法

计算机视觉life

16+阅读 · 2019年1月14日

【论文推荐】最新五篇视觉问答相关论文—深度学习评价、交互注意融合、VizWiz、引导注意力、

【论文推荐】最新五篇视觉问答相关论文—深度学习评价、交互注意融合、VizWiz、引导注意力、

专知

10+阅读 · 2018年6月8日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

Relation Networks for Object Detection 论文笔记

Relation Networks for Object Detection 论文笔记

统计学习与视觉计算组

16+阅读 · 2018年4月18日

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

【论文推荐】最新6篇目标检测相关论文—场景文本检测、显著对象、语义知识转移、混合监督目标检测、域自适应、车牌识别

专知

19+阅读 · 2018年3月16日

Focal Loss for Dense Object Detection

Focal Loss for Dense Object Detection

统计学习与视觉计算组

12+阅读 · 2018年3月15日

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

【论文推荐】最新5篇图像描述生成（Image Caption）相关论文—情感、注意力机制、遥感图像、序列到序列、深度神经结构

专知

66+阅读 · 2018年1月31日

原创 | Attention Modeling for Targeted Sentiment

原创 | Attention Modeling for Targeted Sentiment

黑龙江大学自然语言处理实验室

25+阅读 · 2017年11月5日

相关论文

Improving Reasoning in Vision-Language Models via Perception Verified Self-Training

Arxiv

0+阅读 · 6月20日

Do Activation Monitors Survive Model Updates? Benchmarking, Predicting, and Repairing Activation-Monitor Staleness

Arxiv

0+阅读 · 6月18日

Relighting as a Probe of Visual Priors via Augmented Latent Intrinsics

Arxiv

0+阅读 · 6月18日

See-and-Reach: Precise Vision-Language Navigation for UAVs within the Field of View

Arxiv

0+阅读 · 6月18日

Occ-VLM: Occupancy Grounded Vision Language Model for Indoor Scene Understanding

Arxiv

0+阅读 · 6月18日

SAFE-Cascade: Cost-Adaptive Vision-Language Routing for Chart Question Answering

Arxiv

0+阅读 · 6月17日

Language-Instructed Vision Embeddings for Controllable and Generalizable Perception

Arxiv

0+阅读 · 6月17日

From Bounding Boxes to Visual Reasoning: An On-Policy Data Annotation Tool for Vision-Language Models

Arxiv

0+阅读 · 6月17日

Text Detection and Recognition in the Wild: A Review

Arxiv

20+阅读 · 2020年6月8日

Good Features to Correlate for Visual Tracking

Arxiv

10+阅读 · 2018年3月10日

相关基金

图像认知中的遮挡影响分析及建模

国家自然科学基金

0+阅读 · 2017年12月31日

面向遮挡条件下的人脸识别方法研究

国家自然科学基金

2+阅读 · 2015年12月31日

基于LED自适应照明优化的可见光通信网多域耦合传输技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

应用于自动驾驶车辆环境感知系统的去雾技术研究

国家自然科学基金

1+阅读 · 2015年12月31日

全景聚焦合成孔径成像及其遮挡目标提取研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于生物视觉启发特征和遮挡模型的复杂道路环境目标检测方法研究

国家自然科学基金

4+阅读 · 2015年12月31日

基于深度学习的高分辨率PolSAR影像暗目标判别

国家自然科学基金

3+阅读 · 2015年12月31日

基于声光光谱成像的反激光窃听告警系统关键技术研究

国家自然科学基金

0+阅读 · 2015年12月31日

深度学习框架下基于情境线索的视觉注意研究

国家自然科学基金

2+阅读 · 2015年12月31日

场景深度关系下的视频遮挡目标检测

国家自然科学基金

1+阅读 · 2015年12月31日

微信扫码咨询专知VIP会员