Video Large Language Model Papers - 专知 (Zhuanzhi)

Video Large Language Models

Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

Arxiv

0+ reads · June 14

OmniMem: Perturbation-aware Memory Compression for Streaming Audio-Visual LLMs

Arxiv

0+ reads · May 26

STEAR: Layer-Aware Spatiotemporal Evidence Intervention for Hallucination Mitigation in Video Large Language Models

Arxiv

0+ reads · April 3

V-CAST: Video Curvature-Aware Spatio-Temporal Pruning for Efficient Video Large Language Models

Arxiv

0+ reads · March 29

A Benchmarking Methodology to Assess Open-Source Video Large Language Models in Automatic Captioning of News Videos

Arxiv

0+ reads · March 29

Tango: Taming Visual Signals for Efficient Video Large Language Models

Arxiv

0+ reads · April 10

Sparrow: Text-Anchored Window Attention with Visual-Semantic Glimpsing for Speculative Decoding in Video LLMs

Arxiv

0+ reads · February 17