FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation - 专知论文

会员服务 ·

0

估计/估计量 · INFORMS · 标注 · HTTPS · Performance ·

2023 年 5 月 5 日

FlowText: Synthesizing Realistic Scene Text Video with Optical Flow Estimation

翻译：FlowText：利用光流估计合成逼真场景文字视频

Yuzhong Zhao,Weijia Wu,Zhuang Li,Jiahong Li,Weiqiang Wang

Current video text spotting methods can achieve preferable performance, powered with sufficient labeled training data. However, labeling data manually is time-consuming and labor-intensive. To overcome this, using low-cost synthetic data is a promising alternative. This paper introduces a novel video text synthesis technique called FlowText, which utilizes optical flow estimation to synthesize a large amount of text video data at a low cost for training robust video text spotters. Unlike existing methods that focus on image-level synthesis, FlowText concentrates on synthesizing temporal information of text instances across consecutive frames using optical flow. This temporal information is crucial for accurately tracking and spotting text in video sequences, including text movement, distortion, appearance, disappearance, shelter, and blur. Experiments show that combining general detectors like TransDETR with the proposed FlowText produces remarkable results on various datasets, such as ICDAR2015video and ICDAR2013video. Code is available at https://github.com/callsys/FlowText.

翻译：当前的视频文字检测方法在充足标注训练数据的支持下已取得优异表现。然而，人工标注数据耗时耗力。为克服这一难题，利用低成本的合成数据成为一种有前景的替代方案。本文提出一种新颖的视频文字合成技术FlowText，它通过光流估计以低成本合成大量文字视频数据，用于训练鲁棒的视频文字检测器。与现有侧重于图像级合成的方法不同，FlowText专注于利用光流合成连续帧间文字实例的时序信息。这种时序信息对于精准跟踪和检测视频序列中的文字至关重要，包括文字运动、形变、出现、消失、遮挡及模糊等动态特性。实验表明，将TransDETR等通用检测器与所提出的FlowText相结合，在ICDAR2015video和ICDAR2013video等多个数据集上取得了显著效果。相关代码已开源至https://github.com/callsys/FlowText。

0

相关内容

估计/估计量

估计/估计量

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

20篇「ICCV2021 Oral」最新论文抢先看！看当下计算机视觉在研究什么？

20篇「ICCV2021 Oral」最新论文抢先看！看当下计算机视觉在研究什么？

专知会员服务

63+阅读 · 2021年7月30日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

CVPR 2020 论文开源项目合集

专知会员服务

111+阅读 · 2020年3月12日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

ICCV 2019 行为识别/视频理解论文汇总

ICCV 2019 行为识别/视频理解论文汇总

极市平台

15+阅读 · 2019年9月26日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

泡泡机器人SLAM

29+阅读 · 2018年10月28日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

多层时空并行 Schwarz 算法的研究

国家自然科学基金

3+阅读 · 2017年12月31日

长链非编码RNA-TUSC7在胃癌中的抑癌作用及机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

Vlasov-Poisson-Boltzmann方程研究

国家自然科学基金

0+阅读 · 2013年12月31日

大面积全场离面位移高分辨测试技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

白桦FT及SOC1基因的RNAi研究

国家自然科学基金

0+阅读 · 2009年12月31日

羧甲基壳聚糖对骨关节炎软骨退变的作用及分子机制

国家自然科学基金

0+阅读 · 2008年12月31日

新型高效一维复合发光材料的静电纺丝制备与发光性质

国家自然科学基金

0+阅读 · 2008年12月31日

Exploiting Multimodal Synthetic Data for Egocentric Human-Object Interaction Detection in an Industrial Scenario

Arxiv

0+阅读 · 2023年6月21日

Temporal Conditioning Spiking Latent Variable Models of the Neural Response to Natural Visual Scenes

Arxiv

0+阅读 · 2023年6月21日

Spatiotemporal Pyramidal CNN with Depth-Wise Separable Convolution for Eye Blinking Detection in the Wild

Arxiv

0+阅读 · 2023年6月20日

Real-Time Simultaneous Localization and Mapping with LiDAR intensity

Arxiv

0+阅读 · 2023年6月19日

EVOPOSE: A Recursive Transformer For 3D Human Pose Estimation With Kinematic Structure Priors

Arxiv

0+阅读 · 2023年6月16日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

Mobile Video Object Detection with Temporally-Aware Feature Maps

Arxiv

11+阅读 · 2018年3月28日

VIP会员

文章信息

相关主题

估计/估计量

最新内容

印度精确打击与指挥架构的断层

印度精确打击与指挥架构的断层

专知会员服务

4+阅读 · 7月20日

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

专知会员服务

5+阅读 · 7月20日

美空军AI完成F-16战斗机自主空战历史性试飞

美空军AI完成F-16战斗机自主空战历史性试飞

专知会员服务

5+阅读 · 7月20日

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

专知会员服务

5+阅读 · 7月20日

《美国陆军：通过弹性分布式模型库实现自适应AI优势》

《美国陆军：通过弹性分布式模型库实现自适应AI优势》

专知会员服务

3+阅读 · 7月20日

博士论文 | 理解与改进大语言模型推理：从反转诅咒到连续思维链

博士论文 | 理解与改进大语言模型推理：从反转诅咒到连续思维链

专知会员服务

5+阅读 · 7月20日

综述 | 终身视觉表征：持续自监督学习CSSL系统综述

综述 | 终身视觉表征：持续自监督学习CSSL系统综述

专知会员服务

5+阅读 · 7月20日

深入Project Maven：为何人工智能在战场上依然失灵

深入Project Maven：为何人工智能在战场上依然失灵

专知会员服务

14+阅读 · 7月19日

锻造未来士兵：外骨骼、基因工程与赛博格

锻造未来士兵：外骨骼、基因工程与赛博格

专知会员服务

7+阅读 · 7月19日

《无人机系统（UAS）通信网状网络试验性部署》50页报告

《无人机系统（UAS）通信网状网络试验性部署》50页报告

专知会员服务

7+阅读 · 7月19日

《无人机蜂群通信技术研究》50页

《无人机蜂群通信技术研究》50页

专知会员服务

8+阅读 · 7月19日

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

专知会员服务

12+阅读 · 7月18日

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

专知会员服务

8+阅读 · 7月18日

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

专知会员服务

13+阅读 · 7月18日

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

专知会员服务

10+阅读 · 7月18日

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

20篇「ICCV2021 Oral」最新论文抢先看！看当下计算机视觉在研究什么？

20篇「ICCV2021 Oral」最新论文抢先看！看当下计算机视觉在研究什么？

专知会员服务

63+阅读 · 2021年7月30日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

CVPR 2020 论文开源项目合集

专知会员服务

111+阅读 · 2020年3月12日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《NASA喷气推进实验室：高耐久轻质常驻空观测系统（HELIOS）》429页

《美政府问责局——武器系统年度评估（2026年）：强制要求成熟技术或可推动转向快速交付》249页

印度精确打击与指挥架构的断层

美空军AI完成F-16战斗机自主空战历史性试飞

相关资讯

ICCV 2019 行为识别/视频理解论文汇总

ICCV 2019 行为识别/视频理解论文汇总

极市平台

15+阅读 · 2019年9月26日

【泡泡汇总】CVPR2019 SLAM Paperlist

【泡泡汇总】CVPR2019 SLAM Paperlist

泡泡机器人SLAM

14+阅读 · 2019年6月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

【泡泡前沿追踪】跟踪SLAM前沿动态系列之IROS2018

泡泡机器人SLAM

29+阅读 · 2018年10月28日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

相关论文

Exploiting Multimodal Synthetic Data for Egocentric Human-Object Interaction Detection in an Industrial Scenario

Arxiv

0+阅读 · 2023年6月21日

Temporal Conditioning Spiking Latent Variable Models of the Neural Response to Natural Visual Scenes

Arxiv

0+阅读 · 2023年6月21日

Spatiotemporal Pyramidal CNN with Depth-Wise Separable Convolution for Eye Blinking Detection in the Wild

Arxiv

0+阅读 · 2023年6月20日

Real-Time Simultaneous Localization and Mapping with LiDAR intensity

Arxiv

0+阅读 · 2023年6月19日

EVOPOSE: A Recursive Transformer For 3D Human Pose Estimation With Kinematic Structure Priors

Arxiv

0+阅读 · 2023年6月16日

Contrastive Spatio-Temporal Pretext Learning for Self-supervised Video Representation

Arxiv

11+阅读 · 2021年12月16日

Deep Learning-Based Human Pose Estimation: A Survey

Arxiv

27+阅读 · 2020年12月24日

Evolving Losses for Unsupervised Video Representation Learning

Arxiv

23+阅读 · 2020年2月26日

Scene Text Detection and Recognition: The Deep Learning Era

Scene Text Detection and Recognition: The Deep Learning Era

Arxiv

27+阅读 · 2019年9月5日

Mobile Video Object Detection with Temporally-Aware Feature Maps

Arxiv

11+阅读 · 2018年3月28日

相关基金

多层时空并行 Schwarz 算法的研究

国家自然科学基金

3+阅读 · 2017年12月31日

长链非编码RNA-TUSC7在胃癌中的抑癌作用及机制研究

国家自然科学基金

1+阅读 · 2014年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

Vlasov-Poisson-Boltzmann方程研究

国家自然科学基金

0+阅读 · 2013年12月31日

大面积全场离面位移高分辨测试技术研究

国家自然科学基金

0+阅读 · 2012年12月31日

基于Decorin基因甲基化调控的非小细胞肺癌转移的分子机制

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

白桦FT及SOC1基因的RNAi研究

国家自然科学基金

0+阅读 · 2009年12月31日

羧甲基壳聚糖对骨关节炎软骨退变的作用及分子机制

国家自然科学基金

0+阅读 · 2008年12月31日

新型高效一维复合发光材料的静电纺丝制备与发光性质

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员