Large-scale video repositories are increasingly available for modern video understanding and generation tasks. However, transforming raw videos into high-quality, task-specific datasets remains costly and inefficient. We present DataCube, an intelligent platform for automatic video processing, multi-dimensional profiling, and query-driven retrieval. DataCube constructs structured semantic representations of video clips and supports hybrid retrieval with neural re-ranking and deep semantic matching. Through an interactive web interface, users can efficiently construct customized video subsets from massive repositories for training, analysis, and evaluation, and build searchable systems over their own private video collections. The system is publicly accessible at https://datacube.baai.ac.cn/. Demo Video: https://baai-data-cube.ks3-cn-beijing.ksyuncs.com/custom/Adobe%20Express%20-%202%E6%9C%8818%E6%97%A5%20%281%29%281%29%20%281%29.mp4