Motion Expression guided Video Segmentation (MeViS) is an emerging task that poses many new challenges to the field of referring video object segmentation (RVOS). In this technical report, we investigate and validate the effectiveness of static-dominant data and frame sampling in this challenging setting. Our solution achieves a J&F score of 0.5447 in the competition phase and ranks 1st in the MeViS track of the PVUW Challenge. The code is available at: https://github.com/Tapall-AI/MeViS_Track_Solution_2024.