As video-sharing platforms proliferate across the internet, manually moderating their content for explicit material has become impractical. An automated pipeline that scans video data for explicit content is therefore needed. We propose a novel pipeline that uses multi-modal deep learning to first extract the explicit segments of an input video and then summarize their content as text, in order to determine age appropriateness and an age rating. Finally, we evaluate the pipeline's effectiveness using standard metrics.