Now You See Me: Robust approach to Partial Occlusions

Object occlusion is one of the fundamental problems in computer vision. While Convolutional Neural Networks (CNNs) provide various state-of-the-art approaches for regular image classification, they prove to be less effective for classifying images with partial occlusions. Partial occlusion is the scenario in which an object is partially hidden by some other object or space. Solving this problem holds tremendous potential across many applications. We are particularly interested in the autonomous driving scenario and the implications of partial occlusion for it. Autonomous vehicle research is one of the hot topics of this decade, and it presents ample situations in which a traffic sign, a person, or another object is partially occluded at different angles. Given its importance in settings that extend further to video analytics of traffic data, for example to handle crimes or anticipate the income levels of various groups, the problem can be exploited in many ways. In this paper, we introduce our own synthetically created dataset, built from the Stanford Car Dataset by adding occlusions of various sizes and natures. On this dataset, we conduct a comprehensive analysis using various state-of-the-art CNN models such as VGG-19, ResNet 50/101, GoogLeNet, and DenseNet 121. We further study in depth how varying the proportion and nature of occlusion affects the performance of these models, both when fine-tuning them and when training them from scratch on the dataset, and how they are likely to perform when trained in different scenarios, i.e., when trained with occluded versus unoccluded images, which model is more robust to partial occlusions, and so on.
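
To make the idea of adding occlusions of a chosen proportion concrete, the following is a minimal sketch of one possible way to occlude a synthetic image. It is not the procedure used to build the dataset described above, which the abstract does not specify. The function name `add_occlusion`, its parameters, the solid rectangular patch, and the choice to give the patch the same aspect ratio as the image are all assumptions made for illustration.

```python
import numpy as np


def add_occlusion(image, proportion, rng=None, fill=0):
    """Return a copy of `image` with one rectangle set to `fill`.

    Hypothetical helper: the rectangle covers roughly `proportion` of the
    image area and is placed at a uniformly random position.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    rng = np.random.default_rng() if rng is None else rng

    height, width = image.shape[:2]
    # Assumption: the patch keeps the image's aspect ratio, so each side
    # is scaled by sqrt(proportion) to reach the requested area.
    scale = np.sqrt(proportion)
    patch_h = int(round(height * scale))
    patch_w = int(round(width * scale))

    # The upper bound of rng.integers is exclusive, hence the +1.
    top = int(rng.integers(0, height - patch_h + 1))
    left = int(rng.integers(0, width - patch_w + 1))

    occluded = image.copy()
    occluded[top:top + patch_h, left:left + patch_w] = fill
    return occluded


if __name__ == "__main__":
    # A plain white 100x200 RGB image stands in for a real photograph.
    image = np.full((100, 200, 3), 255, dtype=np.uint8)
    occluded = add_occlusion(image, 0.25, rng=np.random.default_rng(0))
    covered = np.all(occluded == 0, axis=-1).mean()
    print(f"occluded fraction: {covered}")
```

Sweeping `proportion` over a range of values would yield image sets with increasing occlusion levels. Changing the "nature" of the occlusion could mean replacing the solid fill with a textured or object-shaped patch, which this sketch does not attempt.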
