SENSE: Self-Supervised Neural Embeddings for Spatial Ensembles

Analyzing and visualizing scientific ensemble datasets is challenging because of their high dimensionality and complexity. Dimensionality reduction techniques and autoencoders are powerful tools for feature extraction, but they often struggle with such high-dimensional data. This paper presents an enhanced autoencoder framework that adds a clustering loss, based on the soft silhouette score, and a contrastive loss to improve the visualization and interpretability of ensemble datasets. First, EfficientNetV2 is used to generate pseudo-labels for the unlabeled portions of the datasets. The reconstruction, clustering, and contrastive objectives are then optimized jointly, which encourages similar data points to group together in the latent space while keeping distinct clusters apart. UMAP is subsequently applied to the latent representation to produce 2D projections, which are evaluated using the silhouette score. Several types of autoencoders are compared on their ability to extract meaningful features. Experiments on two scientific ensemble datasets (channel structures in soil derived from Markov chain Monte Carlo, and droplet-on-film impact dynamics) show that models incorporating a clustering or contrastive loss marginally outperform the baseline approaches.
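The abstract evaluates the 2D projections with the silhouette score. As a rough illustration of that metric only (this is the standard hard-label silhouette coefficient, not the soft silhouette loss used for training, and not the authors' code), a minimal pure-Python sketch follows. The function name `silhouette_score`, the Euclidean distance, and the toy points are assumptions made for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (illustrative sketch).

    points: sequence of coordinate tuples, e.g. 2D projection coordinates.
    labels: one cluster label per point (hypothetical input format).
    """
    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy 2D "projection": two tight groups that are far apart.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])
```

Values close to 1 indicate compact, well-separated clusters, while negative values indicate points that sit closer to another cluster than to their own. In the toy data, the labelling that matches the two spatial groups scores high and the mismatched labelling scores below zero.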

