Atmospheric retrievals (AR) of exoplanets typically rely on a combination of a Bayesian inference technique and a forward simulator to estimate atmospheric properties from an observed spectrum. A key component in simulating spectra is the pressure-temperature (PT) profile, which describes the thermal structure of the atmosphere. Current AR pipelines commonly use ad hoc fitting functions here that limit the retrieved PT profiles to simple approximations, but still use a relatively large number of parameters. In this work, we introduce a conceptually new, data-driven parameterization scheme for physically consistent PT profiles that does not require explicit assumptions about the functional form of the PT profiles and uses fewer parameters than existing methods. Our approach consists of a latent variable model (based on a neural network) that learns a distribution over functions (PT profiles). Each profile is represented by a low-dimensional vector that can be used to condition a decoder network that maps $P$ to $T$. When training and evaluating our method on two publicly available datasets of self-consistent PT profiles, we find that our method achieves, on average, better fit quality than existing baseline methods, despite using fewer parameters. In an AR based on existing literature, our model (using two parameters) produces a tighter, more accurate posterior for the PT profile than the five-parameter polynomial baseline, while also speeding up the retrieval by more than a factor of three. By providing parametric access to physically consistent PT profiles, and by reducing the number of parameters required to describe a PT profile (thereby reducing computational cost or freeing resources for additional parameters of interest), our method can help improve AR and thus our understanding of exoplanet atmospheres and their habitability.
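The conditioning idea described above can be illustrated with a minimal sketch. It assumes a small fully connected decoder that takes $\log_{10} P$ concatenated with the latent vector $z$ and returns $T$. The abstract does not specify the architecture, so the function names (`init_decoder`, `decode_profile`), the layer sizes, and the choice of `tanh` activations are all hypothetical. The weights are random and untrained, so the outputs are not physical temperatures.

```python
import numpy as np

def init_decoder(latent_dim=2, hidden=16, seed=0):
    """Random (untrained) weights for a hypothetical decoder MLP."""
    rng = np.random.default_rng(seed)
    # Input is one pressure value plus the latent vector; output is one temperature.
    sizes = [1 + latent_dim, hidden, hidden, 1]
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
        for m, n in zip(sizes[:-1], sizes[1:])
    ]

def decode_profile(params, z, log_p):
    """Evaluate T(log P | z) on a pressure grid for one latent vector z."""
    log_p = np.asarray(log_p, dtype=float).reshape(-1, 1)
    z = np.asarray(z, dtype=float)
    # The same z conditions the decoder at every pressure level.
    z_tiled = np.broadcast_to(z, (log_p.shape[0], z.shape[0]))
    h = np.concatenate([log_p, z_tiled], axis=1)
    for w, b in params[:-1]:
        h = np.tanh(h @ w + b)
    w, b = params[-1]
    return (h @ w + b).ravel()

# Assumed pressure grid in log10(bar); the range is illustrative only.
params = init_decoder()
log_p_grid = np.linspace(-6.0, 2.0, 50)
profile_a = decode_profile(params, [0.3, -1.2], log_p_grid)
profile_b = decode_profile(params, [-0.8, 0.5], log_p_grid)
```

In a retrieval, the two components of `z` would take the place of the parameters of a fitting function: the sampler proposes a `z`, the decoder returns the corresponding profile on the pressure grid, and that profile is passed to the forward simulator.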