Gaia Data Release 3 (DR3), published in June 2022, delivers a diverse set of astrometric, photometric, and spectroscopic measurements for more than a billion stars. The wealth and complexity of the data make traditional approaches to estimating stellar parameters for the full Gaia dataset nearly prohibitive. We have explored different supervised learning methods for extracting basic stellar parameters, as well as distances and line-of-sight extinctions, from spectro-photo-astrometric data (including the new Gaia XP spectra). For training we use an enhanced high-quality dataset compiled from Gaia DR3 and ground-based spectroscopic survey data covering the whole sky and all Galactic components. We show that even a simple neural-network architecture or tree-based algorithm, without the Gaia XP spectra, yields predictions that are competitive with Bayesian isochrone fitting down to faint magnitudes. In the near future we will present a new Gaia DR3 stellar-parameter catalogue obtained with XGBoost, currently the best-performing machine-learning algorithm for tabular data.
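As a rough illustration of the tree-based approach mentioned above, the following minimal sketch fits gradient-boosted regression stumps to a small synthetic table using only the Python standard library. It is not the pipeline used for the catalogue: the two input columns (a colour-like and a magnitude-like value), the temperature-like target relation, and all function names are hypothetical, and a real application would train XGBoost on Gaia DR3 columns with survey-based labels.

```python
import random


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive unique values.
        for lo, hi in zip(values, values[1:]):
            t = 0.5 * (lo + hi)
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            ml = sum(left) / len(left)
            mr = sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]


def fit_boosted(X, y, n_rounds=40, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, t, ml, mr = fit_stump(X, residuals)
        stumps.append((j, t, ml, mr))
        pred = [p + learning_rate * (ml if row[j] <= t else mr)
                for row, p in zip(X, pred)]
    return base, learning_rate, stumps


def predict(model, row):
    base, lr, stumps = model
    return base + sum(lr * (ml if row[j] <= t else mr) for j, t, ml, mr in stumps)


def rmse(predictions, targets):
    n = len(targets)
    return (sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n) ** 0.5


def make_table(rng, n):
    """Synthetic stand-in data; the relation below is invented for illustration."""
    X, y = [], []
    for _ in range(n):
        colour = rng.uniform(0.0, 2.0)      # hypothetical colour-like feature
        magnitude = rng.uniform(-1.0, 8.0)  # hypothetical feature, unused by the target
        X.append([colour, magnitude])
        y.append(8000.0 - 2000.0 * colour + rng.gauss(0.0, 50.0))
    return X, y


rng = random.Random(0)
X_train, y_train = make_table(rng, 100)
X_test, y_test = make_table(rng, 100)

model = fit_boosted(X_train, y_train)
model_rmse = rmse([predict(model, row) for row in X_test], y_test)

# Baseline: always predict the training mean.
train_mean = sum(y_train) / len(y_train)
baseline_rmse = rmse([train_mean] * len(y_test), y_test)

print(f"boosted stumps RMSE: {model_rmse:.1f}")
print(f"constant baseline RMSE: {baseline_rmse:.1f}")
```

Comparing the held-out error against a constant baseline is only a sanity check here; it says nothing about the accuracy achievable on real Gaia data.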