We propose a method that combines gamma-distributed random variables with log-Gaussian modeling to generate synthetic datasets suitable for training neural networks, addressing the scarcity of real observations in many applications. We apply the method to both Raman and coherent anti-Stokes Raman scattering (CARS) spectra, using experimental spectra to estimate the gamma process parameters. Parameter estimation is performed with Markov chain Monte Carlo methods, which yields a full Bayesian posterior distribution for the model that can be sampled to generate synthetic data. We additionally model the additive and multiplicative background functions for Raman and CARS with Gaussian processes. We train two Bayesian neural networks to estimate the parameters of the gamma process. These parameters can then be used to estimate the underlying Raman spectrum and, because they are the parameters of a probability distribution, to quantify the uncertainty of that estimate at the same time. We apply the trained Bayesian neural networks to experimental Raman spectra of phthalocyanine blue, aniline black, naphthol red, and red 264 pigments, and to experimental CARS spectra of adenosine phosphate, fructose, glucose, and sucrose. The results agree with deterministic point estimates of the underlying Raman and CARS spectral signatures.
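As a rough illustration of the data-generation idea described above, the following Python sketch draws one synthetic Raman-like spectrum. It is a minimal sketch under stated assumptions, not the model used in this work: the Lorentzian line shapes, the squared-exponential kernel, the fixed gamma shape parameter, all numeric ranges, and the function names (`lorentzian_spectrum`, `gp_sample`, `synthetic_raman`) are hypothetical choices made only for illustration. Only an additive background is shown, and the parameters are drawn from arbitrary ranges rather than from an MCMC posterior.

```python
import numpy as np


def lorentzian_spectrum(nu, centers, widths, amplitudes):
    """Sum of Lorentzian lines on the wavenumber grid nu (assumed line shape)."""
    nu = nu[:, None]  # shape (n_points, 1) so it broadcasts against the lines
    lines = amplitudes * widths**2 / ((nu - centers) ** 2 + widths**2)
    return lines.sum(axis=1)


def gp_sample(nu, length_scale, variance, rng):
    """One zero-mean Gaussian process draw with a squared-exponential kernel."""
    diff = nu[:, None] - nu[None, :]
    cov = variance * np.exp(-0.5 * (diff / length_scale) ** 2)
    cov += 1e-8 * np.eye(nu.size)  # small jitter for numerical stability
    return rng.multivariate_normal(np.zeros(nu.size), cov)


def synthetic_raman(nu, rng, shape=50.0):
    """Return (clean spectrum, gamma mean, noisy observation).

    All parameter ranges below are illustrative assumptions.
    """
    n_lines = rng.integers(1, 6)  # between 1 and 5 lines
    centers = rng.uniform(nu.min(), nu.max(), n_lines)
    widths = rng.uniform(5.0, 30.0, n_lines)
    amplitudes = rng.uniform(0.2, 1.0, n_lines)
    clean = lorentzian_spectrum(nu, centers, widths, amplitudes)

    # Log-Gaussian additive background: exponentiating a GP draw keeps it positive.
    background = np.exp(gp_sample(nu, length_scale=300.0, variance=0.25, rng=rng))
    mean = clean + background

    # Gamma with shape k and scale mean / k has expectation `mean`
    # and variance mean**2 / k, so larger k means less noise.
    observed = rng.gamma(shape, mean / shape)
    return clean, mean, observed


rng = np.random.default_rng(0)
nu = np.linspace(400.0, 1800.0, 200)  # wavenumber grid in cm^-1 (assumed range)
clean, mean, observed = synthetic_raman(nu, rng)
```

Calling `synthetic_raman` repeatedly would give (observation, clean spectrum) pairs of the kind a network could be trained on. In the approach described above, the generating parameters would instead be sampled from the posterior estimated from experimental spectra.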