In the ever-evolving landscape of social network advertising, the volume and accuracy of data play a critical role in the performance of predictive models. However, the development of robust predictive algorithms is often hampered by the limited size and potential bias present in real-world datasets. This study presents and explores a generative augmentation framework of social network advertising data. Our framework explores three generative models for data augmentation - Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Gaussian Mixture Models (GMMs) - to enrich data availability and diversity in the context of social network advertising analytics effectiveness. By performing synthetic extensions of the feature space, we find that through data augmentation, the performance of various classifiers has been quantitatively improved. Furthermore, we compare the relative performance gains brought by each data augmentation technique, providing insights for practitioners to select appropriate techniques to enhance model performance. This paper contributes to the literature by showing that synthetic data augmentation alleviates the limitations imposed by small or imbalanced datasets in the field of social network advertising. At the same time, this article also provides a comparative perspective on the practicality of different data augmentation methods, thereby guiding practitioners to choose appropriate techniques to enhance model performance.
翻译:在社交网络广告不断发展的背景下,数据的规模和准确性对预测模型的性能起着关键作用。然而,鲁棒预测算法的开发常因真实世界数据集的规模有限和潜在偏差而受到阻碍。本研究提出并探索了一种针对社交网络广告数据的生成式增强框架。该框架研究了三种用于数据增强的生成模型——生成对抗网络(GANs)、变分自编码器(VAEs)和高斯混合模型(GMMs)——以丰富社交网络广告分析有效性背景下的数据可用性与多样性。通过对特征空间进行合成扩展,我们发现数据增强在定量上提升了多种分类器的性能。此外,我们比较了每种数据增强技术带来的相对性能增益,为从业者选择合适技术以提升模型性能提供了见解。本文通过证明合成数据增强缓解了社交网络广告领域中小规模或不平衡数据集带来的限制,为文献做出了贡献。同时,本文还提供了不同数据增强方法实用性的比较视角,从而指导从业者选择恰当技术以增强模型性能。