Deep learning provides powerful methods to impute structured information from large-scale, unstructured text and image datasets. For example, economists might wish to detect the presence of economic activity in satellite images, or to measure the topics or entities mentioned in social media, the congressional record, or firm filings. This review introduces deep neural networks, covering methods such as classifiers, regression models, generative AI, and embedding models. Applications include classification, document digitization, record linkage, and data exploration in massive-scale text and image corpora. When suitable methods are used, deep learning models can be cheap to tune and can scale affordably to problems involving millions or billions of data points. The review is accompanied by a companion website, EconDL, with user-friendly demo notebooks, software resources, and a knowledge base that provides technical details and additional applications.