The lead marketing ecosystem enables the collection, sale, and use of personal data submitted via web forms to deliver personalized quotes in high-value verticals such as insurance. Despite its scale and the sensitivity of the data it collects, this ecosystem remains largely unexplored by the research community. We present the first empirical study of privacy and spam risks in lead marketing, developing an end-to-end measurement framework that traces data flows from data collection to consumer contact. Our setup instruments over 100 health-related lead-generation websites and monitors 200 controlled phone numbers and email addresses to understand downstream marketing practices. We observe that these lead-generation websites share highly personal and sensitive health information with more than 70 distinct third parties. By purchasing our own and other organic leads from three major lead platforms, we uncover deceptive brokerage practices, where consumer data is sold to unvetted buyers and often augmented or fabricated with attributes such as health status and weight. In total, we received over 8,000 telemarketing phone calls, 600 text messages, and 200 emails, with calls often beginning within seconds of form submission. Many campaigns relied on VoIP-based neighbor spoofing and high-frequency dialing, at times rendering phones unusable. Our experiments with phone and email opt-outs suggest that phone-based opt-outs help the most, although no opt-out method completely stopped marketing communications. Analysis of 7,432 Better Business Bureau (BBB) complaints and reviews corroborates these findings from the consumer perspective. Overall, our results reveal a highly interconnected and non-compliant lead marketing ecosystem that aggressively monetizes sensitive consumer data.