Health insurance lead sites sell personal data within seconds of form submission

April 10, 2026

Table of Contents

Data leaves the site before the form is submitted
Buying consumer data requires no verification
Calls begin within minutes
Opt-outs reduce volume but do not stop contact

Lead generation websites that offer health insurance quotes collect sensitive personal data and sell it to multiple buyers within seconds of a user clicking submit. A study by researchers at UC Davis, Stanford University, and Maastricht University mapped this process across 105 health insurance lead generation sites and monitored what happened to the data over 60 days.

The researchers created 210 synthetic user profiles, each with a unique phone number and email address, and submitted forms across all 105 sites. They then tracked every inbound call, text, and email those profiles received.

Data leaves the site before the form is submitted

Third-party scripts embedded on most of the sites capture form field input in real time, keystroke by keystroke, using JavaScript event listeners. Two vendors appeared most frequently, and both received names, phone numbers, email addresses, and health condition data before users clicked the submit button. A user who abandons a partially completed form still has their data captured and transmitted to these vendors.
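The capture pattern can be modeled in a few lines. The sketch below is a Python simulation, not the vendors' browser code: `FormField`, `on_input`, and `VendorCollector` are hypothetical names, and "transmission" is reduced to appending to a list. It illustrates why abandoning the form does not help: the listener has already fired on every keystroke.

```python
from typing import Callable, List, Tuple


class FormField:
    """Hypothetical stand-in for a form input that notifies listeners on every change."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.value = ""
        self._listeners: List[Callable[[str, str], None]] = []

    def on_input(self, listener: Callable[[str, str], None]) -> None:
        # Comparable in spirit to registering an input event listener in a browser.
        self._listeners.append(listener)

    def type_text(self, text: str) -> None:
        # Each keystroke updates the value and immediately notifies every listener.
        for ch in text:
            self.value += ch
            for listener in self._listeners:
                listener(self.name, self.value)


class VendorCollector:
    """Hypothetical third party; a real script would send each snapshot over the network."""

    def __init__(self) -> None:
        self.received: List[Tuple[str, str]] = []

    def capture(self, field_name: str, current_value: str) -> None:
        self.received.append((field_name, current_value))


collector = VendorCollector()
phone = FormField("phone")
phone.on_input(collector.capture)

# The user types six digits, then closes the tab without submitting.
phone.type_text("530555")

print(len(collector.received), collector.received[-1])
```

No submit step exists anywhere in the simulation, yet the collector holds every partial value the user typed.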

A separate data leakage path stems from poor form design. Seventy percent of the sites appended submitted PII directly to page URLs. When tracking scripts from ad networks and analytics providers load on those pages, the full URL, PII included, can reach those third parties through the HTTP Referer header.
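To see what such a URL exposes, the following sketch scans a page URL's query string for values that look like email addresses or US phone numbers. The URL, the parameter names, and the two regular expressions are illustrative assumptions, not patterns taken from the study, and the patterns are deliberately simple.

```python
import re
from typing import Dict
from urllib.parse import parse_qsl, urlsplit

# Simplified patterns, chosen for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?1?[\s.-]?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$")


def pii_in_url(url: str) -> Dict[str, str]:
    """Return query parameters whose decoded values look like an email or phone number."""
    findings: Dict[str, str] = {}
    for key, value in parse_qsl(urlsplit(url).query):
        if EMAIL_RE.match(value):
            findings[key] = "email"
        elif PHONE_RE.match(value):
            findings[key] = "phone"
    return findings


# Hypothetical confirmation page URL with submitted form values appended.
page_url = (
    "https://quotes.example/thanks"
    "?first=Pat&email=pat%40example.com&phone=530-555-0123&zip=95616"
)

leaked = pii_in_url(page_url)
print(leaked)
```

Any third party that receives this URL in full, whether in a Referer header or by other means, receives the same fields the scan flags.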

Across the 105 sites, PII reached 73 distinct third parties.

Buying consumer data requires no verification

The researchers registered as buyers on three lead platforms representing distinct roles in the ecosystem: a direct lead generator, a lead aggregator and exchange, and a lead broker specializing in aged leads. None of the three required documentation confirming a legitimate business purpose, industry licensing, or intended use of the data, even when the records included medical conditions, pregnancy status, and prescription information.

The purchased records often contained fabricated or placeholder values. The direct lead generator sold height and weight fields for all leads despite never collecting those fields on its own form. Around 80% of its leads listed identical values of 65 inches and 175 pounds. The aged lead broker assigned the same height, weight, and marital status to all 200 leads purchased in the study.
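A buyer could screen for this kind of placeholder data by checking how much of a field is taken up by a single value. The sketch below is one simple way to do that; the function name and the sample heights are invented for illustration and are not the study's data or method.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def dominant_value_share(values: Sequence[Hashable]) -> Tuple[Hashable, float]:
    """Return the most common value in a field and the fraction of records that carry it."""
    if not values:
        raise ValueError("values must not be empty")
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


# Invented sample: ten purchased leads, heights in inches.
heights = [65, 65, 65, 65, 65, 65, 65, 65, 62, 70]

value, share = dominant_value_share(heights)
print(value, share)
```

A share near 1.0 on a field such as height or weight, which should vary across real people, suggests a default was filled in rather than collected.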

Researchers flagged this as a risk for downstream decision-making, since insurance underwriters may use such attributes to calculate premiums or risk scores.

One researcher used the direct lead generator’s zip-code targeting feature to purchase their own test profile’s data in real time for four dollars.

Calls begin within minutes

Across 105 synthetic profiles in the main study, researchers recorded 8,214 inbound calls from 1,240 distinct phone numbers. Seventy-eight percent of profiles received at least one call. Half of all first calls arrived within two minutes of form submission, and four-fifths arrived within 24 minutes.

More than 80% of calls originated from VoIP infrastructure. Traditional mobile carriers accounted for fewer than 16 calls each across the entire study period.

Caller ID analysis showed 59% of calls used an area code matching the recipient’s number, a technique known as neighbor spoofing that is designed to increase answer rates. Some profiles received calls from numbers matching the first six or eight digits of their own numbers.
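The matching described above comes down to comparing leading digits. The sketch below assumes ten-digit North American numbers and uses hypothetical helper names; it only flags the pattern and cannot by itself show that a caller ID was spoofed.

```python
import re


def digits10(number: str) -> str:
    """Strip formatting and an optional leading country code 1; expect ten digits."""
    d = re.sub(r"\D", "", number)
    if len(d) == 11 and d.startswith("1"):
        d = d[1:]
    if len(d) != 10:
        raise ValueError(f"not a ten-digit number: {number!r}")
    return d


def shared_prefix_len(caller: str, recipient: str) -> int:
    """Count how many leading digits the two numbers have in common."""
    n = 0
    for x, y in zip(digits10(caller), digits10(recipient)):
        if x != y:
            break
        n += 1
    return n


def classify(caller: str, recipient: str) -> str:
    n = shared_prefix_len(caller, recipient)
    if n >= 6:
        return "same area code and exchange"
    if n >= 3:
        return "same area code"
    return "no match"


recipient = "530-555-0123"  # fictional number
print(shared_prefix_len("+1 (530) 555-0147", recipient))
print(classify("530-201-0100", recipient))
print(classify("212-555-0100", recipient))
```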

One profile received 1,676 calls over 60 days, the highest volume recorded for any single profile across the study. Separately, one profile accumulated 251 minutes of ringing time over the 60-day window, enough to render a phone unusable during active calling periods. In Florida, 22% of caller-receiver pairs exceeded the state’s three-calls-per-24-hour limit for telemarketing.
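A per-pair limit like Florida's can be checked with a sliding window over each pair's call timestamps. The sketch below makes several assumptions: a "pair" is one caller number and one receiver number, the window is a rolling 24 hours, and a call exactly 24 hours after another falls outside the window. The call log is invented, and the study's own method may differ.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set, Tuple

Call = Tuple[str, str, datetime]  # (caller, receiver, timestamp)


def pairs_over_limit(
    calls: Iterable[Call],
    limit: int = 3,
    window: timedelta = timedelta(hours=24),
) -> Set[Tuple[str, str]]:
    """Return caller-receiver pairs with more than `limit` calls inside any rolling window."""
    by_pair: Dict[Tuple[str, str], List[datetime]] = defaultdict(list)
    for caller, receiver, ts in calls:
        by_pair[(caller, receiver)].append(ts)

    flagged: Set[Tuple[str, str]] = set()
    for pair, stamps in by_pair.items():
        stamps.sort()
        # If call i and call i+limit are less than one window apart,
        # then limit+1 calls landed inside a single window.
        for i in range(len(stamps) - limit):
            if stamps[i + limit] - stamps[i] < window:
                flagged.add(pair)
                break
    return flagged


t0 = datetime(2025, 1, 1, 9, 0)
h = timedelta(hours=1)
call_log = [
    ("caller-A", "profile-1", t0),
    ("caller-A", "profile-1", t0 + 2 * h),
    ("caller-A", "profile-1", t0 + 5 * h),
    ("caller-A", "profile-1", t0 + 23 * h),  # fourth call within 24 hours
    ("caller-B", "profile-1", t0),
    ("caller-B", "profile-1", t0 + 1 * h),
    ("caller-B", "profile-1", t0 + 2 * h),
    ("caller-B", "profile-1", t0 + 30 * h),  # fourth call falls outside the window
]

over = pairs_over_limit(call_log)
print(over)
```

Keying on the caller's number undercounts when one operation rotates through many numbers, so a figure computed this way is better read as a lower bound.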

Only 14% of SMS messages sent to study profiles included opt-out language, a requirement that takes effect under updated FCC rules in April 2026.

Opt-outs reduce volume but do not stop contact

Researchers divided profiles into three groups: phone-based opt-outs, email-based opt-outs, and a control group with no opt-out. Phone-based opt-outs produced a statistically significant decline in call volume over time. Email-based opt-outs and the control group showed no meaningful trend for calls, though email opt-outs did produce a modest reduction in SMS volume.

Email volume followed a similar pattern. It declined after opt-out requests, but the control group’s email volume also declined at a comparable rate, indicating the reduction tracks natural lead aging rather than compliance. Emails continued arriving more than 10 days after opt-out requests were submitted, which puts senders in violation of the CAN-SPAM Act’s 10-day cessation requirement.
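Checking for late email is a date comparison. The sketch below flags messages that arrive after a grace period following the opt-out request. It counts calendar days to mirror the 10-day figure above, which is a simplifying assumption: CAN-SPAM is usually described as allowing 10 business days, so `grace_days` is left as a parameter. The timestamps are invented.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def late_emails(
    opt_out_at: datetime,
    email_times: Iterable[datetime],
    grace_days: int = 10,
) -> List[datetime]:
    """Return emails received after the grace period that follows an opt-out request."""
    cutoff = opt_out_at + timedelta(days=grace_days)
    return [t for t in email_times if t > cutoff]


opt_out = datetime(2025, 3, 1, 12, 0)
inbox = [
    datetime(2025, 3, 5, 8, 0),    # inside the grace period
    datetime(2025, 3, 11, 12, 0),  # exactly at the cutoff, not counted as late
    datetime(2025, 3, 12, 9, 0),   # late
    datetime(2025, 3, 20, 9, 0),   # late
]

late = late_emails(opt_out, inbox)
print(len(late))
```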

Researchers attributed the persistence of contact to the structure of the lead marketplace itself. Once a lead is sold, it may be resold to additional buyers who have no visibility into opt-out signals generated upstream. Callers rotated through new numbers after blocks were applied, making number-level blocking ineffective.

Analysis of 7,432 BBB complaints associated with the study sites corroborated the experimental findings. Twenty percent of consumer reports described receiving more than 14 calls per hour. Complaints frequently cited continued contact after DNC registration, repeated verbal removal requests, and non-functional unsubscribe links.