89% of B2B demand gen teams evaluate lead vendors on CPL alone or CPL plus fill rate. Our 2026 research shows those metrics predict almost nothing about SQL conversion. Here are the four questions that actually matter.


There’s a meeting most demand gen leaders have had more than once. A lead generation vendor presents their CPL, their fill rate, their publisher reach, their targeting capabilities. The numbers look reasonable. The deck is polished. The case studies are cherry-picked but plausible. You sign.

Three months later, your SDR team is working the leads and converting 8% of them to SQLs. The vendor’s dashboard shows a 94% delivery rate and a CPL that’s within budget. By every metric the contract was written around, the program is performing. By the only metric that actually matters to your CRO, pipeline, it isn’t.

This is the vendor evaluation trap. And according to our 2026 B2B Pipeline Trust Report — drawn from 500+ B2B marketing and sales leaders — 89% of organizations evaluate lead generation vendors on CPL alone or CPL plus fill rate. Only 28% include SQL conversion rate in their vendor evaluation criteria.

The consequence is predictable. Organizations that evaluate vendors on downstream conversion metrics see 4.3x higher SQL rates than organizations that evaluate on CPL and fill rate alone. The scorecard you use to select your vendor determines the program you get. Most scorecards are built around the wrong metrics.

This article gives you the four questions that restructure the evaluation entirely — and makes explicit what a good answer looks like versus what a bad answer sounds like.


Why CPL Is the Wrong Primary Metric

CPL measures what a vendor delivers at the moment of handoff. It tells you nothing about what happens to those leads after they arrive in your CRM.

This matters because vendor incentives are shaped by whatever metric the contract is written around. A vendor evaluated and renewed on CPL has every incentive to optimize for volume at a price point — which means casting a wide net, applying minimal qualification, and delivering the maximum number of contacts that technically meet the targeting criteria. The fact that a large percentage of those contacts will never convert to SQL is invisible in a CPL-only evaluation framework.

A vendor evaluated on SQL conversion rate has the opposite incentive structure. Every lead they deliver needs to be genuinely worth an SDR’s time — because if it isn’t, the conversion data makes that visible and the renewal is at risk. This incentive alignment is why the 4.3x SQL rate difference exists. It’s not that SQL-evaluated vendors have access to better leads by nature. It’s that the accountability structure changes what they’re motivated to deliver.

The four questions that follow are designed to surface that incentive structure before you sign — not three months after.


Question 1: What Is Your SQL Conversion Rate for Clients in My Vertical and Company Size — and Can You Show Me the Data?

This is the question most vendors aren’t asked and aren’t prepared to answer with specificity. It’s also the most important one.

Every vendor has case studies. Case studies are selected to tell the best possible story. What you need is not a case study — it’s a benchmark. What does the average SQL conversion rate look like for clients similar to yours, in your vertical, at your company size, with your buyer profile?

What a good answer looks like: The vendor provides a specific number — not a range designed to be unfalsifiable — and can explain how it’s calculated. They define SQL conversion the same way your sales team does: a lead that was worked by an SDR and resulted in a qualified opportunity, not a lead that was “accepted” in the CRM. They differentiate performance by content type, lead qualification tier, and vertical. They’re willing to connect you with a reference client in your category.

What a bad answer sounds like: “Our clients see strong conversion rates across the board.” Vague ranges without vertical context. Conversion rate defined as MQL acceptance rather than SQL conversion. Redirection to case studies instead of benchmark data. Reluctance to provide reference clients in your specific vertical.

The inability to answer this question with specificity is itself diagnostic. A vendor who genuinely delivers pipeline knows what their pipeline looks like by vertical and tier. A vendor who delivers volume and lets you figure out the pipeline doesn’t track that data — because tracking it would make the performance gap visible.


Question 2: How Do You Qualify Leads Before Delivery — and at What Point Does Human Verification Occur?

Lead qualification is the part of the vendor’s process that most directly predicts downstream conversion. It’s also the part most vendors describe in the vaguest possible terms.

There are fundamentally two qualification models in B2B lead generation. The first is system-side qualification — the vendor’s platform applies targeting filters at the point of content distribution, and any contact who meets the firmographic criteria and downloads the content becomes a lead. Verification, if it happens at all, is automated: email validation, phone formatting, bot filtering. The human review step is either minimal or absent.

The second is human-verified qualification — a person reviews each lead record against the campaign criteria before it’s delivered. Phone numbers are called to confirm validity. Custom qualifying questions are reviewed for genuine intent. Contacts that don’t meet the criteria are replaced rather than delivered.

The downstream SQL conversion difference between these two models is significant. In our research, teams using human-verified lead delivery converted at 24–28% MQL-to-SQL versus 6–9% for system-side qualification only. The qualification step that happens before delivery is the single largest predictor of what happens after delivery.
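To see why this matters more than CPL, it helps to convert both models into cost per SQL. The sketch below is illustrative only: the two CPL figures are hypothetical prices we picked for the example (they are not from the report), and the conversion rates are simply the midpoints of the ranges quoted above. The function name `cost_per_sql` is our own.

```python
def cost_per_sql(cpl: float, mql_to_sql_rate: float) -> float:
    """Cost per SQL = cost per lead / fraction of leads that become SQLs."""
    if not 0 < mql_to_sql_rate <= 1:
        raise ValueError("mql_to_sql_rate must be in (0, 1]")
    return cpl / mql_to_sql_rate


# Hypothetical CPLs, chosen only for illustration (not from the report).
SYSTEM_SIDE_CPL = 40.0
HUMAN_VERIFIED_CPL = 90.0

# Midpoints of the MQL-to-SQL ranges quoted above: 6-9% and 24-28%.
SYSTEM_SIDE_RATE = 0.075
HUMAN_VERIFIED_RATE = 0.26

system_side = cost_per_sql(SYSTEM_SIDE_CPL, SYSTEM_SIDE_RATE)
human_verified = cost_per_sql(HUMAN_VERIFIED_CPL, HUMAN_VERIFIED_RATE)

print(f"System-side:    ${system_side:,.2f} per SQL")
print(f"Human-verified: ${human_verified:,.2f} per SQL")
```

Under these assumed prices, the lead that costs less than half as much at handoff ends up costing more per SQL. Swap in your own CPL quotes and your own measured conversion rates before drawing any conclusion about a specific vendor.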

What a good answer looks like: The vendor describes a specific human review process — not just automated validation. They can tell you at what point in the delivery workflow a human reviews each lead, what criteria that review applies, and what happens to leads that don’t pass. They have a named replacement policy: leads that don’t meet the agreed criteria are replaced before the invoice is issued, not credited after the fact.

What a bad answer sounds like: “We use advanced AI verification to ensure lead quality.” A detailed description of email validation and bot filtering with no mention of human review. A replacement policy that requires you to identify and dispute bad leads after delivery rather than preventing their delivery in the first place. The phrase “industry-standard verification” without specifics.

Automated verification and human verification are not the same thing. A vendor that conflates them either doesn’t understand the difference or is hoping you won’t ask.


Question 3: What’s Your Lead Replacement Policy — and How Is It Triggered?

Every lead generation vendor has a replacement or credit policy. The policy itself is less important than when it activates and who controls the trigger.

There are two replacement models. The first requires you to identify a bad lead, submit a dispute, wait for vendor review, and receive either a credit or a replacement on a future delivery cycle. In this model, the burden of quality assurance sits entirely with you. Your SDR team does the work of discovering the problem. The vendor’s exposure is limited to however many disputes you have the capacity to file.

The second model builds replacement into the delivery process itself. The vendor reviews leads against the agreed criteria before delivery and replaces any that fall short before the invoice is issued. You pay only for leads that have been verified against your standards. The quality assurance burden sits with the vendor, not with your team.

The operational difference matters for two reasons. First, a replacement policy that activates after delivery doesn’t prevent bad leads from consuming SDR time — it compensates you for them after the damage is done. An SDR who spends 20 minutes on a lead that was never going to convert doesn’t recover that time when the lead is credited. Second, a vendor whose replacement policy requires dispute initiation has a structural incentive to make the dispute process friction-heavy, because every dispute they don’t receive is a lead they don’t have to replace.
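The first point is easy to put numbers on. Here is a rough sketch, assuming a hypothetical campaign in which 60 leads are disputed and credited at a $50 CPL (both figures are invented for illustration), and using the 20 minutes per lead from the example above. The function name is ours.

```python
def cost_of_credited_leads(credited_leads: int, cpl: float,
                           minutes_per_lead: float = 20.0) -> tuple[float, float]:
    """Return (dollars credited back, SDR hours already spent on those leads)."""
    if credited_leads < 0 or cpl < 0 or minutes_per_lead < 0:
        raise ValueError("inputs must be non-negative")
    credit_dollars = credited_leads * cpl
    sdr_hours = credited_leads * minutes_per_lead / 60
    return credit_dollars, sdr_hours


# Hypothetical campaign: 60 leads disputed and credited at a $50 CPL.
credit, hours = cost_of_credited_leads(credited_leads=60, cpl=50.0)

print(f"Credited back: ${credit:,.2f}")
print(f"SDR hours not recovered: {hours:.1f}")
```

The credit returns the media spend, but the SDR hours stay spent. A pre-delivery replacement model avoids that second line entirely, because the lead never reaches the SDR queue.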

What a good answer looks like: Replacement happens before invoicing. The policy is proactive rather than dispute-triggered. The vendor takes responsibility for identifying leads that don’t meet criteria rather than waiting for you to flag them. The replacement timeline is specific — not “within a reasonable period.”

What a bad answer sounds like: “We have a 10% invalid lead credit built into every campaign.” A dispute process that requires you to submit each bad lead individually with documentation. Replacement on future delivery cycles rather than before the current invoice. A policy that covers invalid contacts but not leads that don’t meet the qualification criteria you specified.


Question 4: How Do You Measure Program Success — and What Metric Do You Use for Renewal Decisions?

This question is the most revealing one on the list, and it’s almost never asked.

The metric a vendor uses to measure their own program success tells you everything about what they’re optimizing for. A vendor who measures success by leads delivered and CPL is optimizing for volume efficiency. A vendor who measures success by SQL conversion rate and pipeline contribution is optimizing for revenue impact. These are fundamentally different programs, and they produce fundamentally different results.

The renewal decision metric is particularly diagnostic. Whatever metric the vendor tracks for renewal is the metric their entire operation is built around — because renewal is what keeps their business running. If renewals are driven by client satisfaction with CPL and fill rate, the vendor’s incentive is to deliver volume at the agreed price point. If renewals are driven by SQL conversion rate and pipeline contribution, the vendor’s incentive is to deliver leads that actually convert.

In our research, the vendors who performed in the top quartile on SQL conversion rate were uniformly the vendors whose internal success metrics included downstream conversion data. They tracked what happened after delivery — not just what happened at delivery. That tracking changed what they were motivated to optimize.

What a good answer looks like: The vendor tracks SQL conversion rate, pipeline contribution, or cost per SQL as primary success metrics alongside CPL and fill rate. They have a mechanism for receiving conversion data from clients — a CRM integration, a reporting cadence, a formal feedback loop. Their renewal conversations include a review of downstream performance, not just delivery metrics. They can tell you what their average client’s SQL rate looks like over a 12-month program.

What a bad answer sounds like: “Our clients measure success differently, so we track what matters to each client.” Success defined entirely by delivery metrics. No mechanism for tracking what happens after leads are delivered. A renewal process based on client satisfaction scores and CPL rather than pipeline contribution. The absence of any downstream data, not because they’re withholding it but because they never collected it.


The Scorecard That Changes the Conversation

These four questions do something most vendor evaluations don’t: they make the vendor’s incentive structure visible before you’re committed to a program.

A vendor who can answer all four with specificity (SQL conversion benchmarks by vertical, a human verification process, a pre-delivery replacement policy, and a success measurement framework that includes downstream conversion) has built their program around your pipeline, not their volume metrics. That alignment is the structural condition for a program that actually produces revenue.

A vendor who deflects, generalizes, or redirects to case studies when asked these questions hasn’t built that alignment. They’ve built a program optimized for their renewal metrics, which are CPL and fill rate. Your SQL conversion rate is your problem, not theirs.

The 4.3x SQL rate difference in our research isn’t a data anomaly. It’s the direct consequence of the evaluation framework buyers use to select vendors. Change the scorecard and you change the program. Change the program and you change the pipeline.
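If you want to record vendor answers side by side, one simple way is a four-row scorecard. The sketch below is only one possible shape: the 0-to-2 scale, the class name `VendorScorecard`, and the example vendor are all assumptions of ours, not part of the report.

```python
from dataclasses import dataclass

# Hypothetical 0-2 scale; the article does not prescribe a scoring scheme.
DEFLECTS, PARTIAL, SPECIFIC = 0, 1, 2

QUESTIONS = (
    "SQL conversion rate by vertical and company size, with data",
    "Pre-delivery qualification and human verification",
    "Lead replacement policy and its trigger",
    "Success and renewal metrics",
)


@dataclass
class VendorScorecard:
    vendor: str
    scores: dict[str, int]  # question -> DEFLECTS / PARTIAL / SPECIFIC

    def total(self) -> int:
        """Sum of the scores across the four questions."""
        return sum(self.scores[q] for q in QUESTIONS)

    def weak_answers(self) -> list[str]:
        """Questions where the vendor was less than fully specific."""
        return [q for q in QUESTIONS if self.scores[q] < SPECIFIC]


card = VendorScorecard(
    vendor="Vendor A",  # hypothetical vendor
    scores={
        QUESTIONS[0]: PARTIAL,
        QUESTIONS[1]: SPECIFIC,
        QUESTIONS[2]: DEFLECTS,
        QUESTIONS[3]: SPECIFIC,
    },
)

print(card.vendor, "scored", card.total(), "of", 2 * len(QUESTIONS))
for question in card.weak_answers():
    print("Follow up on:", question)
```

The total is less useful than the list of weak answers. A vendor who is specific on three questions and deflects on the fourth has told you where the program's accountability gap is.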


How LeadSpot Answers These Questions

We built LeadSpot around the answers to all four of these questions — because the research we published is our own, and we’d be building the wrong program if we ignored it.

SQL conversion benchmarks: Our HQL programs produce 24–28% MQL-to-SQL conversion rates across B2B technology, cybersecurity, HR tech, and enterprise software clients. We track this data by vertical and can share benchmarks relevant to your ICP before you commit to a program.

Human verification: Every lead in our HQL and BANT programs is reviewed by a human against your campaign criteria before delivery. Phone numbers are validated. Custom qualifying question responses are reviewed for genuine intent. Contacts that don’t meet the standard are replaced, not delivered.

Replacement policy: We don’t invoice until you’ve verified the leads meet your criteria. Leads that don’t are replaced before the invoice is issued. There’s no dispute process because the quality assurance happens before delivery, not after.

Success metrics: We track SQL conversion rate and pipeline contribution alongside CPL and fill rate for every program. Our renewal conversations include a downstream performance review. We don’t consider a program successful because it delivered leads on time and on budget — we consider it successful when your SDR team is converting at a rate that justifies the investment.

If you’re evaluating lead generation vendors right now and want to run these four questions past our team, that’s exactly what the consultation call is for.

Book a Call with LeadSpot →


This article draws on findings from the 2026 B2B Pipeline Trust Report, LeadSpot’s independent study of 500+ B2B marketing and sales leaders conducted in Q1 2026.