The pattern, in one sentence
Build N narrow probes. Then build one endpoint on top that calls all N and returns a score.
The narrow probes stay useful. Humans want them. Specialized agents want them. But the default path through the cluster is the aggregator, and that's where most of the volume lands.
Why agents skip the sub-probes
Each x402 call has overhead. Even when each probe costs a tenth of a cent, the HTTP round-trip and onchain settlement add up. Six probes means six of those. An agent routing a task through production-readiness-score doesn't want to chain uptime-check, error-rate-check, latency-percentile, tls-grade, dns-health, and cert-expiry itself. It wants the answer.
So it picks the endpoint that already did the chaining.
Our call data on prooflayer is lopsided. production-readiness-score gets roughly 7 calls for every 1 call to any individual sub-probe. The composite is what agents actually want. The probes are what builders want when they're debugging a specific axis. Both audiences matter, but they don't have the same volume.
What the aggregator actually does
It's not a passthrough. If production-readiness-score only returned {uptime, errors, latency, tls, dns, cert}, the agent would still have to interpret six numbers. The aggregator's job is the interpretation.
It calls the sub-probes in parallel, weights the results, applies thresholds, and returns one score plus a small explanation object. The agent gets score: 0.82 and flags: ["cert_expires_in_14_days"]. That's a routing decision it can make in one step.
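A minimal sketch of that flow, assuming Python and asyncio. The weights, the 30-day cert threshold, the canned probe values, and the run_probe helper are all hypothetical stand-ins, not prooflayer's actual numbers or API. A real version would make a paid x402 call per probe.

```python
import asyncio
from dataclasses import dataclass, field

# Hypothetical weights (sum to 1.0) and threshold; the real ones are not published here.
WEIGHTS = {"uptime": 0.3, "errors": 0.2, "latency": 0.2, "tls": 0.1, "dns": 0.1, "cert": 0.1}
CERT_WARN_DAYS = 30


@dataclass
class ProbeResult:
    name: str
    value: float  # assumed normalized to 0..1, higher is better
    detail: dict = field(default_factory=dict)


# Canned data standing in for six paid sub-probe responses.
CANNED = {
    "uptime": ProbeResult("uptime", 0.9),
    "errors": ProbeResult("errors", 0.8),
    "latency": ProbeResult("latency", 0.8),
    "tls": ProbeResult("tls", 0.9),
    "dns": ProbeResult("dns", 0.9),
    "cert": ProbeResult("cert", 0.5, {"days_to_expiry": 14}),
}


async def run_probe(name: str) -> ProbeResult:
    """Hypothetical sub-probe call; yields to the event loop, then returns canned data."""
    await asyncio.sleep(0)
    return CANNED[name]


async def production_readiness_score() -> dict:
    # Call all sub-probes in parallel.
    results = await asyncio.gather(*(run_probe(name) for name in WEIGHTS))
    # Weight the results into one number.
    score = sum(WEIGHTS[r.name] * r.value for r in results)
    # Apply thresholds to build the small explanation object.
    flags = []
    for r in results:
        days = r.detail.get("days_to_expiry")
        if days is not None and days <= CERT_WARN_DAYS:
            flags.append(f"cert_expires_in_{days}_days")
    return {"score": round(score, 2), "flags": flags}


if __name__ == "__main__":
    print(asyncio.run(production_readiness_score()))
```

With the canned values above, the result is a score of 0.82 and a single cert flag. The caller never sees the six raw numbers unless it asks for them.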
token-risk-score works the same way. The sub-probes look at holder concentration, contract verification, liquidity depth, and trade-pattern anomalies. None of those answer "is this token safe to interact with?" The aggregator does. It gets called about 9x as often as any individual sub-probe.
The Mediakit composite is the cleanest case. An agent buying ad inventory needs reach numbers, engagement quality, audience overlap with a target, and brand-safety signals. Four signals, one decision. Mediakit's composite gets 4x the volume of its highest-volume sub-probe.
Pricing the layers
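The aggregator costs more than any single probe but less than the sum. If each of six probes is $0.002, the composite is $0.008, not $0.012. That gap is the value of synthesis. It's also why agents pick the composite even when they don't need all six signals. Five separate calls cost $0.010, and four cost the same $0.008 with four round-trips and settlements instead of one. At three, the probe fees alone come to $0.006, so the composite only wins once that per-call overhead is counted.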
Don't get cute with the pricing. The composite should always be a discount on the sum. Otherwise you're signaling that the agent should chain probes itself, which defeats the point of building the aggregator.
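The pricing rule is easy to check mechanically. Here is a rough sketch in Python using the $0.002 and $0.008 figures from above. The function names and the overhead_per_call parameter are hypothetical; overhead_per_call stands in for whatever the round-trip and settlement cost a caller per request, which this post does not quantify.

```python
from decimal import Decimal


def composite_is_discount(probe_prices: list[Decimal], composite_price: Decimal) -> bool:
    """True if the composite costs more than any single probe but less than the sum."""
    return max(probe_prices) < composite_price < sum(probe_prices)


def cheapest_route(
    needed: int,
    probe_price: Decimal,
    composite_price: Decimal,
    overhead_per_call: Decimal = Decimal("0"),  # hypothetical per-request cost to the agent
) -> str:
    """Compare chaining `needed` probes against one composite call, from the agent's side."""
    chain_cost = needed * (probe_price + overhead_per_call)
    composite_cost = composite_price + overhead_per_call
    # Ties go to the composite: same spend, one request instead of several.
    return "composite" if composite_cost <= chain_cost else "chain"


if __name__ == "__main__":
    probe = Decimal("0.002")
    composite = Decimal("0.008")
    print(composite_is_discount([probe] * 6, composite))
    for n in range(1, 7):
        print(n, cheapest_route(n, probe, composite))
```

On probe fees alone, the composite starts winning at four signals. Any nonzero per-call overhead pushes that break-even point lower, which is one reason to measure the overhead before setting the composite price.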
When this pattern breaks
Two cases.
If the sub-probes are wildly different in cost (one is $0.001, another is $0.05), the aggregator price gets weird and the discount logic stops feeling clean. Break it into two aggregators or surface the expensive probe as a separate paid step.
If the synthesis is genuinely subjective ("which of these matters most depends on what you're optimizing for"), don't ship a single composite score. Ship a structured object with the sub-results and let the agent decide. Pretending to know the caller's weights when you don't is worse than no aggregator at all.
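For the subjective case, one possible shape is a response that carries the sub-results and deliberately omits a score. This is an illustrative Python sketch. The field names, the 0..1 normalization, and the caller_score helper are assumptions, not an existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubResult:
    name: str
    value: float  # assumed normalized to 0..1, higher is better


def structured_report(results: list[SubResult]) -> dict:
    """Return the sub-results as-is. No 'score' key: the endpoint does not guess weights."""
    return {"sub_results": {r.name: r.value for r in results}}


def caller_score(report: dict, weights: dict[str, float]) -> float:
    """Run by the agent, not the endpoint: weighted average using the caller's own weights."""
    total = sum(weights.values())
    return sum(report["sub_results"][name] * w for name, w in weights.items()) / total


if __name__ == "__main__":
    report = structured_report([SubResult("reach", 1.0), SubResult("brand_safety", 0.5)])
    # Two callers, same data, different priorities.
    print(caller_score(report, {"reach": 1, "brand_safety": 1}))
    print(caller_score(report, {"reach": 3, "brand_safety": 1}))
```

The same report produces different scores for different callers, which is the point: the weighting lives with whoever knows what they are optimizing for.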
What we're doing next
Every new cluster ships with an aggregator from day one. We learned this the slow way on the first three clusters, where the composites came two months after the probes and the call data immediately shifted. Now the composite is the design anchor. The sub-probes get specced as inputs to the score, not as standalone products that someone might or might not glue together later.
If you're building on x402 and you have more than three related probes, you have an aggregator-shaped hole in your API.