Mediakit: 50 file-format endpoints, one paywall

SUMMARY

Mediakit ships 50 single-purpose file-conversion endpoints across PDF, image, audio, video, OCR, and office. This post explains why 50 endpoints beat one /convert tool with a format flag.

ARTICLE

50 endpoints, one cluster

We launched the Mediakit cluster with 50 endpoints live on Bazaar. PDF, image, audio, video, OCR, office docs, watermarking. Every endpoint is its own URL, its own price, its own input schema, its own listing. Total catalog cost if you ran each one once: $2.807.

The obvious question: why 50 endpoints instead of one /convert that takes a from and to parameter?

Why per-format beats a generic converter

A single /convert endpoint sounds clean until you write the docs.

PDF compression takes a target size or quality level. OCR takes a language hint and a layout mode. Audio loudnorm takes a target LUFS. Video trim takes start and end timestamps. You can't pack that into one input schema without it turning into a discriminated union with 50 branches, and at that point you already have 50 endpoints. You just hid them behind a router.
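To make the "50 branches" point concrete, here is a minimal sketch of what a single /convert input type would have to look like. The field names and the audio-loudnorm slug are illustrative assumptions, not the live Mediakit schemas; only four of the 50 branches are shown.

```python
from dataclasses import dataclass
from typing import Literal, Union

# Hypothetical request shapes. Field names are assumptions for illustration.

@dataclass
class PdfCompress:
    quality: int                      # target quality level
    op: Literal["pdf-compress"] = "pdf-compress"

@dataclass
class Ocr:
    language: str                     # language hint
    layout: str                       # layout mode
    op: Literal["ocr"] = "ocr"

@dataclass
class AudioLoudnorm:
    target_lufs: float                # target loudness in LUFS
    op: Literal["audio-loudnorm"] = "audio-loudnorm"

@dataclass
class VideoTrim:
    start: str                        # start timestamp
    end: str                          # end timestamp
    op: Literal["video-trim"] = "video-trim"

# The single /convert schema: a union tagged by `op`, one branch per format.
ConvertRequest = Union[PdfCompress, Ocr, AudioLoudnorm, VideoTrim]

def route(req: ConvertRequest) -> str:
    """Dispatch on the tag. This is the router hiding the per-format endpoints."""
    return f"/{req.op}"
```

Every branch has its own required fields, so the shared schema carries no shared structure beyond the tag. The tag is the endpoint name in disguise.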

Discovery is the second problem. Agents don't search for /convert. They search the Bazaar for "pdf to markdown" or "extract tables from pdf". Each of those queries should hit a listing with its own slug, price, description, and usage stats. Bundle them and you lose every one of those search hits.

There's also the pricing problem. pdf-to-text is $0.0025 because the current handler is a narrow extraction path. json-yaml is $0.005 because it's pure parsing. A single endpoint with a format flag has to either charge one price for every branch or eat the cost difference between cheap parsing and heavier media jobs. Neither works.
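A quick way to see the spread, using only prices quoted in this post. The flat-price scenario is hypothetical: it assumes a single /convert would have to charge the highest listed price so the heaviest job is never sold at a loss.

```python
from decimal import Decimal

# Per-endpoint prices in USD, as quoted in this article.
PRICES = {
    "pdf-to-text": Decimal("0.0025"),
    "json-yaml": Decimal("0.005"),
    "image-upscale": Decimal("0.02"),
    "speaker-diarize": Decimal("0.10"),
}

# Hypothetical flat price: the most expensive branch sets it.
flat_price = max(PRICES.values())

# How many times over each call would be charged under that flat price.
overcharge = {slug: flat_price / price for slug, price in PRICES.items()}

for slug, factor in overcharge.items():
    print(f"{slug}: {factor}x")
```

Under that assumption a pdf-to-text call would cost 40 times its current listing. Pricing the flat endpoint lower just moves the gap to the other side, where the heavy jobs run below their listed price.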

The 50, grouped

PDF gets 17 endpoints. pdf-to-text, pdf-to-markdown, pdf-to-jpg, pdf-extract-tables, pdf-split, pdf-merge, pdf-compress, pdf-watermark, office-to-pdf, html-to-pdf, plus aliased names like pdf2md and compress-pdf that route to the same workers (we kept both naming conventions because agents and human devs guess differently).
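Aliasing can be as small as a lookup table in front of the workers. The sketch below is an assumption about how such a table might look, not the actual Mediakit router; the pairing of each alias with a canonical slug is our guess from the names.

```python
# Hypothetical alias table: alias slug -> canonical slug.
ALIASES = {
    "pdf2md": "pdf-to-markdown",
    "compress-pdf": "pdf-compress",
}

def canonical(slug: str) -> str:
    """Resolve an alias to its canonical slug; other slugs pass through unchanged."""
    return ALIASES.get(slug, slug)
```

Both names keep their own listing for search purposes, while the work lands on one handler.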

Image: 6 endpoints. image-convert (PNG/JPG/WEBP/AVIF), image-upscale at $0.02, image-translate for in-image OCR + replace, logo-detect.

Audio + video: 14 endpoints. audio-transcribe at $0.01 per minute, speaker-diarize at $0.10, video-trim, video-thumbnail, youtube-transcript at $0.01, subtitles, mp4-to-mp3.

Office + data: 10 endpoints. excel-to-csv, xlsx-to-csv, csv-to-jsonl, csv-to-ics, json-yaml, xml-to-word, html-to-markdown.

OCR + receipts: 3 endpoints. ocr at $0.0025, receipt-ocr at $0.01, receipt-parser at $0.01. The receipt endpoints are tuned models, not general OCR with a prompt. Different price and output schema, so a different endpoint.

Calling one

Same x402 shape as the rest of the portfolio. The first call returns HTTP 402 (Payment Required); you settle the payment, then retry the same request with the payment header attached.

curl -X POST https://mediakit.agentutility.dev/pdf-to-markdown \
  -H "X-PAYMENT: $PAYMENT_HEADER" \
  -F "file=@invoice.pdf"

The 402 response includes the price ($0.20), the network (base), and the asset (USDC). Pay it with any x402 client, retry with the X-PAYMENT header, get markdown back.
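The 402-then-retry loop described above can be sketched without committing to a particular x402 client. In this sketch `send` and `pay` are hypothetical callables you supply: `send` performs the HTTP request and returns a status code and parsed body, and `pay` turns the payment requirements from the 402 body into an X-PAYMENT header value. Neither is a real library API.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]                       # (status code, parsed body)
Send = Callable[[str, Dict[str, str]], Response]  # hypothetical HTTP transport
Pay = Callable[[dict], str]                       # hypothetical x402 payer

def call_with_payment(send: Send, pay: Pay, url: str) -> Response:
    """Call a paid endpoint: try once, settle on 402, retry once."""
    # First attempt carries no payment header.
    status, body = send(url, {})
    if status != 402:
        return status, body
    # The 402 body describes what to pay (price, network, asset).
    payment_header = pay(body)
    # Retry the same URL with the payment attached.
    return send(url, {"X-PAYMENT": payment_header})
```

Passing the transport and payer in as arguments keeps the flow testable with fakes and leaves the choice of x402 client open.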

Cost breakdown

Before you run every Mediakit endpoint once, read the live registry first. Prices move as handlers and upstream costs change. As of this registry snapshot, json-yaml is $0.005, pdf-to-text is $0.0025, and ocr is $0.0025.

For comparison, CloudConvert charges $0.008 per "conversion minute" with a $9/month minimum. Sejda's PDF API starts at $7.50/month for 50 calls. We have no minimum and no monthly fee. Pay per call. Settle on Base in seconds.
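The arithmetic behind that comparison, using only the figures quoted above. It is a rough sketch: the units differ (conversion minutes versus calls), so treat the results as order-of-magnitude, not a like-for-like benchmark.

```python
from decimal import Decimal

# Sejda's entry tier: $7.50 per month for 50 calls.
sejda_per_call = Decimal("7.50") / 50
print(sejda_per_call)             # 0.15

# How many pdf-to-text calls at $0.0025 fit inside a $9 monthly minimum.
calls_for_nine_dollars = Decimal("9") / Decimal("0.0025")
print(calls_for_nine_dollars)     # 3600
```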

What's next

OCR languages beyond English. A /diff-pdf endpoint for invoice reconciliation. A batched transcribe that takes 10+ files in one settled payment so you save the per-call overhead. If you have a format the catalog doesn't cover, file an issue against the agentutility repo or ping the cluster on the Bazaar.
