$ man pdf-extract-tables
/pdf-extract-tables(1)
PRICE / CALL
$0.10
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
mediakit
CATEGORY
uncategorized
STATUS
● live
NAME
pdf-extract-tables — pdf table extractor / table from pdf / scanned-table parsing / financial-table ocr / multi-page table consolidator / datalab marker tables
SYNOPSIS
POST https://x402.org/v1/pdf-extract-tables
Content-Type: application/json
X-PAYMENT: <signed-transferWithAuthorization>
{ ... }
↳ first call → 402 Payment Required. Sign USDC transferWithAuthorization, retry with the X-PAYMENT header.
DESCRIPTION
AI + OCR pipeline that finds every table in a PDF (digital or scanned) and returns row × column text matrices, page by page. Cell bounding boxes can optionally be returned for downstream layout reconstruction, and an optional page_range filter ('1-5', '3', '1,3,5') restricts the output to selected pages. Handles merged headers, multi-page financial statements, balance sheets, lab results, and scanned reports. 30 pages max. Sibling of pdf-to-markdown: it uses the same Datalab backend, but the output is pre-parsed to tables only.
INPUT — request schema
| property | type | description | req? |
|---|---|---|---|
| pdf_url | string | Public URL of a PDF file (http or https). Must be directly fetchable, not behind auth or a viewer redirect. Max 30 pages. | required |
| page_range | string | Optional 1-indexed page filter applied after extraction. Accepts ranges, single pages, or comma-lists: '1-5', '3', '1,3,5'. Default: all pages. | optional |
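The request schema above can be checked on the client before paying for a call. The following is a minimal sketch, not the service's own validation: the helper names `parse_page_range` and `build_request_body` are hypothetical, and accepting mixed forms such as '1-3,5' is an assumption, since the schema lists only ranges, single pages, and comma-lists.

```python
import json
import re
from typing import List, Optional


def parse_page_range(spec: str) -> List[int]:
    """Expand a page_range string ('1-5', '3', '1,3,5') into 1-indexed page numbers."""
    pages: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        # Each segment is either a single page ("3") or an inclusive range ("1-5").
        match = re.fullmatch(r"(\d+)(?:-(\d+))?", part)
        if not match:
            raise ValueError(f"invalid page_range segment: {part!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page_range segment: {part!r}")
        pages.extend(range(start, end + 1))
    return pages


def build_request_body(pdf_url: str, page_range: Optional[str] = None) -> str:
    """Build the JSON body using the two documented properties."""
    if not pdf_url.startswith(("http://", "https://")):
        raise ValueError("pdf_url must be an http or https URL")
    body = {"pdf_url": pdf_url}
    if page_range is not None:
        parse_page_range(page_range)  # validate only; the string is sent unchanged
        body["page_range"] = page_range
    return json.dumps(body)
```

The filter string is forwarded unchanged, so the local parser only catches malformed input early. The server's own parsing remains authoritative.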
OUTPUT — response shape
| field | type | description |
|---|---|---|
| source_url | string | Echoes back the PDF URL that was extracted, for traceability. |
| page_count | number | Total number of pages in the input PDF (capped at 30). |
| tables | array | Array of detected tables with page number, row × column text matrix, and optional cell bounding boxes. |
| source | string | Backend identifier, typically 'datalab-marker', indicating the OCR/parsing engine used. |
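Because each table arrives as a row × column text matrix, turning it into CSV is mostly a serialization step. The sketch below illustrates one way to do that. The per-table key names `page` and `rows` are assumptions, since the response table above does not name the fields inside each `tables` entry, and `SAMPLE_RESPONSE` is made-up illustrative data, not real endpoint output.

```python
import csv
import io
from typing import Dict, List

# Illustrative data only; the "page" and "rows" keys are assumed, not documented.
SAMPLE_RESPONSE = {
    "source_url": "https://example.com/report.pdf",
    "page_count": 2,
    "tables": [
        {"page": 1, "rows": [["Item", "Amount"], ["Cash", "1,200"]]},
    ],
    "source": "datalab-marker",
}


def table_to_csv(rows: List[List[str]]) -> str:
    """Serialize one row x column text matrix as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)  # cells containing commas are quoted automatically
    return buffer.getvalue()


def tables_by_page(response: dict) -> Dict[int, List[str]]:
    """Group the CSV form of every detected table under its page number."""
    grouped: Dict[int, List[str]] = {}
    for table in response.get("tables", []):
        grouped.setdefault(table["page"], []).append(table_to_csv(table["rows"]))
    return grouped
```

Grouping by page keeps multi-page statements in reading order, which helps when consolidated tables need to be stitched back together downstream.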
EXAMPLES — two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.org/v1/pdf-extract-tables \
-H 'Content-Type: application/json' \
-d '{ }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# install once
claude mcp add x402 --command "npx x402-deployer-mcp"
# then ask Claude Code:
# "use the pdf-extract-tables tool to ..."
MCP server handles payment automatically — your coding agent just calls the tool by name.
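For callers that do not go through the MCP server, the 402-then-retry flow from Example 1 might be sketched as follows. This is a sketch under stated assumptions: `sign_payment` is a hypothetical callback that you supply to turn the payment requirements into an X-PAYMENT header value (the signing itself is not shown), and the 402 response body is assumed to be JSON.

```python
import json
import urllib.error
import urllib.request

ENDPOINT = "https://x402.org/v1/pdf-extract-tables"


def http_send(body: str, headers: dict):
    """POST the JSON body and return (status_code, parsed_json_payload)."""
    request = urllib.request.Request(
        ENDPOINT, data=body.encode("utf-8"), headers=headers, method="POST"
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.loads(response.read())
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the 402 body is assumed to hold the requirements.
        return error.code, json.loads(error.read())


def call_with_payment(body: str, sign_payment, send=http_send):
    """First call unpaid; on 402, sign the requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    status, payload = send(body, headers)
    if status != 402:
        return status, payload
    # sign_payment is caller-supplied (hypothetical): requirements -> header value.
    paid_headers = {**headers, "X-PAYMENT": sign_payment(payload)}
    return send(body, paid_headers)
```

Passing `send` as a parameter lets the flow be exercised with a stub before any real payment is signed.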
METADATA
- tags: pdf, table-extraction, ocr, mediakit, document-parsing, financial-tables, datalab, pdf-tables
- methods: POST
- cluster: mediakit
- price: $0.10 USDC per call
ADJACENT — other endpoints in mediakit
| endpoint | description | price |
|---|---|---|
| extract-tables | Extract tables from PDF / table extractor / PDF to CSV / spreadsheet from PDF. | $0.10 |
| mp4-to-mp3 | MP4 → MP3 audio extractor. | $0.10 |
| pdf-to-jpg | PDF to JPG / PNG / WEBP image converter. | $0.10 |
| speaker-diarize | Speaker diarization / who-said-what transcription. | $0.10 |
| transcribe | Video / audio transcription via Whisper v3. | $0.10 |
| upscale-image | AI image upscaler / super-resolution / image enlarger. | $0.10 |
| video-summarize | Video summarizer / podcast summarizer / lecture notes generator. | $0.10 |
| video-to-audio | Video → audio extractor / video to audio converter. | $0.10 |
SEE ALSO