Name: audio-transcribe
Price: 0.01 USDC

$ man audio-transcribe

PRICE / CALL

$0.01

USDC · base mainnet · scheme: exact

METHOD

POST

CLUSTER

mediakit

CATEGORY

uncategorized

STATUS

● live

NAME

audio-transcribe — transcribes audio to text with whisper-large-v3

SYNOPSIS

POST https://x402.agentutility.ai/audio-transcribe
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }

↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.

DESCRIPTION

Transcribes audio to text with whisper-large-v3. The server fetches the audio URL (max 25 MB), relays it to Venice's audio/transcriptions endpoint, and returns the transcript with detected language, duration, and per-segment timestamps when response_format='verbose_json' (default). Raw text, SRT, and VTT outputs are also supported. Use it as a speech-to-text or multi-language ASR endpoint with OpenAI Whisper API compatibility.

INPUT — request schema

property	type	description	req?
audio_url	string	Public http(s) URL of the audio file (mp3, wav, m4a, ogg, flac, webm). Up to 25 MB.	required
language	string	BCP-47 language hint (e.g. 'en', 'es'). 'auto' or omitted = auto-detect.	optional
model	string	Override the model. Default 'openai/whisper-large-v3'.	optional
response_format	string	Output format. Default 'verbose_json' (transcript + segments). enum: json · text · verbose_json · srt · vtt	optional
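
A minimal sketch of assembling a request body from the table above. The helper name build_request is hypothetical and the client-side checks are an assumption about what is worth validating before paying for a call; the server's own validation may differ. Optional fields are omitted when unset so the documented defaults apply.

```python
from urllib.parse import urlparse

# Values copied from the response_format enum in the table above.
RESPONSE_FORMATS = {"json", "text", "verbose_json", "srt", "vtt"}


def build_request(audio_url, language=None, model=None, response_format=None):
    """Return a JSON-serializable body for POST /audio-transcribe (hypothetical helper)."""
    parsed = urlparse(audio_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("audio_url must be a public http(s) URL")
    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")

    body = {"audio_url": audio_url}
    # Leave optional fields out when unset so the server-side defaults are used.
    optional = (
        ("language", language),
        ("model", model),
        ("response_format", response_format),
    )
    for key, value in optional:
        if value is not None:
            body[key] = value
    return body
```

For instance, build_request("https://example.com/talk.mp3", language="es", response_format="srt") yields a body with exactly those three keys; the URL here is a placeholder.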

OUTPUT — response shape

field	type	description
transcript	string	Full transcribed text of the audio, concatenated across all detected speech segments.
language_detected	string	ISO 639-1 code of the language Whisper auto-detected in the audio (e.g. 'en', 'es', 'fr').
duration_seconds	string	Length of the source audio in seconds, as reported by Whisper after decoding.
segments	string	Array of per-segment objects with start/end timestamps and text, present when response_format is verbose_json.
response_format	string	Output format used: verbose_json (default), json, text, srt, or vtt.
model	string	Whisper model used for transcription, fixed to 'whisper-large-v3' via Venice's audio/transcriptions endpoint.
bytes_in	string	Size in bytes of the audio file fetched from the source URL before relay to Whisper.
source	string	Original audio URL the server fetched and transcribed (echoed back from the request).
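
A short sketch of reading the per-segment timestamps from a verbose_json response. It assumes the response has already been decoded from JSON into a dict, that segments is a list of objects, and that each object uses the keys start, end, and text; the table above only says "start/end timestamps and text", so the exact key names are an assumption. Timestamps are passed through float() because the table lists most fields as strings.

```python
def format_timestamp(seconds):
    """Render a number of seconds as HH:MM:SS.mmm."""
    total_ms = round(float(seconds) * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segment_lines(response):
    """Yield one '[start --> end] text' line per segment, if any are present."""
    # segments is documented only for verbose_json, so treat it as optional.
    for seg in response.get("segments") or []:
        start = format_timestamp(seg["start"])  # assumed key name
        end = format_timestamp(seg["end"])      # assumed key name
        yield f"[{start} --> {end}] {seg['text'].strip()}"
```

When the response carries no segments (for example with response_format set to json), segment_lines yields nothing and the caller can fall back to the transcript field.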

EXAMPLES — two ways to call

EXAMPLE 1 · curl

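# audio_url below is a placeholder; substitute a public http(s) audio file
curl -X POST https://x402.agentutility.ai/audio-transcribe \
  -H 'Content-Type: application/json' \
  -d '{"audio_url": "https://example.com/audio.mp3"}'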

first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
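
The same two-step flow, sketched in Python with the transport and the signer injected as callables. Both send and sign are hypothetical stand-ins: send would wrap an HTTP client and sign would wrap a wallet library that turns the 402 payment requirements into an X-PAYMENT header value. How that value is actually produced and encoded is not covered here and should be taken from the x402 tooling you use.

```python
import json


def call_with_payment(send, sign, body):
    """POST body; on 402, sign the payment requirements and retry once.

    send(headers, data) -> (status, response_text)  # hypothetical transport
    sign(requirements_text) -> str                  # hypothetical signer
    """
    headers = {"Content-Type": "application/json"}
    data = json.dumps(body)

    status, text = send(headers, data)
    if status != 402:
        # Already paid for, or failed for an unrelated reason.
        return status, text

    # The 402 response carries the payment requirements; pass them to the signer as-is.
    payment = sign(text)
    return send({**headers, "X-PAYMENT": payment}, data)
```

Retrying only once keeps a misconfigured signer from looping: if the second response is still 402, it is returned to the caller unchanged.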

EXAMPLE 2 · mcp

# Install the MCP package for this endpoint's cluster
npx -y @agentutility/mcp-<cluster>

# Required: EVM private key with USDC on Base
export X402_PRIVATE_KEY=0x...

# Then call the audio-transcribe tool from your MCP-aware agent.

MCP server handles payment automatically — your coding agent just calls the tool by name.

METADATA

tags: mediakit · audio · transcription · speech-to-text · asr · whisper · subtitles · whisper-large-v3
methods: POST
cluster: mediakit
price: $0.01 USDC per call

ADJACENT — other endpoints in mediakit

endpoint	description	price
csv-to-ics	Converts a CSV of events into an RFC 5545 compliant ICS calendar file (VCALENDAR/VEVENT) for Google Calendar, Outlook, and Apple Calendar…	$0.01
image-convert	Universal image format converter (PNG, JPG, WEBP, AVIF, GIF, BMP, TIFF, ICO, HEIC, HEIF, PSD, SVG).	$0.01
image-format-convert	Image converter.	$0.01
merge-pdf	Combines 2-50 input PDFs from URLs into one PDF, preserving bookmarks.	$0.01
movie-database	Finds movies or TV shows by title, with optional year and region, and returns release year, poster, overview, and language.	$0.01
movie-database-api	Searches movies and TV shows by title and optional year, returning release date, rating, popularity, overview, poster URLs, TMDB links, a…	$0.01
movie-info	Looks up movie and TV metadata: title, release year, rating, overview, poster, and optional streaming providers.	$0.01
pdf-merge	Merges 2-50 PDFs from URLs into a single PDF, preserving bookmarks.	$0.01