$ man text-to-speech

/text-to-speech(1)

PRICE / CALL
$0.05
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
synthforge
CATEGORY
ai
STATUS
live
NAME
text-to-speech - text to speech / TTS / voice generator
SYNOPSIS
POST https://x402.org/v1/text-to-speech
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
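The request / 402 / retry sequence above can be sketched in Python as follows. This is a minimal sketch, not the official client. `sign_payment` is a hypothetical callable standing in for whatever wallet code produces the signed transferWithAuthorization value. The sketch also assumes that the 402 body is JSON and that success is reported as HTTP 200; neither is stated in this page.

```python
import json
import urllib.error
import urllib.request
from typing import Callable, Dict, Tuple

ENDPOINT = "https://x402.org/v1/text-to-speech"

# A transport takes (url, headers, payload) and returns (status, body_text).
Transport = Callable[[str, Dict[str, str], bytes], Tuple[int, str]]


def urllib_send(url: str, headers: Dict[str, str], payload: bytes) -> Tuple[int, str]:
    """Default transport built on the standard library."""
    req = urllib.request.Request(url, data=payload, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; a 402 is an expected step here, not a failure.
        return err.code, err.read().decode("utf-8")


def call_with_payment(
    body: dict,
    sign_payment: Callable[[dict], str],  # hypothetical signer, supplied by the caller
    send: Transport = urllib_send,
) -> dict:
    """POST once; on 402, sign the payment requirements and retry with X-PAYMENT."""
    payload = json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"}

    status, text = send(ENDPOINT, headers, payload)
    if status == 402:
        requirements = json.loads(text)  # assumed to be a JSON document
        paid_headers = {**headers, "X-PAYMENT": sign_payment(requirements)}
        status, text = send(ENDPOINT, paid_headers, payload)

    if status != 200:
        raise RuntimeError(f"request failed with HTTP {status}: {text}")
    return json.loads(text)
```

Passing the transport as a parameter keeps the payment logic testable without network access or a funded wallet.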
DESCRIPTION

Text to speech / TTS / voice generator. Venice TTS (Kokoro / xAI / ElevenLabs / Orpheus / MiniMax / Gemini). 30+ voices, 6 audio formats. Returns hosted MP3 URL.

INPUT request schema
property  type    req?      description
text      string  required  Max 4000 chars.
voice     string  optional  Default 'af_sky'.
model     string  optional  Default 'tts-kokoro'. Other options: tts-xai-v1, tts-elevenlabs-turbo-v2-5, tts-orpheus, etc.
speed     number  optional  0.25-4. Default 1.
format    string  optional  mp3 (default), wav, opus, aac, flac. enum: mp3 · wav · opus · aac · flac
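The constraints in the request schema can be checked on the client before paying for a call. The following is a minimal sketch under that idea. `build_request` and its parameter names are illustrative and not part of the service; the limits are the ones listed in the schema above. Optional fields are omitted when unset so that the server-side defaults apply.

```python
from typing import Optional

MAX_TEXT_CHARS = 4000
MIN_SPEED, MAX_SPEED = 0.25, 4
FORMATS = ("mp3", "wav", "opus", "aac", "flac")


def build_request(
    text: str,
    voice: Optional[str] = None,
    model: Optional[str] = None,
    speed: Optional[float] = None,
    audio_format: Optional[str] = None,
) -> dict:
    """Validate the inputs against the schema and return the JSON body as a dict."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text has {len(text)} chars; the limit is {MAX_TEXT_CHARS}")

    body: dict = {"text": text}
    if voice is not None:
        body["voice"] = voice
    if model is not None:
        body["model"] = model
    if speed is not None:
        if not MIN_SPEED <= speed <= MAX_SPEED:
            raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
        body["speed"] = speed
    if audio_format is not None:
        if audio_format not in FORMATS:
            raise ValueError(f"format must be one of {', '.join(FORMATS)}")
        body["format"] = audio_format
    return body
```

For example, `build_request("Hello", speed=1.5, audio_format="wav")` yields a body with only `text`, `speed`, and `format` set.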
OUTPUT response shape
field            type    description
audio_url        string  Hosted MP3 URL pointing to the generated speech audio file.
file_size_bytes  number  Size of the generated MP3 file in bytes.
content_type     string  MIME type of the audio file, typically audio/mpeg for MP3 output.
format           string  Audio container format returned, one of 6 supported formats (mp3, opus, aac, flac, wav, pcm).
voice            string  Voice identifier used for synthesis, drawn from the 30+ available voices.
model            string  TTS model that produced the audio (Kokoro, xAI, ElevenLabs, Orpheus, MiniMax, or Gemini).
speed            number  Playback speed multiplier applied during synthesis, where 1.0 is normal pace.
input_chars      number  Character count of the input text that was synthesized into speech.
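A typed wrapper makes the response fields above easier to consume. The sketch below assumes the response is a flat JSON object carrying exactly these eight fields. `SpeechResult` and `parse_result` are illustrative names, not part of the service.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpeechResult:
    audio_url: str
    file_size_bytes: int
    content_type: str
    format: str
    voice: str
    model: str
    speed: float
    input_chars: int


def parse_result(payload: dict) -> SpeechResult:
    """Build a SpeechResult from a decoded JSON response, rejecting incomplete payloads."""
    names = [f.name for f in fields(SpeechResult)]
    missing = [name for name in names if name not in payload]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    # Extra keys are ignored so that new response fields do not break callers.
    return SpeechResult(**{name: payload[name] for name in names})
```

Failing early on a missing field keeps a malformed response from surfacing later as an attribute error far from the call site.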
EXAMPLES two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.org/v1/text-to-speech \
  -H 'Content-Type: application/json' \
  -d '{ "text": "Hello from text-to-speech." }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# install once
claude mcp add x402 -- npx x402-deployer-mcp

# then ask Claude Code:
# "use the text-to-speech tool to ..."
The MCP server handles payment automatically, so your coding agent just calls the tool by name.
METADATA
tags
tts · speech · audio · voice · ai
env
VENICE_API_KEY · FAL_KEY
methods
POST
cluster
synthforge
price
$0.05 USDC per call
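Because each call is billed at a flat $0.05 and accepts at most 4000 characters, longer texts need several calls. The sketch below splits text into chunks within the limit and estimates the total cost. `split_text` and `estimate_cost` are illustrative helpers. The atomic-unit conversion relies on USDC using 6 decimal places.

```python
from decimal import Decimal
from typing import List

PRICE_PER_CALL = Decimal("0.05")  # USDC per call
MAX_TEXT_CHARS = 4000
USDC_DECIMALS = 6


def split_text(text: str, limit: int = MAX_TEXT_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` chars, breaking at spaces when possible."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space in range: fall back to a hard cut
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


def estimate_cost(text: str) -> Decimal:
    """Total USDC for synthesizing `text`, at one call per chunk."""
    return PRICE_PER_CALL * len(split_text(text))


def to_atomic_units(usdc: Decimal) -> int:
    """Convert a USDC amount to its smallest on-chain unit."""
    return int(usdc * 10**USDC_DECIMALS)
```

`Decimal` is used instead of `float` so that multiples of 0.05 stay exact.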
ADJACENT other endpoints in synthforge
endpoint               price  description
music-generate         $0.05  Music generation / text-to-music / AI music / generative song / instrumental and vocal music.
voice                  $0.05  Text-to-speech / TTS / voice synthesis.
remove-bg              $0.08  AI background remover / background eraser / cutout tool.
image-edit             $0.02  Image edit / instruction-based image edit / text-driven photo edit / nano-banana image editor / GPT-image-2 edit.
image-inpaint          $0.02  Image inpainting / mask-based image edit / fill in masked region / object replacement / face swap (mask-driven) / generative fill.
image-generate         $0.01  Image generate (fast/cheap) / text-to-image / AI art.
sound-effect-generate  $0.01  Sound effect generation / text-to-SFX / Foley generator / ElevenLabs sound effects / ambient audio synth.
image-generate-pro     $0.10  Image generate (pro) / premium text-to-image / Flux 2 Pro / Recraft / Seedream / Qwen Image 2 Pro / xAI Grok Imagine.
SEE ALSO
agentutility(7) · synthforge(7) · x402(7) · mcp(7) · llms.txt · registry.json · bazaar.x402.org