$ man text-to-speech
/text-to-speech(1)
PRICE / CALL
$0.05
USDC · base mainnet · scheme: exact
METHOD
POST
CLUSTER
synthforge
CATEGORY
ai
STATUS
● live
NAME
text-to-speech — text to speech / tts / voice generator
SYNOPSIS
POST https://x402.org/v1/text-to-speech
Content-Type: application/json
X-PAYMENT: <signed-transferWithAuthorization>
{ ... }
↳ first call → 402 Payment Required. Sign USDC transferWithAuthorization, retry with the X-PAYMENT header.
DESCRIPTION
Text to speech / TTS / voice generator. Venice TTS (Kokoro / xAI / ElevenLabs / Orpheus / MiniMax / Gemini). 30+ voices, 6 audio formats. Returns hosted MP3 URL.
INPUT — request schema
| property | type | description | req? |
|---|---|---|---|
| text | string | Max 4000 chars. | required |
| voice | string | Default 'af_sky'. | optional |
| model | string | Default 'tts-kokoro'. Other options: tts-xai-v1, tts-elevenlabs-turbo-v2-5, tts-orpheus, etc. | optional |
| speed | number | 0.25-4. Default 1. | optional |
| format | string | Default mp3. enum: mp3 · wav · opus · aac · flac | optional |
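The constraints in the table above can be checked client-side before paying for a call. The following is a minimal sketch, not the service's own validation: the helper name `build_tts_request` is hypothetical, "chars" is assumed to mean Python string length, and an empty `text` is assumed to count as missing.

```python
import json

MAX_TEXT_CHARS = 4000
FORMATS = ("mp3", "wav", "opus", "aac", "flac")


def build_tts_request(text, voice="af_sky", model="tts-kokoro", speed=1, fmt="mp3"):
    """Validate the documented input constraints and return the JSON request body.

    Hypothetical helper; defaults mirror the ones listed in the request schema.
    """
    # Assumption: an empty string is treated the same as a missing `text`.
    if not isinstance(text, str) or not text:
        raise ValueError("text is required")
    # Assumption: the 4000-char limit is measured with len().
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} chars")
    if not 0.25 <= speed <= 4:
        raise ValueError("speed must be between 0.25 and 4")
    if fmt not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}")
    return json.dumps(
        {"text": text, "voice": voice, "model": model, "speed": speed, "format": fmt}
    )
```

Only `text` is required, so `build_tts_request("Hello")` yields a body with every optional property set to its documented default.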
OUTPUT — response shape
| field | type | description |
|---|---|---|
| audio_url | string | Hosted MP3 URL pointing to the generated speech audio file. |
| file_size_bytes | number | Size of the generated MP3 file in bytes. |
| content_type | string | MIME type of the audio file, typically audio/mpeg for MP3 output. |
| format | string | Audio container format returned, one of 6 supported formats (mp3, opus, aac, flac, wav, pcm). |
| voice | string | Voice identifier used for synthesis, drawn from the 30+ available voices. |
| model | string | TTS model that produced the audio (Kokoro, xAI, ElevenLabs, Orpheus, MiniMax, or Gemini). |
| speed | number | Playback speed multiplier applied during synthesis, where 1.0 is normal pace. |
| input_chars | number | Character count of the input text that was synthesized into speech. |
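A typed wrapper around the response fields above might look like the following sketch. It assumes the response is a flat JSON object whose top-level keys match the field names in the table; `TTSResult` and `parse_tts_response` are hypothetical names, and the sample values in the usage note are invented for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TTSResult:
    """Mirror of the documented response fields (hypothetical type)."""
    audio_url: str
    file_size_bytes: int
    content_type: str
    format: str
    voice: str
    model: str
    speed: float
    input_chars: int


def parse_tts_response(raw: str) -> TTSResult:
    """Parse a JSON response body, failing loudly if a documented field is absent."""
    data = json.loads(raw)
    names = [f.name for f in fields(TTSResult)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    # Extra keys, if the service ever returns any, are ignored.
    return TTSResult(**{n: data[n] for n in names})
```

For example, a body such as `{"audio_url": "https://example.com/a.mp3", "file_size_bytes": 1024, "content_type": "audio/mpeg", "format": "mp3", "voice": "af_sky", "model": "tts-kokoro", "speed": 1, "input_chars": 5}` parses into a `TTSResult` whose `audio_url` can then be downloaded.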
EXAMPLES — two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.org/v1/text-to-speech \
-H 'Content-Type: application/json' \
-d '{ }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
EXAMPLE 2 · mcp
# install once
claude mcp add x402 --command "npx x402-deployer-mcp"
# then ask Claude Code:
# "use the text-to-speech tool to ..."
MCP server handles payment automatically — your coding agent just calls the tool by name.
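For callers that do not go through the MCP server, the two-step flow from EXAMPLE 1 (unpaid request, 402 with payment requirements, signed retry) can be sketched roughly as below. This is an illustration under assumptions, not the reference client: `call_with_payment` and `http_send` are hypothetical names, and `sign` stands in for whatever wallet code turns the payment requirements into a signed `transferWithAuthorization` value for the `X-PAYMENT` header, which is not shown here.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Tuple

Response = Tuple[int, bytes]  # (HTTP status, response body)
Send = Callable[[Dict[str, str], bytes], Response]


def http_send(url: str) -> Send:
    """Build a POST sender for `url` that returns the status even on 4xx/5xx."""
    def send(headers: Dict[str, str], body: bytes) -> Response:
        req = urllib.request.Request(url, data=body, headers=headers, method="POST")
        try:
            with urllib.request.urlopen(req) as resp:
                return resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # urllib raises on 402; surface it as an ordinary response instead.
            return err.code, err.read()
    return send


def call_with_payment(send: Send, sign: Callable[[bytes], str], body: bytes) -> Response:
    """Send once unpaid; on 402, sign the payment requirements and retry once."""
    headers = {"Content-Type": "application/json"}
    status, payload = send(headers, body)
    if status != 402:
        return status, payload
    # `payload` holds the payment requirements; `sign` is a hypothetical signer.
    paid_headers = dict(headers)
    paid_headers["X-PAYMENT"] = sign(payload)
    return send(paid_headers, body)
```

Passing `send` in as a callable keeps the retry logic independent of the HTTP client, so `http_send("https://x402.org/v1/text-to-speech")` can be swapped for any other transport or for a stub in tests.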
METADATA
- tags: tts · speech · audio · voice · ai
- env: VENICE_API_KEY · FAL_KEY
- methods: POST
- cluster: synthforge
- price: $0.05 USDC per call
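Because pricing is per call and `text` is capped at 4000 chars, longer input has to be split across several calls. The sketch below estimates the cost of doing so; the splitting strategy (break on the last space before the limit, hard cut otherwise) is an assumption of this example rather than anything the service prescribes, and `split_text` and `estimate_cost` are hypothetical helpers.

```python
from decimal import Decimal

MAX_TEXT_CHARS = 4000
PRICE_PER_CALL = Decimal("0.05")  # USDC, from the price entry above


def split_text(text, limit=MAX_TEXT_CHARS):
    """Split text into chunks of at most `limit` chars, preferring to break on spaces."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        # Look for the last space that keeps the chunk within the limit.
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: fall back to a hard cut
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


def estimate_cost(text):
    """Return the total USDC cost of synthesizing `text`, one call per chunk."""
    return PRICE_PER_CALL * len(split_text(text))
```

`Decimal` is used instead of floats so that multiples of $0.05 stay exact.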
ADJACENT — other endpoints in synthforge
| endpoint | description | price |
|---|---|---|
| music-generate | Music generation / text-to-music / AI music / generative song / instrumental and vocal music. | $0.05 |
| voice | Text-to-speech / TTS / voice synthesis. | $0.05 |
| remove-bg | AI background remover / background eraser / cutout tool. | $0.08 |
| image-edit | Image edit / instruction-based image edit / text-driven photo edit / nano-banana image editor / GPT-image-2 edit. | $0.02 |
| image-inpaint | Image inpainting / mask-based image edit / fill in masked region / object replacement / face swap (mask-driven) / generative fill. | $0.02 |
| image-generate | Image generate (fast/cheap) / text-to-image / AI art. | $0.01 |
| sound-effect-generate | Sound effect generation / text-to-SFX / Foley generator / ElevenLabs sound effects / ambient audio synth. | $0.01 |
| image-generate-pro | Image generate (pro) / premium text-to-image / Flux 2 Pro / Recraft / Seedream / Qwen Image 2 Pro / xAI Grok Imagine. | $0.10 |
SEE ALSO