$ man voice

/voice(1)

PRICE / CALL  $0.05 (USDC · base mainnet · scheme: exact)
METHOD        POST
CLUSTER       synthforge
CATEGORY      ai
STATUS        live
NAME
voice - text-to-speech / TTS / voice synthesis
SYNOPSIS
POST https://x402.org/v1/voice
     Content-Type: application/json
     X-PAYMENT:    <signed-transferWithAuthorization>

     { ... }
↳ first call → 402 Payment Required. Sign a USDC transferWithAuthorization, then retry with the X-PAYMENT header.
DESCRIPTION

Text-to-speech (TTS) voice synthesis via Venice TTS, with Kokoro, xAI, ElevenLabs, Orpheus, and MiniMax models. Offers 30+ voices and returns audio as MP3, WAV, Opus, AAC, or FLAC.

INPUT  request schema
property  type    req?      description
text      string  required  Max 4000 chars.
voice     string  optional  Default 'af_sky'.
model     string  optional  Default 'tts-kokoro'. Other options: tts-xai-v1, tts-elevenlabs-turbo-v2-5, tts-orpheus, etc.
speed     number  optional  0.25-4. Default 1.
format    string  optional  mp3 (default), wav, opus, aac, flac.
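
The request schema above can be checked on the client before paying for a call. The following is a minimal sketch in Python, not the service's own validation: the helper name `build_voice_request` is hypothetical, and it assumes that "chars" means Python string length and that an empty `text` is rejected, neither of which the schema states.

```python
# Limits and defaults copied from the INPUT table.
ALLOWED_FORMATS = {"mp3", "wav", "opus", "aac", "flac"}
MAX_TEXT_CHARS = 4000
MIN_SPEED, MAX_SPEED = 0.25, 4


def build_voice_request(text, voice="af_sky", model="tts-kokoro",
                        speed=1, format="mp3"):
    """Return a JSON-serializable request body, or raise ValueError.

    Hypothetical helper: the server may apply different or extra checks.
    """
    # Assumption: an empty string counts as a missing required field.
    if not isinstance(text, str) or not text:
        raise ValueError("text is required")
    # Assumption: the 4000-char limit is measured like len() in Python.
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(f"text exceeds {MAX_TEXT_CHARS} chars")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be within {MIN_SPEED}-{MAX_SPEED}")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    return {"text": text, "voice": voice, "model": model,
            "speed": speed, "format": format}
```

Calling `build_voice_request("Hello")` fills in the documented defaults. The voice and model names are passed through unchecked because the page lists only some of the accepted values.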
OUTPUT  response shape
field            type    description
audio_url        string  Signed URL to the generated audio file hosted on R2 or similar object storage.
file_size_bytes  number  Size of the generated audio file in bytes.
content_type     string  MIME type of the audio response, like audio/mpeg or audio/wav.
format           string  Audio container/codec returned: mp3, wav, opus, aac, or flac.
voice            string  Voice ID used for synthesis from the 30+ Venice TTS voices.
model            string  TTS model that produced the audio: Kokoro, xAI, ElevenLabs, Orpheus, or MiniMax.
speed            number  Playback speed multiplier applied during synthesis, where 1.0 is normal pace.
input_chars      number  Number of characters in the input text that were synthesized.
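
One way to consume the response shape above is to map it onto a typed record. This is a sketch only: the class name `VoiceResult` is hypothetical, and it assumes the fields arrive as a flat top-level JSON object, which the table does not confirm.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class VoiceResult:
    """Hypothetical typed view of the documented OUTPUT fields."""
    audio_url: str
    file_size_bytes: int
    content_type: str
    format: str
    voice: str
    model: str
    speed: float
    input_chars: int

    @classmethod
    def from_json(cls, data: dict) -> "VoiceResult":
        names = [f.name for f in fields(cls)]
        missing = [name for name in names if name not in data]
        if missing:
            raise KeyError(f"response is missing fields: {missing}")
        # Undocumented extra keys are ignored rather than rejected.
        return cls(**{name: data[name] for name in names})
```

Because `audio_url` is described as a signed URL, it may expire; fetching the file soon after the call is the safer assumption.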
EXAMPLES  two ways to call
EXAMPLE 1 · curl
curl -X POST https://x402.org/v1/voice \
  -H 'Content-Type: application/json' \
  -d '{ }'
first response = 402 Payment Required with payment requirements; sign + retry with X-PAYMENT.
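
The 402-then-retry flow described here can be sketched in Python as follows. Everything beyond the `X-PAYMENT` header name and the 402 status is an assumption: `post` and `sign_payment` are hypothetical callables supplied by the caller, so the sketch performs no real HTTP and no real signing of the USDC transferWithAuthorization.

```python
import json


def call_with_payment(post, sign_payment, url, body):
    """POST once; on 402, sign the payment requirements and retry once.

    post(url, headers, payload) -> (status, parsed_json)   # hypothetical transport
    sign_payment(requirements)  -> X-PAYMENT header value  # hypothetical signer
    """
    headers = {"Content-Type": "application/json"}
    payload = json.dumps(body)

    status, response = post(url, headers, payload)
    if status != 402:
        # Paid already, free, or failed for another reason: hand it back.
        return status, response

    # The 402 body carries the payment requirements; the signer turns them
    # into the value sent back in the X-PAYMENT header.
    paid_headers = {**headers, "X-PAYMENT": sign_payment(response)}
    return post(url, paid_headers, payload)
```

Injecting the transport and the signer keeps wallet handling out of the calling code and lets the flow be exercised against a fake server before any USDC is spent.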
EXAMPLE 2 · mcp
# install once
claude mcp add x402 --command "npx x402-deployer-mcp"

# then ask Claude Code:
# "use the voice tool to ..."
MCP server handles payment automatically — your coding agent just calls the tool by name.
METADATA
tags     tts · speech · audio · voice · ai
env      VENICE_API_KEY · FAL_KEY
methods  POST
cluster  synthforge
price    $0.05 USDC per call
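
Since the price is per call and `text` is capped at 4000 chars, the cost of synthesizing a longer text can be estimated with simple arithmetic. The sketch below assumes the client splits the text into chunks of at most 4000 chars, one call per chunk; splitting is a client-side strategy, not something this page describes, and `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_CALL = Decimal("0.05")  # USDC, from the price field
MAX_TEXT_CHARS = 4000             # from the INPUT schema


def estimate_cost(total_chars):
    """Return (number_of_calls, total_price_in_usdc) for a text length."""
    if total_chars < 0:
        raise ValueError("total_chars must be non-negative")
    # Ceiling division: a partial chunk still costs one full call.
    calls = (total_chars + MAX_TEXT_CHARS - 1) // MAX_TEXT_CHARS
    return calls, calls * PRICE_PER_CALL
```

Under these assumptions a 10,000-char text needs 3 calls and costs 0.15 USDC. `Decimal` is used so the total does not pick up binary floating-point rounding.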
ADJACENT  other endpoints in synthforge
endpoint               price  description
music-generate         $0.05  Music generation / text-to-music / AI music / generative song / instrumental and vocal music.
text-to-speech         $0.05  Text to speech / TTS / voice generator.
remove-bg              $0.08  AI background remover / background eraser / cutout tool.
image-edit             $0.02  Image edit / instruction-based image edit / text-driven photo edit / nano-banana image editor / GPT-image-2 edit.
image-inpaint          $0.02  Image inpainting / mask-based image edit / fill in masked region / object replacement / face swap (mask-driven) / generative fill.
image-generate         $0.01  Image generate (fast/cheap) / text-to-image / AI art.
sound-effect-generate  $0.01  Sound effect generation / text-to-SFX / Foley generator / ElevenLabs sound effects / ambient audio synth.
image-generate-pro     $0.10  Image generate (pro) / premium text-to-image / Flux 2 Pro / Recraft / Seedream / Qwen Image 2 Pro / xAI Grok Imagine.
SEE ALSO
agentutility(7) · synthforge(7) · x402(7) · mcp(7) · llms.txt · registry.json · bazaar.x402.org