{ "type": "<string>", "text": "<string>", "voice_id": "<string>", "language": "<string>", "add_wav_header": true, "sample_rate": 123, "speed": 123, "request_id": "<string>"}
{ "type": "<string>", "connection_id": "<string>", "message": "<string>"}
{ "type": "<string>", "request_id": "<string>", "chunk_index": 123, "data": "<string>"}
{ "type": "<string>", "request_id": "<string>", "total_chunks": 123, "characters": 123}
{ "type": "<string>", "request_id": "<string>", "error": "<string>", "message": "<string>"}
The WebSocket API provides real-time text-to-speech streaming with high-quality voice synthesis. It delivers audio chunks over a WebSocket connection as they are generated, enabling low-latency audio playback.
Pass the API key in the Authorization header as a Bearer token.
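As a minimal sketch of the authentication rule above, the header can be built as shown below. The function name `build_auth_headers` is hypothetical, and how the resulting headers are attached to the WebSocket handshake depends on the client library in use, which this page does not specify.

```python
def build_auth_headers(api_key: str) -> dict:
    """Return handshake headers carrying the API key as a Bearer token.

    Hypothetical helper: only the header name and the "Bearer" scheme
    come from the documentation above.
    """
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {"Authorization": f"Bearer {api_key}"}
```

Pass the returned dictionary to whatever option your WebSocket client exposes for custom handshake headers.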
Client request to synthesize text to speech.
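The request message could be assembled along these lines. This is a sketch under assumptions: the field names come from the request schema on this page, but the literal value of "type" is not documented here, so the caller must supply it. Treating language, sample_rate, and speed as optional is also an assumption, and `build_synthesis_request` is a hypothetical helper name.

```python
import json
import uuid


def build_synthesis_request(text, voice_id, *, message_type,
                            language=None, add_wav_header=True,
                            sample_rate=None, speed=None, request_id=None):
    """Serialize a synthesis request to a JSON string.

    message_type is required because the expected "type" value is not
    stated in the documentation; do not guess it.
    """
    message = {
        "type": message_type,
        "text": text,
        "voice_id": voice_id,
        "add_wav_header": add_wav_header,
        # Generate a request_id if none is given so replies can be matched.
        "request_id": request_id or str(uuid.uuid4()),
    }
    # Assumption: these fields may be omitted when unset.
    optional = {"language": language, "sample_rate": sample_rate, "speed": speed}
    message.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(message)
```

Supplying your own request_id makes it easier to correlate the audio chunk, completion, and error messages that echo it back.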
Sent by the server immediately after a successful WebSocket connection.
Streamed audio data. Multiple chunks are sent per request.
Signals that all audio for the request has been sent.
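Once the completion message arrives, the collected chunks can be joined into a single audio buffer. The sketch below assumes that "data" is base64-encoded audio and that "chunk_index" gives the playback order; neither is confirmed on this page. `assemble_audio` is a hypothetical helper that takes already-parsed messages.

```python
import base64


def assemble_audio(chunk_messages, done_message):
    """Join audio chunks for one request into a single bytes object.

    Assumptions: "data" is base64 text, and sorting by "chunk_index"
    yields the correct order (works for 0-based or 1-based indices).
    """
    request_id = done_message["request_id"]
    by_index = {
        m["chunk_index"]: m["data"]
        for m in chunk_messages
        if m["request_id"] == request_id
    }
    expected = done_message["total_chunks"]
    if len(by_index) != expected:
        raise ValueError(
            f"expected {expected} chunks for {request_id}, got {len(by_index)}"
        )
    return b"".join(base64.b64decode(by_index[i]) for i in sorted(by_index))
```

Comparing the number of received chunks against total_chunks catches dropped messages before the audio is written out. For low-latency playback you would feed chunks to the player as they arrive instead of waiting for the completion message.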
Indicates a problem with the request.
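One way to surface error messages in client code is to turn them into exceptions. In the sketch below, an incoming message is treated as an error when it carries an "error" field, which matches the five message shapes on this page but is still an assumption, since the literal "type" values are not documented. `SynthesisError` and `raise_if_error` are hypothetical names.

```python
import json


class SynthesisError(Exception):
    """Raised when the server reports a problem with a request."""

    def __init__(self, request_id, error, detail):
        super().__init__(f"{error}: {detail} (request_id={request_id})")
        self.request_id = request_id
        self.error = error
        self.detail = detail


def raise_if_error(raw):
    """Parse a raw JSON message; raise SynthesisError if it is an error.

    Assumption: only error messages contain an "error" key.
    Returns the parsed message otherwise.
    """
    payload = json.loads(raw)
    if "error" in payload:
        raise SynthesisError(
            payload.get("request_id"),
            payload["error"],
            payload.get("message", ""),
        )
    return payload
```

Calling `raise_if_error` on every incoming frame before further handling keeps the chunk and completion logic free of error checks.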