Text to Speech (WebSocket)

Overview

The WebSocket endpoint provides low-latency, real-time audio streaming. Unlike the REST API, it delivers audio chunks as they are generated, which makes it well suited to:

Live voice assistants
Real-time call center applications
Interactive voice response (IVR) systems
Any application requiring immediate audio feedback

Authentication

Pass your API key in the Authorization header as a Bearer token when establishing the WebSocket connection.

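// Node.js with the `ws` package; the browser WebSocket API cannot set custom headers.
const WebSocket = require('ws');

const ws = new WebSocket('wss://api.tts.timepay.ai/api/v1/get_speech', {
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY'
  }
});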

Always use wss:// (secure WebSocket) in production environments.

Concurrency Model

You can open more connections than your plan's concurrency limit. The limit applies only to requests that are actively generating speech, not to idle connections.

For example, if your plan allows 5 concurrent requests:

You can maintain 10 open connections
Only 5 can generate speech simultaneously
Additional requests are rejected with a 429 rate limit error
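
One way to stay under the limit is to gate requests on the client. The following is a minimal sketch, assuming a plan limit of 5 as in the example above; `run_limited` and `fake_speech_request` are hypothetical names, and the fake request only sleeps instead of talking to the server.

```python
import asyncio

# Assumed plan limit, taken from the 5-request example; adjust to your plan.
MAX_CONCURRENT_REQUESTS = 5


async def run_limited(jobs, limit=MAX_CONCURRENT_REQUESTS):
    """Run coroutine factories with at most `limit` in flight at once.

    Returns (results, peak), where peak is the highest number of jobs
    that were running at the same time.
    """
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_speech_request():
    # Hypothetical stand-in for sending a request and awaiting "complete".
    await asyncio.sleep(0.01)
    return "done"


def demo():
    # Ten queued requests, but never more than five running together.
    return asyncio.run(run_limited([fake_speech_request] * 10))
```

In a real client, each job would send one speech request over an open connection and return once its `complete` or `error` message arrives.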

Request Format

Field	Type	Required / Default	Description
`type`	string	required	Must be "speech" for TTS requests.
`text`	string	required	The text to convert to speech.
`voice_id`	string	required	The specific voice ID to use for synthesis. See Available Voices below.
`language`	string	default: "en"	ISO language code for the speech output. Supported languages: en, hi, mr, ta, te, gu, kn, ml, bn, pa, od, as.
`add_wav_header`	boolean	default: true	If true, adds a WAV header to the audio stream for immediate playback.
`sample_rate`	number	default: 24000	Audio sample rate in Hz. Supported values: 8000, 16000, 24000.
`speed`	number	optional	Playback speed multiplier. Range: 0.5 to 2.0, where 1.0 is normal speed.
`request_id`	string	optional	Custom identifier for tracking requests. Auto-generated if not provided.
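
The constraints on the request fields can be checked on the client before anything is sent. This is a sketch only: `build_speech_request` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
import json

# Values copied from the request field descriptions.
SUPPORTED_LANGUAGES = {"en", "hi", "mr", "ta", "te", "gu",
                       "kn", "ml", "bn", "pa", "od", "as"}
SUPPORTED_SAMPLE_RATES = {8000, 16000, 24000}


def build_speech_request(text, voice_id, language="en", add_wav_header=True,
                         sample_rate=24000, speed=None, request_id=None):
    """Validate fields client-side and return the JSON payload as a string."""
    if not text:
        raise ValueError("text is required")
    if not voice_id:
        raise ValueError("voice_id is required")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate}")
    if speed is not None and not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload = {
        "type": "speech",
        "text": text,
        "voice_id": voice_id,
        "language": language,
        "add_wav_header": add_wav_header,
        "sample_rate": sample_rate,
    }
    # Optional fields are omitted so the server applies its own defaults.
    if speed is not None:
        payload["speed"] = speed
    if request_id is not None:
        payload["request_id"] = request_id
    return json.dumps(payload)
```

The returned string can be passed directly to the WebSocket `send` call.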

Available Voices

Voice Name	Voice ID
Kartik	`Ogbs15oBevLzXsUuTtA1`
Rahul	`Owbs15oBevLzXsUurdA_`
Nisha	`PAbs15oBevLzXsUu4dCi`
Tulsi	`PQbt15oBevLzXsUuNtD3`
Seema	`Pgbt15oBevLzXsUubdA6`

Example Request

{
  "type": "speech",
  "request_id": "req_12345",
  "text": "Hello, welcome to Vox.",
  "voice_id": "Ogbs15oBevLzXsUuTtA1",
  "language": "en",
  "add_wav_header": true,
  "sample_rate": 24000,
  "speed": 1.0
}

Minimal Request

{
  "type": "speech",
  "text": "Hello, welcome to Vox.",
  "voice_id": "PAbs15oBevLzXsUu4dCi"
}

Response Messages

The server sends multiple message types during a speech generation request:

Connection Established

Sent immediately after successful authentication.

Field	Type	Description
`type`	string	Always "connected".
`connection_id`	string	Unique identifier for this WebSocket connection.

{
  "type": "connected",
  "connection_id": "conn_abc123",
  "message": "Connected successfully"
}

Audio Chunks

Streamed audio data. Multiple chunks are sent per request.

Field	Type	Description
`type`	string	Always "audio_chunk".
`request_id`	string	The request identifier.
`chunk_index`	integer	Zero-indexed position in the audio stream.
`data`	string	Base64-encoded audio data (PCM 16-bit, 24 kHz, mono by default).

{
  "type": "audio_chunk",
  "request_id": "req_12345",
  "chunk_index": 0,
  "data": "UklGRiQAAABXQVZFZm10IBAAAAABAAEAQB8AAEAfAAABAAgA..."
}
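
Because `data` is Base64-encoded, each chunk has to be decoded before playback or storage. The sketch below joins the decoded chunks of one request; `assemble_audio` is a hypothetical helper, and sorting by `chunk_index` is a defensive assumption rather than documented server behavior.

```python
import base64


def assemble_audio(chunk_messages):
    """Decode audio_chunk messages and join them in chunk_index order."""
    ordered = sorted(chunk_messages, key=lambda m: m["chunk_index"])
    return b"".join(base64.b64decode(m["data"]) for m in ordered)
```

For streaming playback you would decode and hand off each chunk as it arrives instead of collecting them all first.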

Request Complete

Signals all audio has been sent.

Field	Type	Description
`type`	string	Always "complete".
`request_id`	string	The request identifier.
`total_chunks`	integer	Total number of audio chunks sent.
`characters`	integer	Characters processed (for billing verification).

{
  "type": "complete",
  "request_id": "req_12345",
  "total_chunks": 15,
  "characters": 47
}
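
The `total_chunks` field makes it possible to confirm that no chunk was missed. A minimal sketch, assuming the client records each `chunk_index` as it arrives; `is_stream_complete` is a hypothetical helper.

```python
def is_stream_complete(received_indices, complete_message):
    """Return True if every index from 0 to total_chunks - 1 was received once."""
    expected = complete_message["total_chunks"]
    return sorted(received_indices) == list(range(expected))
```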

Error

Indicates a problem with the request.

Field	Type	Description
`type`	string	Always "error".
`request_id`	string	The request identifier (if available).
`error`	string	Error code for programmatic handling.
`message`	string	Human-readable error description.

{
  "type": "error",
  "request_id": "req_12345",
  "error": "concurrency_limit_exceeded",
  "message": "Maximum concurrent requests (5) exceeded"
}

Error Codes

Code	Description
`invalid_request`	Malformed JSON or missing required fields
`invalid_voice`	Specified voice not found
`invalid_language`	Unsupported language code
`text_too_long`	Text exceeds maximum character limit
`concurrency_limit_exceeded`	Too many simultaneous requests
`insufficient_credits`	Account balance too low
`internal_error`	Server-side processing error
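
For programmatic handling, error messages can be turned into exceptions keyed on the `error` code. This is a sketch under one loud assumption: treating `concurrency_limit_exceeded` and `internal_error` as retryable is a client-side judgment call, not something this API documents. `SpeechError` and `raise_for_error` are hypothetical names.

```python
import json

# Assumption: only these codes are worth retrying; the others indicate a
# problem with the request or the account that a retry will not fix.
RETRYABLE_CODES = {"concurrency_limit_exceeded", "internal_error"}


class SpeechError(Exception):
    """Raised when the server sends a message with type "error"."""

    def __init__(self, code, message, request_id=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.request_id = request_id
        self.retryable = code in RETRYABLE_CODES


def raise_for_error(raw_message):
    """Parse a server message; raise SpeechError if it is an error."""
    message = json.loads(raw_message)
    if message.get("type") == "error":
        raise SpeechError(
            message.get("error", "unknown"),
            message.get("message", ""),
            message.get("request_id"),
        )
    return message
```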

Complete Example

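import asyncio
import json

import websockets

async def connect():
    headers = {'Authorization': 'Bearer YOUR_API_KEY'}
    # Newer websockets releases name this argument additional_headers.
    async with websockets.connect(
        'wss://api.tts.timepay.ai/api/v1/get_speech',
        extra_headers=headers
    ) as ws:
        await ws.send(json.dumps({
            'type': 'speech',
            'text': 'Hello, welcome to Vox.',
            'voice_id': 'PAbs15oBevLzXsUu4dCi'
        }))
        # Read messages until the request completes or fails.
        async for raw in ws:
            message = json.loads(raw)
            if message['type'] in ('complete', 'error'):
                break

asyncio.run(connect())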

Best Practices

Connection Management

Reuse WebSocket connections for multiple requests
Implement automatic reconnection with exponential backoff
Send periodic pings to keep connections alive (every 30 seconds)
Close connections gracefully when no longer needed
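
The reconnection advice can be sketched as a delay schedule. The base delay, cap, and jitter below are illustrative assumptions, not values recommended by the API; `backoff_delays` is a hypothetical helper.

```python
import random


def backoff_delays(attempts, base=1.0, cap=30.0, jitter=0.0, rng=random.random):
    """Return the wait in seconds before each reconnection attempt.

    The delay doubles on every attempt up to `cap`. A non-zero `jitter`
    adds up to that fraction of the delay at random, so that many clients
    do not reconnect at the same instant.
    """
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * 2 ** attempt)
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays
```

A reconnect loop would sleep for each delay in turn and stop as soon as a connection succeeds.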

Audio Handling

Buffer audio chunks before playback for a smoother experience
Default audio format: PCM 16-bit, 24kHz, mono
Use Web Audio API for browser playback
Consider using a streaming audio player for real-time playback
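
If you request audio with `add_wav_header` set to false, the raw PCM can be wrapped in a WAV container on the client using the standard `wave` module. This sketch assumes the default format listed above (16-bit, mono) and little-endian samples, which is what WAV expects; `pcm_to_wav` is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=24000):
    """Wrap raw 16-bit mono PCM in a WAV container and return the bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)      # mono
        wav_file.setsampwidth(2)      # 16-bit samples are 2 bytes wide
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```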

Error Handling

Always handle the error message type
Implement request timeouts (recommended: 30 seconds)
Queue requests when the concurrency limit is reached
Log request_id for debugging and support
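
A request timeout can be sketched with `asyncio.wait_for`. The 30-second default mirrors the recommendation above; `await_completion` is a hypothetical helper that wraps whatever awaitable resolves when your request finishes.

```python
import asyncio

REQUEST_TIMEOUT_SECONDS = 30.0


async def await_completion(pending, request_id, timeout=REQUEST_TIMEOUT_SECONDS):
    """Wait for a request to finish, raising TimeoutError if it takes too long."""
    try:
        return await asyncio.wait_for(pending, timeout)
    except asyncio.TimeoutError:
        # Include the request_id so the failure can be traced in logs.
        raise TimeoutError(
            f"request {request_id} timed out after {timeout}s"
        ) from None
```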
