WebSocket Connection
To use the Scriptix Real-time API, you must establish a WebSocket connection and stream audio data to receive live transcription results.
This section explains how to initiate, maintain, and close a WebSocket session properly, including key headers and best practices for implementation.
Connection Endpoint
Connect to the following secure WebSocket URL:
wss://realtime.scriptix.io/v2/realtime?language=language-code
This connection must remain open throughout your streaming session. The connection supports full-duplex communication between your client and Scriptix's real-time transcription engine.
Required Headers
When initiating the WebSocket connection, supply the following values either as headers on the initial upgrade request or as query parameters:
Header | Type | Required | Description |
---|---|---|---|
Authorization | string | Yes | Token used for authentication. |
Content-Type | string | Yes | Must be audio/L16;rate=16000 (16-bit PCM, mono). |
language | string | Yes | Target language for transcription (e.g. en-US). |
Some organizations may have access to additional flags or language variants. Refer to the Parameter Reference or contact support for custom options.
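As an illustration, the connection and headers described above might be set up like this. The sketch below assumes the third-party websockets package and uses a placeholder token; older websockets releases take extra_headers=, newer ones additional_headers=.

```python
# Minimal sketch: open the real-time WebSocket with the required headers.
import asyncio
import websockets  # pip install websockets

URL = "wss://realtime.scriptix.io/v2/realtime?language=en-US"
HEADERS = {
    "Authorization": "your-api-token",        # placeholder token
    "Content-Type": "audio/L16;rate=16000",
}

async def main():
    # Newer websockets releases use additional_headers= instead of extra_headers=.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        print("Connected; ready to stream audio")
        # ... send audio and read transcripts here (see the sections below) ...

asyncio.run(main())
```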
Connection Lifecycle
1. Open WebSocket
Initiate a connection to the endpoint above, wss://realtime.scriptix.io/v2/realtime, with the authentication headers.
2. Stream Audio
Begin sending audio chunks as binary data. Make sure to match the required encoding format.
3. Receive Transcripts
As audio is processed, you'll receive real-time JSON messages with partial and final transcription segments.
4. Close Gracefully
When finished, close the WebSocket cleanly from your client. Scriptix will finalize the last segments.
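Putting the four steps together, a minimal end-to-end session might look like the sketch below. It assumes the websockets package and a placeholder audio.raw file that already contains PCM in the required format.

```python
# Sketch of the full session: open, stream, receive, close.
# Assumes the third-party "websockets" package and a placeholder audio.raw
# file already containing 16-bit, 16 kHz, mono PCM.
import asyncio
import json
import websockets

URL = "wss://realtime.scriptix.io/v2/realtime?language=en-US"
HEADERS = {
    "Authorization": "your-api-token",
    "Content-Type": "audio/L16;rate=16000",
}
CHUNK_BYTES = 3200       # ~100 ms of 16-bit, 16 kHz, mono PCM
CHUNK_SECONDS = 0.1

async def send_audio(ws):
    # 2. Stream audio: raw binary chunks, paced roughly in real time.
    with open("audio.raw", "rb") as f:
        while chunk := f.read(CHUNK_BYTES):
            await ws.send(chunk)
            await asyncio.sleep(CHUNK_SECONDS)

async def receive_transcripts(ws):
    # 3. Receive transcripts: JSON messages with partial and final segments.
    async for message in ws:
        data = json.loads(message)
        if data.get("type") == "transcript":
            print(data["result"]["text"])

async def run_session():
    # 1. Open the WebSocket with the required headers.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        receiver = asyncio.create_task(receive_transcripts(ws))
        await send_audio(ws)
        # 4. Close gracefully; transcripts delivered during the close
        # handshake are still drained by the receiver task.
        await ws.close()
        await receiver

asyncio.run(run_session())
```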
Audio Streaming Format
- Must use 16-bit linear PCM (audio/L16)
- Sample rate must be 16,000 Hz
- Mono channel only (1 channel)
- Send small audio chunks (~100 ms worth at a time)
Audio that doesn't meet these requirements may produce poor results or be rejected by the API.
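As a quick worked example, ~100 ms of audio in this format comes to 3,200 bytes per chunk:

```python
# Chunk size for ~100 ms of 16-bit, 16 kHz, mono linear PCM.
sample_rate = 16_000      # samples per second
bytes_per_sample = 2      # 16-bit samples
channels = 1              # mono
chunk_seconds = 0.1       # ~100 ms per chunk

chunk_bytes = int(sample_rate * bytes_per_sample * channels * chunk_seconds)
print(chunk_bytes)        # 3200
```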
Sending Audio Chunks
Once connected:
- Stream binary chunks over the WebSocket at regular intervals (no JSON wrapping)
- Do not compress, encode, or base64 the audio data
- Maintain a consistent flow; buffer underruns and overflows may degrade accuracy
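For live input, a synchronous sketch that streams raw microphone audio in ~100 ms chunks could look like the following. It assumes the third-party websocket-client and sounddevice packages and a placeholder token; these are illustrative choices, not requirements of the API.

```python
# Sketch: send raw binary PCM chunks from the microphone (no JSON wrapping,
# no compression or base64). Assumes "websocket-client" and "sounddevice".
import websocket          # pip install websocket-client
import sounddevice as sd  # pip install sounddevice

URL = "wss://realtime.scriptix.io/v2/realtime?language=en-US"
FRAMES_PER_CHUNK = 1600   # 100 ms at 16 kHz

ws = websocket.create_connection(
    URL,
    header=[
        "Authorization: your-api-token",
        "Content-Type: audio/L16;rate=16000",
    ],
)

# Capture 16-bit, 16 kHz, mono audio and forward each ~100 ms chunk as-is.
with sd.RawInputStream(samplerate=16000, channels=1, dtype="int16") as stream:
    try:
        while True:
            data, overflowed = stream.read(FRAMES_PER_CHUNK)
            ws.send_binary(bytes(data))
    except KeyboardInterrupt:
        pass

ws.close()
```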
Receiving Messages
The server will send back JSON-formatted messages such as:
```json
{
  "type": "transcript",
  "result": {
    "text": "Hello, welcome to Scriptix!",
    "is_final": true,
    "timestamp": 3.2
  }
}
```
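A message handler built around this payload might separate partial from final segments as in the sketch below. Only the fields shown above (type, result.text, is_final, timestamp) are assumed; other message types are covered on the Protocol page.

```python
# Sketch: route incoming JSON messages, treating partial and final
# transcript segments differently. Field names follow the example above.
import json

def handle_message(raw: str) -> None:
    data = json.loads(raw)
    if data.get("type") != "transcript":
        return  # other message types are described in the Protocol section
    result = data["result"]
    if result.get("is_final"):
        # Final segment: safe to store or render permanently.
        print(f"[{result['timestamp']:.1f}s] {result['text']}")
    else:
        # Partial segment: may still change; overwrite the current line.
        print(result["text"], end="\r", flush=True)

handle_message(
    '{"type": "transcript", "result": {"text": "Hello, welcome to Scriptix!", '
    '"is_final": true, "timestamp": 3.2}}'
)
```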
Best Practices
Real-time transcription requires a stable, low-latency connection and consistent audio handling. Follow these best practices to ensure optimal results when integrating with the Scriptix WebSocket API.
Audio Streaming
- Buffer ~100 ms of audio per chunk. Send small, consistent chunks of audio to balance responsiveness and accuracy.
- Use audio/L16, 16 kHz, mono. Refer to the Audio Encoding section for required format details.
Connection Handling
- Use a coroutine or thread. Handle incoming messages on a separate thread or task from the one sending audio, so your client doesn't block or miss updates.
- Reconnect gracefully. If the connection drops, re-establish it cleanly using a fresh token and resume streaming (see the sketch below).
- Close the socket properly. Always close the connection intentionally from your client to receive any final transcript segments.
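A reconnection loop with a simple exponential backoff might look like the sketch below. It assumes the websockets package plus a hypothetical fetch_token() helper that returns a fresh API token and a stream_session() placeholder for the sending/receiving logic shown earlier.

```python
# Sketch: reconnect with exponential backoff and a fresh token on each attempt.
# fetch_token() and stream_session() are hypothetical placeholders.
import asyncio
import websockets

URL = "wss://realtime.scriptix.io/v2/realtime?language=en-US"

def fetch_token() -> str:
    return "your-api-token"   # placeholder: obtain a fresh token here

async def stream_session(ws):
    ...                       # send audio and handle transcripts (see above)

async def run_with_reconnect():
    backoff = 1
    while True:
        headers = {
            "Authorization": fetch_token(),
            "Content-Type": "audio/L16;rate=16000",
        }
        try:
            async with websockets.connect(URL, extra_headers=headers) as ws:
                backoff = 1            # reset once connected
                await stream_session(ws)
                break                  # finished cleanly: stop reconnecting
        except (websockets.ConnectionClosedError, OSError):
            await asyncio.sleep(backoff)
            backoff = min(backoff * 2, 30)   # cap the wait at 30 seconds

asyncio.run(run_with_reconnect())
```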
Next Steps
Explore more implementation details:
- Protocol: Message formats, types, and status events
- Audio Encoding: Supported formats and conversion tips
- Performance Tips: Optimizing latency and streaming accuracy