WebSocket Connection

To use the Scriptix Real-time API, you must establish a WebSocket connection and stream audio data to receive live transcription results.

This section explains how to initiate, maintain, and close a WebSocket session properly, including key headers and best practices for implementation.


🔌 Connection Endpoint

Connect to the following secure WebSocket URL:

wss://realtime.scriptix.io/v2/realtime?language=<language-code>

This connection must remain open throughout your streaming session. The connection supports full-duplex communication between your client and Scriptix's real-time transcription engine.
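For example, the language query parameter can be appended when building the endpoint URL. A minimal sketch in Python (en-US is just an illustrative language code):

from urllib.parse import urlencode

BASE_URL = "wss://realtime.scriptix.io/v2/realtime"
language = "en-US"  # replace with your target language code

endpoint = f"{BASE_URL}?{urlencode({'language': language})}"
print(endpoint)  # wss://realtime.scriptix.io/v2/realtime?language=en-US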


🧾 Required Headers

When initiating the WebSocket connection, include the following headers as query parameters or in the initial upgrade request:

Header          Type     Required   Description
Authorization   string   ✅         Token used for authentication.
Content-Type    string   ✅         Must be audio/L16;rate=16000 (16-bit PCM, mono).
language        string   ✅         Target language for transcription (e.g. en-US).
Tip: Some organizations may have access to additional flags or language variants. Refer to the Parameter Reference or contact support for custom options.
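As a sketch, these values can be supplied on the upgrade request when opening the connection. The example below uses the third-party Python websockets package and assumes a Bearer-style token; the exact Authorization format depends on your account setup.

import asyncio
import websockets

ENDPOINT = "wss://realtime.scriptix.io/v2/realtime?language=en-US"

HEADERS = {
    "Authorization": "Bearer YOUR_API_TOKEN",   # assumption: Bearer scheme; substitute your real token
    "Content-Type": "audio/L16;rate=16000",     # 16-bit PCM, 16 kHz, mono
}

async def open_session():
    # Newer websockets releases use additional_headers=; older ones call it extra_headers=
    async with websockets.connect(ENDPOINT, additional_headers=HEADERS) as ws:
        print("WebSocket connection established")
        # ... stream audio and read transcripts here ...

asyncio.run(open_session())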


🧠 Connection Lifecycle

1. Open WebSocket

Initiate a connection to the endpoint above (wss://realtime.scriptix.io/v2/realtime) with your authentication headers.

2. Stream Audio

Begin sending audio chunks as binary data. Make sure to match the required encoding format.

3. Receive Transcripts

As audio is processed, you'll receive real-time JSON messages with partial and final transcription segments.

4. Close Gracefully

When finished, close the WebSocket cleanly from your client. Scriptix will finalize the last segments.
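Putting the four steps together, here is a minimal end-to-end sketch using the Python websockets package. The token, language code, and speech.raw file name are placeholders, and the 3,200-byte chunk size corresponds to ~100 ms of the audio/L16 format described below.

import asyncio
import json
import websockets

ENDPOINT = "wss://realtime.scriptix.io/v2/realtime?language=en-US"  # placeholder language
HEADERS = {
    "Authorization": "Bearer YOUR_API_TOKEN",  # placeholder token
    "Content-Type": "audio/L16;rate=16000",
}
CHUNK_BYTES = 3200         # ~100 ms of 16-bit, 16 kHz, mono PCM
AUDIO_FILE = "speech.raw"  # placeholder: headerless 16-bit PCM audio

async def send_audio(ws):
    # Step 2: stream raw binary chunks at a steady ~100 ms pace
    with open(AUDIO_FILE, "rb") as audio:
        while chunk := audio.read(CHUNK_BYTES):
            await ws.send(chunk)
            await asyncio.sleep(0.1)
    # Step 4: close gracefully so the last segments are finalized
    await ws.close()

async def receive_transcripts(ws):
    # Step 3: read JSON messages with partial and final segments
    async for message in ws:
        event = json.loads(message)
        if event.get("type") == "transcript":
            result = event["result"]
            label = "final" if result.get("is_final") else "partial"
            print(f"[{label}] {result.get('text')}")

async def main():
    # Step 1: open the WebSocket (older websockets releases use extra_headers=)
    async with websockets.connect(ENDPOINT, additional_headers=HEADERS) as ws:
        await asyncio.gather(send_audio(ws), receive_transcripts(ws))

asyncio.run(main())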


⚠️ Audio Streaming Format

  • Must use 16-bit linear PCM (audio/L16)
  • Sample rate must be 16,000 Hz
  • Mono channel only (1 channel)
  • Send small audio chunks (about 100 ms of audio at a time)
Caution: Audio that doesn't meet these requirements may produce poor results or be rejected by the API.
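For reference, at 2 bytes per sample (16-bit), 16,000 samples per second, and one channel, a ~100 ms chunk works out to 3,200 bytes:

SAMPLE_RATE_HZ = 16_000   # required sample rate
BYTES_PER_SAMPLE = 2      # 16-bit linear PCM
CHANNELS = 1              # mono
CHUNK_DURATION_S = 0.1    # ~100 ms per chunk

chunk_bytes = int(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * CHUNK_DURATION_S)
print(chunk_bytes)  # 3200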


📤 Sending Audio Chunks

Once connected:

  • Stream binary chunks over the WebSocket at regular intervals (no JSON wrapping)
  • Do not compress, encode, or base64 the audio data
  • Maintain a consistent flow; buffer underruns and overflows may degrade accuracy
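A minimal sending loop might look like the sketch below; ws is assumed to be an already-open connection from the Python websockets package, and speech.raw is a placeholder for headerless 16-bit, 16 kHz, mono PCM data.

import asyncio

CHUNK_BYTES = 3200  # ~100 ms of audio/L16 at 16 kHz, mono

async def stream_file(ws, path="speech.raw"):
    # Send raw PCM chunks as binary WebSocket frames at a steady pace
    with open(path, "rb") as audio:
        while chunk := audio.read(CHUNK_BYTES):
            await ws.send(chunk)      # raw bytes only: no JSON wrapping, no base64
            await asyncio.sleep(0.1)  # keep the flow close to real time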

📥 Receiving Messages

The server will send back JSON-formatted messages such as:

{
  "type": "transcript",
  "result": {
    "text": "Hello, welcome to Scriptix!",
    "is_final": true,
    "timestamp": 3.2
  }
}
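As a sketch, each incoming text frame can be parsed as JSON and routed on its type field. The fields below match the example message above; handling for other message types is an assumption and is covered in the Protocol section.

import json

def handle_message(raw: str) -> None:
    # Parse one server message and print transcript updates
    event = json.loads(raw)
    if event.get("type") == "transcript":
        result = event.get("result", {})
        label = "final" if result.get("is_final") else "partial"
        print(f"[{label} @ {result.get('timestamp')}s] {result.get('text')}")
    else:
        print("Unhandled message:", event)

handle_message('{"type": "transcript", "result": {"text": "Hello, welcome to Scriptix!", "is_final": true, "timestamp": 3.2}}')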

✅ Best Practices

Real-time transcription requires a stable, low-latency connection and consistent audio handling. Follow these best practices to ensure optimal results when integrating with the Scriptix WebSocket API.


🎧 Audio Streaming

  • Buffer ~100ms of audio per chunk. Send small, consistent chunks of audio to balance responsiveness and accuracy.

  • Use audio/L16, 16 kHz, mono. Refer to the Audio Encoding section for required format details.


🧵 Connection Handling

  • Use a coroutine or thread. Handle incoming messages in a separate thread or task from the one sending audio. This ensures that your client doesn't block or miss updates; see the sketch after this list.

  • Reconnect gracefully. If the connection drops, re-establish it cleanly using a fresh token and resume streaming.

  • Close the socket properly. Always close the connection intentionally from your client to receive any final transcript segments.
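The sketch below illustrates these points with the Python websockets package: a receive coroutine running alongside the sender, a simple reconnect loop that fetches a fresh token on each attempt (fetch_token is a hypothetical helper for your own auth flow), and an explicit close once streaming finishes.

import asyncio
import json
import websockets

ENDPOINT = "wss://realtime.scriptix.io/v2/realtime?language=en-US"  # placeholder language

def fetch_token() -> str:
    # Hypothetical helper: return a fresh API token from your own configuration or auth flow
    return "YOUR_API_TOKEN"

async def receiver(ws):
    # Runs concurrently with the sender so transcript updates are never missed
    async for message in ws:
        print(json.loads(message))

async def sender(ws, chunks):
    for chunk in chunks:
        await ws.send(chunk)
        await asyncio.sleep(0.1)
    await ws.close()  # intentional close so the final segments are flushed

async def stream_with_retry(chunks, max_attempts=3):
    for attempt in range(max_attempts):
        headers = {
            "Authorization": f"Bearer {fetch_token()}",  # fresh token on every attempt
            "Content-Type": "audio/L16;rate=16000",
        }
        try:
            # Older websockets releases use extra_headers= instead of additional_headers=
            async with websockets.connect(ENDPOINT, additional_headers=headers) as ws:
                await asyncio.gather(sender(ws, chunks), receiver(ws))
            return
        except websockets.ConnectionClosedError:
            print(f"Connection dropped, retrying ({attempt + 1}/{max_attempts})")

# Example: ~1 second of silence as ten 100 ms chunks of audio/L16
asyncio.run(stream_with_retry([bytes(3200)] * 10))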


🧭 Next Steps

Explore more implementation details:

  • 👉 Protocol: Message formats, types, and status events
  • 🎧 Audio Encoding: Supported formats and conversion tips
  • ⚡ Performance Tips: Optimizing latency and streaming accuracy