
Real-time Streaming API

Stream live audio for real-time speech-to-text transcription via WebSocket.

Overview

The real-time API lets you stream audio over a WebSocket connection and receive transcriptions while the audio is still being processed.

Endpoints

Initialize Session

POST /api/v4/realtime/initialize

Check Status

GET /api/v4/realtime/status
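
As a rough illustration, the two HTTP endpoints above could be addressed as follows. This is only a sketch: this section does not state the API host, the authentication scheme, or the request body, so `base_url` and `headers` are caller-supplied placeholders and the helper names are hypothetical. The code constructs request objects but does not send them.

```python
from urllib.request import Request

INITIALIZE_PATH = "/api/v4/realtime/initialize"  # POST, as listed above
STATUS_PATH = "/api/v4/realtime/status"          # GET, as listed above


def build_request(base_url, path, method, headers=None):
    """Build (but do not send) a request for one of the REST endpoints.

    base_url and headers are placeholders: the API host and the
    authentication scheme are not specified in this section.
    """
    url = base_url.rstrip("/") + path
    return Request(url, headers=headers or {}, method=method)


def initialize_request(base_url, headers=None):
    """Request object for POST /api/v4/realtime/initialize."""
    return build_request(base_url, INITIALIZE_PATH, "POST", headers)


def status_request(base_url, headers=None):
    """Request object for GET /api/v4/realtime/status."""
    return build_request(base_url, STATUS_PATH, "GET", headers)
```

The resulting object can be passed to `urllib.request.urlopen` once the real host and credentials are known.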

WebSocket Connection

Connect to wss://realtime.scriptix.io using one of these paths:

  • /v2/realtime - For microphone input
  • /v2/client/{token} - For stream input

See WebSocket Connection for complete connection details, query parameters, and examples.
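
The two connection paths above can be assembled with a small helper. This is a minimal sketch: the function names are hypothetical, and because the query parameters are documented elsewhere, `params` is left as an opaque dictionary supplied by the caller.

```python
from urllib.parse import quote, urlencode

REALTIME_HOST = "wss://realtime.scriptix.io"  # host named in this document


def _with_query(url, params):
    """Append URL-encoded query parameters, if any were given."""
    return f"{url}?{urlencode(params)}" if params else url


def microphone_url(params=None):
    """URL for microphone input (/v2/realtime)."""
    return _with_query(f"{REALTIME_HOST}/v2/realtime", params)


def stream_url(token, params=None):
    """URL for stream input (/v2/client/{token})."""
    if not token:
        raise ValueError("token must be a non-empty string")
    # Percent-encode the token so it stays a single path segment.
    return _with_query(f"{REALTIME_HOST}/v2/client/{quote(token, safe='')}", params)
```

Either URL can then be handed to the WebSocket client library of your choice.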

Connection States

The WebSocket connection progresses through the following states:

  • disconnected - Not connected
  • connecting - Establishing connection
  • loading - Connection established, loading model
  • listening - Ready to receive audio (for stream mode)
  • connected - Actively transcribing
  • error - Connection error occurred
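
On the client side, the states listed above could be modelled as an enumeration with a small tracker. This is a sketch under assumptions: the state names come from the list above, but `ConnectionState`, `parse_state`, and `StateTracker` are hypothetical names, and no particular order of transitions is enforced because this section does not specify one.

```python
from enum import Enum


class ConnectionState(Enum):
    DISCONNECTED = "disconnected"
    CONNECTING = "connecting"
    LOADING = "loading"
    LISTENING = "listening"
    CONNECTED = "connected"
    ERROR = "error"


def parse_state(name):
    """Map a state name to ConnectionState; raises ValueError if unknown."""
    return ConnectionState(name.strip().lower())


class StateTracker:
    """Remembers the current state and the sequence of states seen so far."""

    def __init__(self):
        self.current = ConnectionState.DISCONNECTED
        self.history = [self.current]

    def update(self, name):
        """Record a new state; return True if it differs from the current one."""
        new_state = parse_state(name)
        if new_state is self.current:
            return False
        self.current = new_state
        self.history.append(new_state)
        return True
```

A UI can call `update` whenever the connection reports a state and redraw only when it returns True.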

Message Protocol

Control messages and transcription results are exchanged as JSON; audio is sent as binary data. See Message Protocol for complete details.

Client sends:

  • Start command (microphone mode)
  • Binary audio data

Server sends:

  • State messages (loading, listening)
  • Partial results (is_final: false)
  • Final results (is_final: true, includes word timestamps)
  • Error messages
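
A client typically needs to sort incoming messages into the four server message types listed above. The sketch below shows one way to do that. Only the `is_final` field is taken from this document; the `error` and `state` keys are assumptions made for illustration, so check the Message Protocol page for the actual schema.

```python
import json


def classify_message(raw):
    """Return (kind, payload) for one incoming WebSocket message.

    kind is one of: "partial", "final", "error", "state", "unknown".
    """
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        return ("unknown", msg)
    if "is_final" in msg:
        # is_final distinguishes partial from final results (from this document).
        return ("final" if msg["is_final"] else "partial", msg)
    if "error" in msg:   # assumed key name
        return ("error", msg)
    if "state" in msg:   # assumed key name
        return ("state", msg)
    return ("unknown", msg)
```

For example, `classify_message('{"is_final": false}')` yields the kind `"partial"`, which a client might render as provisional text.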

Audio Sources

Two modes are supported:

  • Microphone - Audio is streamed from the client's microphone
  • Stream - The server fetches audio from a URL (MP3/M3U8, Azure Blob)
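
In microphone mode the client sends binary audio data over the socket. One common approach, sketched below, is to split a captured buffer into fixed-size chunks and send each chunk as one binary message. The chunk size of 4096 bytes and the function name are assumptions; this section does not specify a required frame size or audio encoding.

```python
def iter_audio_chunks(audio, chunk_size=4096):
    """Yield consecutive slices of a bytes buffer, each at most chunk_size long.

    chunk_size is an illustrative default, not a documented requirement.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]
```

Each yielded chunk would then be passed to the WebSocket client's binary send method.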

Key Features

  • 15-second connection timeout
  • Real-time partial and final results
  • Word-level timestamps and confidence scores
  • Optional speaker identification
  • Connection state tracking
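
A client may want to apply the same 15-second limit on its own side so that a stalled connection attempt fails predictably. The sketch below wraps any connection coroutine with `asyncio.wait_for`. It assumes the client enforces the limit itself; `connect` is a hypothetical caller-supplied coroutine function, not part of this API.

```python
import asyncio

CONNECT_TIMEOUT_S = 15.0  # the timeout stated in this document


async def with_connect_timeout(connect, timeout=CONNECT_TIMEOUT_S):
    """Await connect(), failing with ConnectionError if it takes too long.

    connect is any zero-argument coroutine function that opens the socket.
    """
    try:
        return await asyncio.wait_for(connect(), timeout)
    except asyncio.TimeoutError:
        raise ConnectionError(
            f"no connection within {timeout:g} seconds"
        ) from None
```

The returned value is whatever `connect` returns, so the helper works with any WebSocket client library that exposes an awaitable connect call.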

Notes

  • The WebSocket connection automatically handles state transitions
  • Partial results may change as more audio is processed
  • Final results are confirmed and won't change
  • Words in final results include timing information (start_ms, end_ms) and confidence scores
  • Speaker information is optional and only included when available
  • The connection timeout is 15 seconds
  • The WebSocket server is wss://realtime.scriptix.io
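
Because partial results may change while final results do not, a client usually keeps them separate: the latest partial overwrites the previous one, and each final result is appended permanently. The sketch below illustrates that pattern. The fields `is_final`, `start_ms`, and `end_ms` come from this document; the `text` key and the class and function names are assumptions made for illustration.

```python
class Transcript:
    """Accumulates final results and tracks the latest partial result."""

    def __init__(self):
        self.final_segments = []  # confirmed text, never rewritten
        self.partial = ""         # provisional text, replaced on each update

    def update(self, result):
        text = result.get("text", "")  # assumed key name
        if result.get("is_final"):
            self.final_segments.append(text)
            self.partial = ""  # the partial has been superseded
        else:
            self.partial = text

    def render(self):
        """Confirmed text followed by the current partial, if any."""
        parts = self.final_segments + ([self.partial] if self.partial else [])
        return " ".join(parts)


def word_duration_ms(word):
    """Duration of one word from a final result, in milliseconds."""
    return word["end_ms"] - word["start_ms"]
```

Calling `render` after every message gives a display string in which only the trailing partial portion can still change.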

Next Steps


Ready to stream? Start with Initialize Session.