
Migrate from Realtime API V1 to V2

This guide helps you upgrade from the Realtime API V1 to V2. V2 offers improved performance, better connection stability, and new features for live transcription applications.

Overview

Why Upgrade to V2?

Performance Improvements:

  • 40% lower latency for transcription results
  • Better handling of network interruptions
  • Improved audio quality detection
  • More efficient WebSocket connection management

New Features:

  • Partial results for real-time feedback
  • Confidence scores per word
  • Speaker diarization in real-time
  • Support for more audio formats
  • Better punctuation and capitalization
  • Custom vocabulary support

Reliability:

  • Automatic reconnection with state recovery
  • Better error handling and reporting
  • Connection health monitoring
  • Graceful degradation

Timeline

  • V1 Support: Continues through June 2025
  • V2 Recommended: For all new implementations
  • Migration Window: 6 months to migrate existing apps

Key Differences

1. WebSocket URL

V1:

wss://api.scriptix.io/v1/realtime

V2:

wss://api.scriptix.io/v2/realtime

2. Connection Initialization

V1: Simple connection with auth token

const ws = new WebSocket('wss://api.scriptix.io/v1/realtime');

ws.onopen = () => {
  ws.send(JSON.stringify({
    type: 'auth',
    token: 'YOUR_API_KEY'
  }));
};

V2: Session-based with initialization

// Step 1: Create session via REST API
const response = await fetch('https://api.scriptix.io/api/v2/realtime/sessions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    language: 'en',
    enable_diarization: true,
    enable_partial_results: true
  })
});

const { session_id, ws_url } = await response.json();

// Step 2: Connect to WebSocket
const ws = new WebSocket(`${ws_url}?session_id=${session_id}`);

3. Message Protocol

V1: Simple message types

// V1 Messages
{
  "type": "audio",
  "data": "base64_encoded_audio"
}

{
  "type": "result",
  "text": "transcribed text"
}

V2: Structured message protocol

// V2 Messages - More detailed
{
  "type": "audio",
  "audio": {
    "data": "base64_encoded_audio",
    "format": "pcm16",
    "sample_rate": 16000
  }
}

{
  "type": "partial_result",
  "result": {
    "text": "transcribed",
    "confidence": 0.95,
    "is_final": false
  }
}

{
  "type": "final_result",
  "result": {
    "text": "transcribed text",
    "confidence": 0.98,
    "words": [
      {
        "word": "transcribed",
        "start_time": 0.0,
        "end_time": 0.5,
        "confidence": 0.97
      },
      {
        "word": "text",
        "start_time": 0.5,
        "end_time": 0.8,
        "confidence": 0.99
      }
    ],
    "speaker_id": 1
  }
}

4. Audio Formats

V1: Limited to PCM16

// V1 only supported PCM16 at 16kHz

V2: Multiple formats supported

// Supported formats in V2:
// - PCM16 (16kHz, 8kHz)
// - MULAW (8kHz)
// - OPUS (16kHz, 48kHz)

const config = {
  audio_format: "opus",
  sample_rate: 48000
};

5. Partial Results

V1: No partial results - only final transcripts

V2: Real-time partial results

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  if (message.type === 'partial_result') {
    // Update UI with partial transcription
    updateTranscript(message.result.text, false);
  } else if (message.type === 'final_result') {
    // Update UI with final transcription
    updateTranscript(message.result.text, true);
  }
};

Migration Steps

Step 1: Update WebSocket URL

Before (V1):

const WS_URL = 'wss://api.scriptix.io/v1/realtime';

After (V2):

const API_URL = 'https://api.scriptix.io/api/v2/realtime';
const WS_BASE_URL = 'wss://api.scriptix.io/v2/realtime';

Step 2: Implement Session Creation

V2 requires creating a session first:

async function createRealtimeSession(config) {
  const response = await fetch(`${API_URL}/sessions`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      language: config.language || 'en',
      enable_diarization: config.enableDiarization || false,
      enable_partial_results: config.enablePartialResults !== false, // default to true
      audio_format: config.audioFormat || 'pcm16',
      sample_rate: config.sampleRate || 16000
    })
  });

  if (!response.ok) {
    throw new Error(`Session creation failed: ${response.statusText}`);
  }

  return await response.json();
}

Step 3: Update WebSocket Connection

async function connectRealtimeV2(config) {
  // Create session
  const session = await createRealtimeSession(config);

  // Connect to WebSocket
  const ws = new WebSocket(`${WS_BASE_URL}?session_id=${session.session_id}`);

  ws.onopen = () => {
    console.log('Connected to V2 Realtime API');
    // V2 doesn't need separate auth message
  };

  ws.onmessage = (event) => {
    const message = JSON.parse(event.data);
    handleRealtimeMessage(message);
  };

  ws.onerror = (error) => {
    console.error('WebSocket error:', error);
  };

  ws.onclose = (event) => {
    console.log('WebSocket closed:', event.code, event.reason);
    // Implement reconnection logic
  };

  return { ws, sessionId: session.session_id };
}

Step 4: Update Message Handlers

function handleRealtimeMessage(message) {
  switch (message.type) {
    case 'session_started':
      console.log('Session started:', message.session_id);
      break;

    case 'partial_result':
      // V2 feature - real-time updates
      onPartialResult(message.result);
      break;

    case 'final_result':
      // Final transcription
      onFinalResult(message.result);
      break;

    case 'error':
      console.error('Error:', message.error);
      onError(message.error);
      break;

    case 'session_ended':
      console.log('Session ended');
      onSessionEnd();
      break;

    default:
      console.warn('Unknown message type:', message.type);
  }
}

function onPartialResult(result) {
  // Update UI with partial transcription
  document.getElementById('transcript').textContent = result.text;
}

function onFinalResult(result) {
  // Add final transcription to history
  const finalText = result.text;
  const confidence = result.confidence;

  appendToTranscript(finalText, confidence);

  // V2 provides word-level details
  if (result.words) {
    displayWordTimings(result.words);
  }

  // V2 provides speaker information
  if (result.speaker_id !== undefined) {
    updateSpeaker(result.speaker_id);
  }
}
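
The helper functions referenced above (appendToTranscript, displayWordTimings, updateSpeaker) are application-specific. As one possible sketch, displayWordTimings could render the word-level details from the final_result message into a list; the element ID used here is an assumption:

function displayWordTimings(words) {
  // Render each word with its start/end time and confidence
  const list = document.getElementById('word-timings'); // assumed element ID
  list.innerHTML = '';

  for (const word of words) {
    const item = document.createElement('li');
    item.textContent =
      `${word.word} [${word.start_time.toFixed(2)}s - ${word.end_time.toFixed(2)}s] ` +
      `(confidence ${word.confidence})`;
    list.appendChild(item);
  }
}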

Step 5: Update Audio Sending

V1:

function sendAudio(audioData) {
  ws.send(JSON.stringify({
    type: 'audio',
    data: btoa(String.fromCharCode(...audioData))
  }));
}

V2:

function sendAudioV2(audioData, format = 'pcm16', sampleRate = 16000) {
  const message = {
    type: 'audio',
    audio: {
      data: btoa(String.fromCharCode(...audioData)),
      format: format,
      sample_rate: sampleRate
    }
  };

  ws.send(JSON.stringify(message));
}
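
If your audio source is the browser microphone, the captured samples still need to be converted to the format you declared for the session. A minimal sketch using the Web Audio API, assuming PCM16 at 16 kHz (ScriptProcessorNode is deprecated but keeps the example short):

async function streamMicrophone() {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(4096, 1, 1);

  processor.onaudioprocess = (event) => {
    const float32 = event.inputBuffer.getChannelData(0);

    // Convert 32-bit float samples (-1..1) to 16-bit PCM
    const pcm16 = new Int16Array(float32.length);
    for (let i = 0; i < float32.length; i++) {
      const s = Math.max(-1, Math.min(1, float32[i]));
      pcm16[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }

    sendAudioV2(new Uint8Array(pcm16.buffer), 'pcm16', 16000);
  };

  source.connect(processor);
  processor.connect(audioContext.destination);
}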

Step 6: Implement Reconnection Logic

V2 supports session recovery:

let reconnectAttempts = 0;
const MAX_RECONNECT_ATTEMPTS = 5;

async function reconnect(sessionId, config) {
  if (reconnectAttempts >= MAX_RECONNECT_ATTEMPTS) {
    console.error('Max reconnection attempts reached');
    return;
  }

  reconnectAttempts++;
  console.log(`Reconnecting... (Attempt ${reconnectAttempts})`);

  try {
    // V2 allows reconnecting to existing session
    const ws = new WebSocket(
      `${WS_BASE_URL}?session_id=${sessionId}&reconnect=true`
    );

    ws.onopen = () => {
      console.log('Reconnected successfully');
      reconnectAttempts = 0;
    };

    // ... rest of handlers
  } catch (error) {
    console.error('Reconnection failed:', error);
    setTimeout(() => reconnect(sessionId, config), 2000 * reconnectAttempts);
  }
}
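
One way to trigger this from the connection created in Step 3 is to call reconnect from the onclose handler when the socket closes abnormally; sessionId and config here are the values used in connectRealtimeV2, and close code 1000 means a normal closure:

ws.onclose = (event) => {
  console.log('WebSocket closed:', event.code, event.reason);

  if (event.code !== 1000) {
    // Abnormal closure - try to resume the same session
    reconnect(sessionId, config);
  }
};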

Complete Migration Example

V1 Implementation

// V1 Realtime Client
class RealtimeClientV1 {
  constructor(apiKey, language = 'en') {
    this.apiKey = apiKey;
    this.language = language;
    this.ws = null;
  }

  connect() {
    this.ws = new WebSocket('wss://api.scriptix.io/v1/realtime');

    this.ws.onopen = () => {
      // Authenticate
      this.ws.send(JSON.stringify({
        type: 'auth',
        token: this.apiKey,
        language: this.language
      }));
    };

    this.ws.onmessage = (event) => {
      const message = JSON.parse(event.data);

      if (message.type === 'result') {
        this.onResult(message.text);
      }
    };
  }

  sendAudio(audioData) {
    if (this.ws && this.ws.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify({
        type: 'audio',
        data: btoa(String.fromCharCode(...audioData))
      }));
    }
  }

  disconnect() {
    if (this.ws) {
      this.ws.close();
    }
  }

  onResult(text) {
    // Override this method
    console.log('Result:', text);
  }
}

V2 Implementation

// V2 Realtime Client
class RealtimeClientV2 {
  constructor(apiKey, config = {}) {
    this.apiKey = apiKey;
    this.config = {
      language: config.language || 'en',
      enableDiarization: config.enableDiarization || false,
      enablePartialResults: config.enablePartialResults !== false,
      audioFormat: config.audioFormat || 'pcm16',
      sampleRate: config.sampleRate || 16000
    };
    this.ws = null;
    this.sessionId = null;
  }

  async connect() {
    try {
      // Create session (the API expects snake_case field names)
      const response = await fetch(
        'https://api.scriptix.io/api/v2/realtime/sessions',
        {
          method: 'POST',
          headers: {
            'Authorization': `Bearer ${this.apiKey}`,
            'Content-Type': 'application/json'
          },
          body: JSON.stringify({
            language: this.config.language,
            enable_diarization: this.config.enableDiarization,
            enable_partial_results: this.config.enablePartialResults,
            audio_format: this.config.audioFormat,
            sample_rate: this.config.sampleRate
          })
        }
      );

      if (!response.ok) {
        throw new Error(`Session creation failed: ${response.statusText}`);
      }

      const session = await response.json();
      this.sessionId = session.session_id;

      // Connect WebSocket
      this.ws = new WebSocket(
        `wss://api.scriptix.io/v2/realtime?session_id=${this.sessionId}`
      );

      this.ws.onopen = () => {
        console.log('Connected to Realtime V2');
        this.onConnect();
      };

      this.ws.onmessage = (event) => {
        const message = JSON.parse(event.data);
        this.handleMessage(message);
      };

      this.ws.onerror = (error) => {
        console.error('WebSocket error:', error);
        this.onError(error);
      };

      this.ws.onclose = (event) => {
        console.log('WebSocket closed:', event.code, event.reason);
        this.onDisconnect();
      };

    } catch (error) {
      console.error('Connection failed:', error);
      throw error;
    }
  }

  handleMessage(message) {
    switch (message.type) {
      case 'session_started':
        this.onSessionStarted(message);
        break;

      case 'partial_result':
        this.onPartialResult(message.result);
        break;

      case 'final_result':
        this.onFinalResult(message.result);
        break;

      case 'error':
        this.onError(message.error);
        break;

      case 'session_ended':
        this.onSessionEnded();
        break;
    }
  }

  sendAudio(audioData) {
    if (this.ws && this.ws.readyState === WebSocket.OPEN) {
      const message = {
        type: 'audio',
        audio: {
          data: btoa(String.fromCharCode(...audioData)),
          format: this.config.audioFormat,
          sample_rate: this.config.sampleRate
        }
      };

      this.ws.send(JSON.stringify(message));
    }
  }

  async disconnect() {
    if (this.ws) {
      this.ws.close();
    }

    if (this.sessionId) {
      // End session via API
      await fetch(
        `https://api.scriptix.io/api/v2/realtime/sessions/${this.sessionId}`,
        {
          method: 'DELETE',
          headers: {
            'Authorization': `Bearer ${this.apiKey}`
          }
        }
      );
    }
  }

  // Override these methods
  onConnect() {}
  onDisconnect() {}
  onSessionStarted(data) {}
  onPartialResult(result) {
    console.log('Partial:', result.text);
  }
  onFinalResult(result) {
    console.log('Final:', result.text, `(${result.confidence})`);
  }
  onError(error) {
    console.error('Error:', error);
  }
  onSessionEnded() {}
}

// Usage
const client = new RealtimeClientV2('YOUR_API_KEY', {
  language: 'en',
  enablePartialResults: true,
  enableDiarization: true
});

client.onFinalResult = (result) => {
  console.log('Transcription:', result.text);
  console.log('Confidence:', result.confidence);
  console.log('Speaker:', result.speaker_id);
};

await client.connect();

Feature Mapping

V1 → V2 Feature Comparison

Feature                  V1    V2    Notes
Basic transcription      Yes   Yes   Same in both versions
Partial results          No    Yes   New in V2
Word-level timestamps    No    Yes   New in V2
Confidence scores        No    Yes   New in V2
Speaker diarization      No    Yes   New in V2
Multiple audio formats   No    Yes   V1 only supports PCM16
Session recovery         No    Yes   New in V2
Custom vocabulary        No    Yes   New in V2
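
Custom vocabulary is listed as new in V2 but is not shown elsewhere in this guide. As a rough sketch only, it would most likely be passed when creating the session; the custom_vocabulary field name below is an assumption, so check the V2 session reference for the actual parameter:

// Hypothetical request body - confirm the exact field name in the V2 reference
const response = await fetch('https://api.scriptix.io/api/v2/realtime/sessions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    language: 'en',
    custom_vocabulary: ['Scriptix', 'diarization', 'WebSocket'] // assumed parameter
  })
});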

Best Practices for V2

1. Handle Partial Results

let currentPartialText = '';

client.onPartialResult = (result) => {
  // Show partial result in real-time
  currentPartialText = result.text;
  updateLiveTranscript(currentPartialText);
};

client.onFinalResult = (result) => {
  // Replace partial with final
  appendFinalTranscript(result.text);
  currentPartialText = '';
};

2. Monitor Connection Health

setInterval(() => {
  if (client.ws.readyState === WebSocket.OPEN) {
    client.ws.send(JSON.stringify({ type: 'ping' }));
  }
}, 30000); // Ping every 30 seconds
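
If the server replies to pings, you can also treat a missing reply as a dead connection. A sketch, assuming the reply arrives as a pong message (the reply format is not documented in this guide):

let lastPong = Date.now();

client.ws.addEventListener('message', (event) => {
  const message = JSON.parse(event.data);
  if (message.type === 'pong') {
    lastPong = Date.now();
  }
});

setInterval(() => {
  // No reply for three ping intervals - assume the connection is dead
  if (Date.now() - lastPong > 90000) {
    client.ws.close();
  }
}, 30000);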

3. Handle Errors Gracefully

client.onError = (error) => {
  if (error.code === 'INSUFFICIENT_CREDITS') {
    showWarning('Low balance - please add credits');
  } else if (error.code === 'UNSUPPORTED_LANGUAGE') {
    showError('Selected language not supported');
  } else {
    showError(`Error: ${error.message}`);
  }
};

Testing Your Migration

Test Checklist

  • Session creation works
  • WebSocket connects successfully
  • Audio streaming works
  • Partial results display correctly
  • Final results are accurate
  • Reconnection logic works
  • Error handling implemented
  • Session cleanup on disconnect
  • Performance is acceptable
  • All supported languages tested
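
A quick smoke test covering the first few items, using the RealtimeClientV2 class from this guide (the audio chunk is silence, so expect empty or low-confidence results):

async function smokeTest(apiKey) {
  const client = new RealtimeClientV2(apiKey, { enablePartialResults: true });

  client.onPartialResult = (result) => console.log('partial:', result.text);
  client.onFinalResult = (result) => console.log('final:', result.text);
  client.onError = (error) => console.error('error:', error);

  client.onConnect = () => {
    // Audio streaming: 100 ms of 16 kHz PCM16 silence
    client.sendAudio(new Uint8Array(3200));
  };

  // Session creation + WebSocket connection
  await client.connect();

  // Session cleanup after a few seconds
  setTimeout(() => client.disconnect(), 5000);
}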

Performance Comparison

Latency Improvements

Metric            V1      V2      Improvement
First result      1.5s    0.8s    47% faster
Avg latency       800ms   450ms   44% faster
Connection time   500ms   300ms   40% faster

Common Issues

Issue 1: Session Creation Fails

Cause: Invalid API key or configuration

Solution:

try {
  await client.connect();
} catch (error) {
  console.error('Connection failed:', error.message);
  // Check API key and config
}

Issue 2: No Partial Results

Cause: Partial results not enabled

Solution:

const client = new RealtimeClientV2(apiKey, {
  enablePartialResults: true // Make sure this is true
});

Issue 3: Audio Not Transcribed

Cause: Incorrect audio format

Solution:

// Ensure format matches your audio source
const client = new RealtimeClientV2(apiKey, {
  audioFormat: 'pcm16', // Match your audio format
  sampleRate: 16000     // Match your sample rate
});

Support

Need help migrating?
