Migration Guide (V1 → V2)

This guide explains the differences between Batch API v1 and v2 and what to update when migrating, focusing on session initialization and file upload behavior.

Key Changes

Feature	V1	V2
Upload Method	File sent directly via POST request	TUS protocol with resumable chunked uploads
Workflow	1. Create session 2. Upload file or give URL 3. Check status	1. Upload file with language & other properties 2. Check status
Protocol Requirements	`Content-Type: application/json` header required	Must comply with TUS protocol
Token Requirement	Batch token from scriptix.app	Same – batch token from scriptix.app

Batch API – Upload & Workflow Differences

Session Initialization

Aspect	V1	V2
Endpoint	`POST /api/v3/speech-to-text/session`	Not needed (handled automatically)
Header Key	`Content-Type: application/json`	Not needed
Auth Header	`x-zoom-s2t-key: <batch_token>`	Same
Body	JSON with `"language"` and optional `"media_source"`	JSON with `"language"` only. File upload handled separately
Media Upload	Optional in request via `"media_source"`	Not part of session request – handled via TUS
Webhook Options	Supported (`webhook_url`, `webhook_method`, `webhook_headers`)	Still supported, now typically passed during file upload via metadata

If you used media_source in v1 to start transcription by URL, this step must now be handled separately using metadata during the TUS upload step.

File Upload Changes

Aspect	V1	V2 (TUS Required)
Upload Method	Direct file upload or via URL in session body	Separate file upload using the TUS protocol
Upload Endpoint	Combined with session init	`POST https://api.scriptix.io/api/v3/files/`
Headers	`x-zoom-s2t-key`	Same (`x-zoom-s2t-key`) or `Authorization` (API)
Client Requirement	None (standard HTTP POST)	Requires TUS client (e.g., Uppy, tus-js-client, tus-py-client)
Metadata	Set in JSON body of session	Set via TUS metadata or `.setMeta()` depending on client
Chunked Upload	❌ Not supported	Required — resumable, chunked uploads
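
To make the "resumable, chunked" row concrete, here is a minimal sketch of the request shapes a TUS 1.0.0 upload involves: one creation request, then a series of PATCH requests that each carry an `Upload-Offset`. The endpoint and the `x-zoom-s2t-key` header come from the table above; the helper names (`creationHeaders`, `planChunks`) and the chunk size are illustrative assumptions, and in practice a TUS client such as tus-js-client builds these requests for you.

```typescript
// Upload endpoint from the table above.
const TUS_ENDPOINT = "https://api.scriptix.io/api/v3/files/";

interface ChunkPlan {
  offset: number; // value for the Upload-Offset header of this PATCH
  length: number; // number of bytes sent in this PATCH
}

// Headers for the TUS creation request (POST to TUS_ENDPOINT).
// Tus-Resumable and Upload-Length are defined by the TUS 1.0.0 protocol.
function creationHeaders(batchToken: string, fileSize: number): Record<string, string> {
  return {
    "Tus-Resumable": "1.0.0",
    "Upload-Length": String(fileSize),
    "x-zoom-s2t-key": batchToken,
  };
}

// Split a file of `fileSize` bytes into sequential PATCH requests.
// Hypothetical helper: a real TUS client chooses and tracks offsets itself.
function planChunks(fileSize: number, chunkSize: number): ChunkPlan[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  const chunks: ChunkPlan[] = [];
  for (let offset = 0; offset < fileSize; offset += chunkSize) {
    chunks.push({ offset, length: Math.min(chunkSize, fileSize - offset) });
  }
  return chunks;
}
```

If an upload is interrupted, a TUS client asks the server for the current offset and resumes from the matching chunk instead of starting over, which is the main practical gain over the single V1 POST.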

Migration Example

Before (V1) — POST session with `media_source`:

{
  "language": "en",
  "media_source": "https://example.com/audio.mp3",
  "keep_source": true,
  "meta_data": {
    "duration": "100"
  }
}

After (V2) — Separate session + TUS upload:

1. Session Init (optional; if skipped, a session is created automatically during the TUS upload):

{
  "language": "en"
}

2. Upload with TUS (metadata example)

{
  "language": "en",
  "keep_source": "true",
  "document": "{\"webhook_url\": \"https://example.com/webhook\"}"
}

TUS clients pass this metadata through a client call such as .setMeta() or through the Upload-Metadata header, depending on the implementation.
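
For clients that expose the raw header rather than a metadata object, the sketch below shows how the metadata above could be serialized. The TUS protocol defines `Upload-Metadata` as comma-separated `key base64(value)` pairs; the function names here are illustrative, and the metadata keys are the ones from the example above.

```typescript
// Base64-encode a string as UTF-8 bytes (works in browsers and recent Node).
function toBase64(value: string): string {
  const bytes = new TextEncoder().encode(value);
  let binary = "";
  bytes.forEach((b) => {
    binary += String.fromCharCode(b);
  });
  return btoa(binary);
}

// Serialize metadata into a TUS Upload-Metadata header value.
// Keys must be non-empty and must not contain spaces or commas.
function encodeUploadMetadata(meta: Record<string, string>): string {
  return Object.entries(meta)
    .map(([key, value]) => {
      if (key === "" || /[ ,]/.test(key)) {
        throw new Error(`Invalid metadata key: "${key}"`);
      }
      return `${key} ${toBase64(value)}`;
    })
    .join(",");
}

// All values are strings; nested options are passed as a JSON string.
const uploadMeta = {
  language: "en",
  keep_source: "true",
  document: JSON.stringify({ webhook_url: "https://example.com/webhook" }),
};

const uploadMetadataHeader = encodeUploadMetadata(uploadMeta);
```

With tus-js-client or Uppy you would normally hand over the plain `uploadMeta` object and let the client do this encoding.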

Token Usage

You must continue using a Batch token, created at:

https://scriptix.app

No token change is needed between v1 and v2.

Retrieving Results Differences

Retrieving results works the same in both v1 and v2, but how you get the session_id differs if you're using automatic session creation in v2.

V1 vs V2: Session ID Behavior

Behavior	V1	V2
Session creation	Manual — always returns session ID	Optional — session created automatically by TUS
Retrieving session ID	From session creation response	From TUS upload response metadata (if not created manually)
Step required	None	Use the `session_id` from your manual session creation or from the TUS upload metadata, then call: `GET /api/v3/speech-to-text/session/{session_id}/result` with the same Batch API token used for the upload

Migration Action (Only if skipping manual session creation in v2)

If you're not initializing a session manually in v2:

1. After uploading a file via TUS, extract the file ID from the Location header.
2. Call:

GET /api/v3/files/{file_id}

3. Use the returned session_id to retrieve results as usual:

GET /api/v3/speech-to-text/session/{session_id}/result

No other changes are required — headers, response structure, and status codes remain the same.
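
The lookup described above can be sketched as follows. Two things here are assumptions rather than documented behavior: that the file ID is the last path segment of the `Location` header, and that the file response exposes `session_id` either at the top level or under `result`. The function names are illustrative.

```typescript
const API_BASE = "https://api.scriptix.io"; // host taken from the upload endpoint

// Assumption: the file ID is the last non-empty path segment of the Location header.
function fileIdFromLocation(location: string): string {
  const path = location.split(/[?#]/)[0] ?? "";
  const segments = path.split("/").filter((s) => s.length > 0);
  const fileId = segments[segments.length - 1];
  if (!fileId) throw new Error(`No file ID found in Location: ${location}`);
  return fileId;
}

// Path of the result endpoint for a given session.
function resultPath(sessionId: string): string {
  return `/api/v3/speech-to-text/session/${encodeURIComponent(sessionId)}/result`;
}

// File lookup followed by result retrieval, using the same batch token as the upload.
async function fetchResultForUpload(location: string, batchToken: string): Promise<unknown> {
  const headers = { "x-zoom-s2t-key": batchToken };

  const fileRes = await fetch(`${API_BASE}/api/v3/files/${fileIdFromLocation(location)}`, { headers });
  if (!fileRes.ok) throw new Error(`File lookup failed: ${fileRes.status}`);

  // Assumption: session_id sits at the top level or under `result`.
  const file = (await fileRes.json()) as {
    session_id?: string;
    result?: { session_id?: string };
  };
  const sessionId = file.session_id ?? file.result?.session_id;
  if (!sessionId) throw new Error("File response did not include a session_id");

  const resultRes = await fetch(`${API_BASE}${resultPath(sessionId)}`, { headers });
  if (!resultRes.ok) throw new Error(`Result request failed: ${resultRes.status}`);
  return resultRes.json();
}
```

Check the actual file response once against your account before relying on either field location.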

Removing Results

There are no changes in how results are removed between Batch API v1 and v2.

The same endpoint, headers, and behavior apply:

DELETE /api/v3/speech-to-text/session/{session_id}
Must use the same batch token that created the session
Deletes both the transcript result and filename reference

No migration action required for result removal.

Realtime API – Connection Differences

The WebSocket connection flow is the same in v1 and v2, but the endpoint host and path, the language parameter, and token handling differ.

What's Different

Behavior	v1 Endpoint	v2 Endpoint
Host	`api.scriptix.io`	`realtime.scriptix.io`
Path	`/realtime`	`/v2/realtime`
Language Parameter	`?language=xx-yy` (required)	`?language=xx` (optional, ISO 639-1 format)
Token Parameter	Not required in URL (auth handled via headers)	`?token=<your_realtime_token>` (required)
Language Fallback	Not supported	Auto-detection if no language provided
Parameter Placement	Query parameter (`language` required)	Query parameters (`language` optional, `token` required)

Migration Action

Update the WebSocket URL from:

wss://api.scriptix.io/realtime?language=nl-nl

to:

wss://realtime.scriptix.io/v2/realtime?token=<your_realtime_token>&language=nl

Use ISO 639-1 codes such as nl, en, or fr instead of full regional tags such as nl-nl.

Language is now optional — if not provided, automatic detection will be used.
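
As a small sketch of the URL change, the helper below builds a v2 URL from a token and an optional language, dropping the region from v1-style tags. The function names are illustrative, and reducing a tag to its primary subtag is a simplifying assumption that holds for tags like nl-nl.

```typescript
const REALTIME_V2_BASE = "wss://realtime.scriptix.io/v2/realtime";

// Reduce a regional tag such as "nl-nl" to its two-letter primary subtag ("nl").
function toIso6391(tag: string): string {
  const primary = tag.trim().split(/[-_]/)[0] ?? "";
  return primary.toLowerCase();
}

// Build the v2 WebSocket URL. `token` is required; `language` is optional,
// and omitting it leaves language selection to automatic detection.
function buildRealtimeUrl(token: string, language?: string): string {
  const params = new URLSearchParams({ token });
  if (language) {
    params.set("language", toIso6391(language));
  }
  return `${REALTIME_V2_BASE}?${params.toString()}`;
}
```

Usage: `new WebSocket(buildRealtimeUrl(realtimeToken, "nl-nl"))` connects with `language=nl`, while `buildRealtimeUrl(realtimeToken)` relies on auto-detection.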

Realtime API Protocol – Differences (V1 → V2)

While the message structure and WebSocket flow remain largely the same between versions, the following minor differences exist in protocol properties:

Start Properties – Differences

Property	V1	V2
`actual_numbers`	Available (default: `false`)	❌ Not available
`partial`	Available (default: `true`)	Available (default: `true`)

The actual_numbers property was supported in v1 during the "start" action but has been removed in v2 for simplification and standardization. The partial property remains available in v2 and works the same way as in v1.
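
If your v1 client builds its start properties as an object, one way to migrate is to strip `actual_numbers` before sending. This is a minimal sketch that only touches the properties object; the surrounding message envelope is not shown here, and the function name is illustrative.

```typescript
// Return a copy of the v1 start properties without the removed `actual_numbers` key.
// All other properties, including `partial`, are passed through unchanged.
function toV2StartProperties(v1: Record<string, unknown>): Record<string, unknown> {
  const v2: Record<string, unknown> = { ...v1 };
  delete v2["actual_numbers"];
  return v2;
}
```

For example, `toV2StartProperties({ actual_numbers: true, partial: false })` yields `{ partial: false }`.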

Document Downloads – Streaming to URL-based Downloads

As of December 2024, we've introduced a more efficient way to download documents.

What Changed

Aspect	Old Method (Deprecated)	New Method (Recommended)
Query Parameter	None (default behavior)	Add `?direct_download=true`
Response	Streams file content through API	Returns unified JSON with download URL + metadata
Performance	Higher server load, slower for large files	Uses pre-generated files, faster downloads
Workflow	Single step: Request → File	Two steps: Request → URL → Download file
URL Lifetime	N/A	1 hour (3600 seconds)
Response Type	Separate response types	Unified `DocumentDownloadUrlResponse`

Why Migrate?

Performance: Pre-generated files download faster
Reliability: Better handling of large files
Scalability: Reduced API server load
Features: Browser-native downloads, CDN support

Affected Endpoints

GET /v3/speech-to-text/{session_id}/document/{document_id}
GET /v3/shared/{session_id}/document/{document_id} (magic links)

Migration Steps

Before (Deprecated)

Request:

GET /api/v3/speech-to-text/session/abc123/document/def456?format=pdf
X-Zoom-S2T-Key: YOUR_TOKEN

Response:

Content-Type: application/pdf
Body: Binary PDF content

Code:

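// Same request as above, assuming the api.scriptix.io host
const url = 'https://api.scriptix.io/api/v3/speech-to-text/session/abc123/document/def456?format=pdf';
const headers = { 'X-Zoom-S2T-Key': 'YOUR_TOKEN' };
const response = await fetch(url, { headers });
const blob = await response.blob();
// The file is streamed directly in the response body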

After (Recommended)

Request:

GET /api/v3/speech-to-text/session/abc123/document/def456?format=pdf&direct_download=true
X-Zoom-S2T-Key: YOUR_TOKEN

Response:

Content-Type: application/json
Body: Unified DocumentDownloadUrlResponse with metadata

{
  "result": {
    "download_url": "https://scriptixbox.blob.core.windows.net/.../file.pdf?...",
    "expires_in": 3600,
    "filename": "transcript.pdf",
    "content_type": "application/pdf",
    "size_bytes": null,
    "document": null,
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "created": "2024-01-15T10:30:00Z",
    "last_modified": "2024-01-15T11:00:00Z",
    "language": "en",
    "type": "document",
    "timecode_offset": "00:00:00.000",
    "finished": true,
    "use_plain_document": false,
    "plain_document_changed": false
  },
  "count": 0,
  "total_results": 0
}

Code:

// Step 1: Get download URL (`url` is the document URL including ?format=pdf)
const response = await fetch(url + '&direct_download=true', { headers });
const { result } = await response.json();

// Step 2 (option A): Download from the returned URL
const fileResponse = await fetch(result.download_url);
const blob = await fileResponse.blob();

// Step 2 (option B): Let the browser handle the download instead
window.location.href = result.download_url;

Backwards Compatibility

The old method still works (omit direct_download or set it to false) but is deprecated. Plan to migrate within the next 6 months.

Deprecation Timeline

Direct streaming will be supported until June 2025. After that date, the direct_download=true behavior may become the default.

Unified Response Structure

Response Unification (New)

As of the latest update, both download methods now return the same unified response structure (DocumentDownloadUrlResponse). The response always includes:

Conditional fields: download_url + expires_in (when direct_download=true) OR document (when direct_download=false)
Common fields: filename, content_type, size_bytes
Metadata fields: id, created, last_modified, language, type, timecode_offset, finished, etc.

This means you can now access document metadata regardless of the download method, making client code more consistent.

Format-Specific Notes

All Formats (Including JSON):

Add ?direct_download=true to get download URLs
When direct_download=false, JSON format returns embedded document content in the document field
Update code to handle the unified response structure

Frontend Integration Guide

Update TypeScript Types

Old Separate Types:

// BEFORE: Separate response types
interface DocumentShowResponse {
  result: {
    id: string;
    created: string;
    filename: string;
    type: "document" | "caption";
    document: DocumentSubtitleV1 | DocumentTranscriptV1 | null;
    // Limited metadata...
  };
}

interface DocumentDownloadUrlResponse {
  result: {
    download_url: string;
    expires_in: number;
    filename: string;
    content_type: string;
    size_bytes: number | null;
  };
}

New Unified Type:

// AFTER: Single unified response type
interface DocumentDownloadUrlResponse {
  result: {
    // Conditional fields
    download_url: string | null;          // Present when direct_download=true
    expires_in: number | null;            // Present when direct_download=true
    document: DocumentSubtitleV1 | DocumentTranscriptV1 | null;  // Present when direct_download=false

    // Common fields (always present)
    filename: string;
    content_type: string;
    size_bytes: number | null;

    // Metadata fields (always present)
    id: string;
    created: string;
    last_modified: string;
    language: string | null;
    type: "document" | "caption";
    timecode_offset: string | null;
    finished: boolean | null;
    use_plain_document: boolean | null;
    plain_document_changed: boolean | null;
  };
  count: number;
  total_results: number;
}

Update Client Code

Backward Compatible Approach:

async function getDocument(
  sessionId: string,
  documentId: string,
  format: string = 'json',
  directDownload: boolean = false
) {
  const params = new URLSearchParams({ format });
  if (directDownload) {
    params.append('direct_download', 'true');
  }

  const response = await fetch(
    `/v3/speech-to-text/${sessionId}/document/${documentId}?${params}`,
    { headers: { 'X-Zoom-S2T-Key': 'YOUR_TOKEN' } }
  );

  const data: DocumentDownloadUrlResponse = await response.json();

  if (directDownload && data.result.download_url) {
    // Handle direct download URL
    return {
      url: data.result.download_url,
      expiresIn: data.result.expires_in,
      filename: data.result.filename,
      contentType: data.result.content_type,
      metadata: {
        id: data.result.id,
        created: data.result.created,
        language: data.result.language,
        finished: data.result.finished,
      }
    };
  } else {
    // Handle embedded document content
    return {
      document: data.result.document,  // ✅ Still works!
      filename: data.result.filename,
      metadata: {
        id: data.result.id,
        created: data.result.created,
        language: data.result.language,
        finished: data.result.finished,
      }
    };
  }
}

React Component Example:

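import { useEffect, useState } from 'react';

interface Props {
  sessionId: string;
  documentId: string;
}

type DocumentResult = DocumentDownloadUrlResponse['result'];
type Metadata = Pick<DocumentResult, 'id' | 'created' | 'language' | 'finished'>;

function DocumentViewer({ sessionId, documentId }: Props) {
  // Typed state: without the generics, useState(null) only accepts null
  const [document, setDocument] = useState<DocumentResult['document']>(null);
  const [metadata, setMetadata] = useState<Metadata | null>(null);

  useEffect(() => {
    async function loadDocument() {
      const response = await fetch(
        `/v3/speech-to-text/${sessionId}/document/${documentId}?format=json`,
        { headers: { 'X-Zoom-S2T-Key': 'YOUR_TOKEN' } }
      );
      const data: DocumentDownloadUrlResponse = await response.json();

      // Access document content (backward compatible)
      setDocument(data.result.document);

      // NEW: Now you can also access metadata!
      setMetadata({
        id: data.result.id,
        created: data.result.created,
        language: data.result.language,
        finished: data.result.finished,
      });
    }
    // Log failures instead of leaving the promise rejection unhandled
    loadDocument().catch(console.error);
  }, [sessionId, documentId]);

  return (
    <div>
      {metadata && (
        <div>
          <p>Document ID: {metadata.id}</p>
          <p>Created: {new Date(metadata.created).toLocaleString()}</p>
          <p>Language: {metadata.language}</p>
          <p>Status: {metadata.finished ? 'Complete' : 'Processing'}</p>
        </div>
      )}
      {/* Render document content */}
    </div>
  );
}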

Edge Cases

What if the URL expires?

URLs expire after 1 hour
Simply make a new API request to get a fresh URL
The document itself remains available

Can I cache the download URL?

Yes, but respect the expires_in value
Re-request if URL is expired or near expiration
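
A cache check along these lines could look like the sketch below. The `CachedDownloadUrl` shape, the `isUsable` name, and the 60-second safety margin are illustrative assumptions; only `expires_in` (in seconds) comes from the API response.

```typescript
interface CachedDownloadUrl {
  url: string;          // result.download_url
  fetchedAtMs: number;  // Date.now() when the URL was requested
  expiresInSec: number; // result.expires_in (3600 for a 1-hour URL)
}

// True while the cached URL is still comfortably inside its lifetime.
// The safety margin avoids starting a download just before expiry.
function isUsable(
  entry: CachedDownloadUrl,
  nowMs: number = Date.now(),
  safetyMarginSec: number = 60
): boolean {
  const expiresAtMs = entry.fetchedAtMs + entry.expiresInSec * 1000;
  return nowMs < expiresAtMs - safetyMarginSec * 1000;
}
```

When `isUsable` returns false, request the document again with `direct_download=true` and replace the cache entry.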

What about rate limits?

Requesting a URL counts toward API rate limits
Downloading from the URL does not (it's direct from storage)

Is backward compatibility guaranteed?

Yes! Existing code accessing data.result.document continues to work
The unified response is additive (adds fields, doesn't remove them)
