
🔄 Migration Guide (V1 → V2)

This guide explains the differences between Batch API v1 and v2 and what to update when migrating, focusing on session initialization and file upload behavior.


📌 Key Changes

| Feature | V1 | V2 |
|---|---|---|
| Upload Method | File sent directly via POST request | TUS protocol with resumable chunked uploads |
| Workflow | 1. Create session; 2. Upload file or give URL; 3. Check status | 1. Upload file with language & other properties; 2. Check status |
| Protocol Requirements | Content-Type: application/json header required | Must comply with TUS protocol |
| Token Requirement | Batch token from scriptix.app | Same: batch token from scriptix.app |

🌐 Batch API – Upload & Workflow Differences

🧾 Session Initialization

| Aspect | V1 | V2 |
|---|---|---|
| Endpoint | POST /api/v3/speech-to-text/session | Not Needed (handled automatically) |
| Header Key | Content-Type: application/json | Not Needed |
| Auth Header | x-zoom-s2t-key: <batch_token> | Same |
| Body | JSON with "language" and optional "media_source" | JSON with "language" only. File upload handled separately |
| Media Upload | Optional in request via "media_source" | Not part of session request; handled via TUS |
| Webhook Options | Supported (webhook_url, webhook_method, webhook_headers) | Still supported, now typically passed during file upload via metadata |

If you used media_source in v1 to start transcription by URL, this step must now be handled separately using metadata during the TUS upload step.


📤 File Upload Changes

| Aspect | V1 | V2 (TUS Required) |
|---|---|---|
| Upload Method | Direct file upload or via URL in session body | Separate file upload using the TUS protocol |
| Upload Endpoint | Combined with session init | POST https://api.scriptix.io/api/v3/files/ |
| Headers | x-zoom-s2t-key | Same (x-zoom-s2t-key) or Authorization (API) |
| Client Requirement | None (standard HTTP POST) | Requires TUS client (e.g., Uppy, tus-js-client, tus-py-client) |
| Metadata | Set in JSON body of session | Set via TUS metadata or .setMeta() depending on client |
| Chunked Upload | ❌ Not supported | ✅ Required (resumable, chunked uploads) |
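
To make the "resumable, chunked" requirement more concrete, here is a minimal sketch of the headers a TUS 1.0.0 client sends and of how a file is split into chunks. In practice a TUS client library handles this for you. The function names are hypothetical, and sending the x-zoom-s2t-key header on every PATCH request is an assumption rather than something this guide confirms.

```python
from typing import Dict, List, Tuple

TUS_VERSION = "1.0.0"  # protocol version defined by the TUS specification


def creation_headers(batch_token: str, file_size: int) -> Dict[str, str]:
    """Headers for the TUS creation request (the POST to the upload endpoint)."""
    return {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Length": str(file_size),  # total size of the file in bytes
        "x-zoom-s2t-key": batch_token,
    }


def patch_headers(batch_token: str, offset: int) -> Dict[str, str]:
    """Headers for one TUS PATCH request carrying a chunk that starts at `offset`."""
    return {
        "Tus-Resumable": TUS_VERSION,
        "Upload-Offset": str(offset),
        "Content-Type": "application/offset+octet-stream",
        "x-zoom-s2t-key": batch_token,  # assumption: auth is repeated per chunk
    }


def chunk_offsets(file_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover the whole file in order."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(chunk_size, file_size - start))
        for start in range(0, file_size, chunk_size)
    ]
```

If an upload is interrupted, a TUS client asks the server for the current offset with a HEAD request and resumes from there, which is why each chunk carries its own Upload-Offset.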

🧪 Migration Example

Before (V1) — POST session with media_source:

{
  "language": "en",
  "media_source": "https://example.com/audio.mp3",
  "keep_source": true,
  "meta_data": {
    "duration": "100"
  }
}

After (V2) — Separate session + TUS upload:

1. Session Init (optional in v2, since the session can be created automatically by the TUS upload):

{
  "language": "en"
}

2. Upload with TUS (metadata example):

{
  "language": "en",
  "keep_source": "true",
  "document": "{\"webhook_url\": \"https://example.com/webhook\"}"
}

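TUS clients send this metadata either through a client call such as .setMeta() or through the Upload-Metadata header, depending on the implementation. Every metadata value is a string, so booleans and nested JSON (such as the document field above) are passed as strings.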
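
If your client does not build the header for you, the TUS specification defines Upload-Metadata as comma-separated pairs of a key and its Base64-encoded value. The sketch below encodes the metadata from the example above. The function name is hypothetical, and the metadata keys are taken from this guide rather than from a confirmed schema.

```python
import base64
import json
from typing import Dict


def encode_upload_metadata(metadata: Dict[str, str]) -> str:
    """Encode string metadata as a TUS Upload-Metadata header value."""
    pairs = []
    for key, value in metadata.items():
        # TUS requires each value to be Base64 encoded; keys stay as plain ASCII.
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    return ",".join(pairs)


metadata = {
    "language": "en",
    "keep_source": "true",  # booleans are sent as strings
    # Nested options are serialized to a JSON string first.
    "document": json.dumps({"webhook_url": "https://example.com/webhook"}),
}
upload_metadata_header = encode_upload_metadata(metadata)
```

The resulting string is sent as the Upload-Metadata header on the TUS creation request.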

🔐 Token Usage

You must continue using a Batch token, created at:

👉 https://scriptix.app

No token change is needed between v1 and v2.


📥 Retrieving Results Differences

Retrieving results works the same in both v1 and v2, but how you get the session_id differs if you're using automatic session creation in v2.


🔄 V1 vs V2: Session ID Behavior

| Behavior | V1 | V2 |
|---|---|---|
| Session creation | Manual; always returns session ID | Optional; session created automatically by TUS |
| Retrieving session ID | From session creation response | From TUS upload response metadata (if not created manually) |
| Step required | None | Use the session_id from your manual session creation or from the TUS upload metadata, then call GET /api/v3/speech-to-text/session/{session_id}/result with the same Batch API token used for the upload |

✅ Migration Action (Only if skipping manual session creation in v2)

If you're not initializing a session manually in v2:

  1. After uploading a file via TUS, extract the file ID from the Location header.
  2. Call:
GET /api/v3/files/{file_id}
  3. Use the returned session_id to retrieve results as usual:
GET /api/v3/speech-to-text/session/{session_id}/result

No other changes are required — headers, response structure, and status codes remain the same.
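
The steps above can be sketched as a few small helpers. This is an illustration under assumptions: the helper names are hypothetical, the file ID is assumed to be the last path segment of the Location header, and the API host is assumed to be the same one used by the upload endpoint.

```python
from urllib.parse import urlparse

API_BASE = "https://api.scriptix.io"  # assumption: same host as the upload endpoint


def file_id_from_location(location: str) -> str:
    """Extract the file ID, assumed to be the last path segment of Location."""
    path = urlparse(location).path
    return path.rstrip("/").rsplit("/", 1)[-1]


def file_info_url(file_id: str) -> str:
    """URL to look up the uploaded file and find its session_id."""
    return f"{API_BASE}/api/v3/files/{file_id}"


def result_url(session_id: str) -> str:
    """URL to retrieve the transcription result for a session."""
    return f"{API_BASE}/api/v3/speech-to-text/session/{session_id}/result"
```

A client would call file_info_url() with the extracted ID, read session_id from the response, and then request result_url() with the same batch token.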


🗑️ Removing Results

There are no changes in how results are removed between Batch API v1 and v2.

The same endpoint, headers, and behavior apply:

  • DELETE /api/v3/speech-to-text/session/{session_id}
  • Must use the same batch token that created the session
  • Deletes both the transcript result and filename reference

✅ No migration action required for result removal.
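
For reference, a removal request could be built as in the sketch below using only the Python standard library. The function name is hypothetical and the API host is assumed to match the upload endpoint. The request is constructed but not sent here.

```python
import urllib.request

API_BASE = "https://api.scriptix.io"  # assumption: same host as the upload endpoint


def build_delete_request(session_id: str, batch_token: str) -> urllib.request.Request:
    """Build the DELETE request that removes a session's results."""
    url = f"{API_BASE}/api/v3/speech-to-text/session/{session_id}"
    return urllib.request.Request(
        url,
        method="DELETE",
        # Must be the same batch token that created the session.
        headers={"x-zoom-s2t-key": batch_token},
    )
```

Passing the returned object to urllib.request.urlopen() would perform the deletion.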


🌐 Realtime API – Connection Differences

The method for connecting via WebSocket remains the same in v1 and v2, but there are small differences in the endpoint format and language parameter handling.


🔄 What's Different

| Behavior | v1 Endpoint | v2 Endpoint |
|---|---|---|
| Host | api.scriptix.io | realtime.scriptix.io |
| Path | /realtime | /v2/realtime |
| Language Parameter | ?language=xx-yy (required) | ?language=xx (optional, ISO 639-1 format) |
| Token Parameter | Not required in URL (auth handled via headers) | ?token=<your_realtime_token> (required) |
| Language Fallback | Not supported | ✅ Auto-detection if no language provided |
| Parameter Placement | Path parameter | Query parameter(s): language optional, token required |

✅ Migration Action

  • Update the WebSocket URL from:
wss://api.scriptix.io/realtime?language=nl-nl

to:

wss://realtime.scriptix.io/v2/realtime?token=<your_realtime_token>&language=nl

  • Use ISO 639-1 codes such as nl, en, fr instead of full regional tags such as nl-nl.
  • Language is now optional; if it is not provided, automatic detection is used.
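
The URL change can be sketched as a small builder. The function name is hypothetical, and reducing a regional tag such as nl-nl to its first part is an assumption about how existing v1 language values map to v2.

```python
from typing import Optional
from urllib.parse import urlencode

REALTIME_BASE = "wss://realtime.scriptix.io/v2/realtime"


def build_realtime_url(token: str, language: Optional[str] = None) -> str:
    """Build the v2 WebSocket URL; omit language to rely on auto-detection."""
    params = {"token": token}
    if language:
        # Reduce a regional tag such as "nl-nl" to its two-letter primary code.
        params["language"] = language.split("-")[0].lower()
    return f"{REALTIME_BASE}?{urlencode(params)}"
```

Calling build_realtime_url(token) without a language leaves the parameter out entirely, so the service falls back to automatic detection.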


🔄 Realtime API Protocol – Differences (V1 → V2)

While the message structure and WebSocket flow remain largely the same between versions, the following minor differences exist in protocol properties:


🧾 Start Properties – Differences

| Property | V1 | V2 |
|---|---|---|
| actual_numbers | ✅ Available (default: false) | ❌ Not available |
| partial | ✅ Available (default: true) | ✅ Available (default: true) |

The actual_numbers property was supported in v1 during the "start" action but has been removed in v2 for simplification and standardization. The partial property remains available in v2 and works the same way as in v1.
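
If an existing v1 client keeps its start properties in a dictionary, one way to adapt it is to drop the keys that v2 no longer accepts. This is a minimal sketch with a hypothetical function name. The surrounding "start" message envelope is not shown in this guide, so it is left out here.

```python
# Properties this guide lists as removed from the "start" action in v2.
REMOVED_IN_V2 = {"actual_numbers"}


def to_v2_start_properties(v1_properties: dict) -> dict:
    """Return a copy of the v1 start properties without keys removed in v2."""
    return {
        key: value
        for key, value in v1_properties.items()
        if key not in REMOVED_IN_V2
    }
```

The original dictionary is left unchanged, so the same settings can still be used against v1 during a gradual migration.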

