Batch API V1 (Legacy)

DEPRECATED - DO NOT USE

This API version is deprecated and will be removed. Migrate to V3 immediately.

See the Migration Guide for upgrade instructions.

Overview

Legacy three-step process:

  1. Create session with language model
  2. Upload file or provide download URL
  3. Poll for results

V3 uses the TUS protocol for more reliable uploads. This documentation is kept for reference only.

Initiate batch session

Initiate a new asynchronous Speech to Text session for a specific language.

URL: POST https://api.scriptix.io/api/v3/speech-to-text/session

Request headers

The following headers are required:

Parameter | Value | Description
content-type | application/json |
x-zoom-s2t-key | Scriptix Batch API Token | API key of type batch needed for authorization

JSON Body Schema

Key | Type | Description
language | string | Set the language for this session. A list of available languages can be retrieved from the API portal.
webhook_url | string | (Optional) If set, an HTTPS callback will be made to a web endpoint once the transcription is done.
webhook_method | string | (Optional) Enum: POST, PUT. Specifies the method to use for the HTTP callback. Requires webhook_url to be set.
webhook_headers | string[] | (Optional) Array of headers that need to be present in the callback request. Requires webhook_url to be set.
keep_source | boolean | Default: false. If set to true, the uploaded file will be saved on Scriptix servers to be used in the editors. Files will be re-encoded to low quality.
media_source | string | (Optional) If set, the session starts immediately by downloading a file from the provided URL.
punctuation | boolean | Default: false. If set to true, punctuation will be enabled.

JSON Payload

{
  "language": "en"
}

Example JSON Payload for URL downloading:

{
  "keep_source": true,
  "media_source": "https://x-location/x-file.mp3",
  "language": "en",
  "meta_data": {
    "key": "value",
    "duration": "100"
  }
}
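The headers and body described above can be assembled as in the following sketch. It uses only the Python standard library and builds the request without sending it. The function name build_create_session_request and the placeholder token are illustrative assumptions, not part of the Scriptix API.

```python
import json
import urllib.request

# Endpoint as documented in "Initiate batch session".
SESSION_URL = "https://api.scriptix.io/api/v3/speech-to-text/session"


def build_create_session_request(api_key, language, **options):
    """Build (but do not send) the POST request that creates a session.

    `options` carries the optional body keys from the schema table,
    for example media_source, keep_source, punctuation or webhook_url.
    This helper is a hypothetical convenience, not an official client.
    """
    body = {"language": language, **options}
    return urllib.request.Request(
        SESSION_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-zoom-s2t-key": api_key,
        },
        method="POST",
    )


req = build_create_session_request(
    "YOUR_BATCH_API_TOKEN",  # placeholder, not a real key
    "en",
    keep_source=True,
    media_source="https://x-location/x-file.mp3",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.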

Successful creation

On success, the API responds with the created session, including its session_id.

{
  "count": 1,
  "total_results": 1,
  "result": {
    "duration": 0,
    "error": "string",
    "filename": "string",
    "media_type": "string",
    "media_url": "string",
    "session_id": "string",
    "status": "string"
  }
}

An example:

{
  "count": 1,
  "total_results": 1,
  "result": {
    "duration": 117,
    "error": null,
    "filename": "https---x-filelocation.com-x-file.mp3",
    "media_type": "audio",
    "media_url": "https://s3.gra.cloud.ovh.net/scriptix-data-dev/account-1/32d91a4f00000100000256e470-https---...8c722639a3662d2cf2a8349f7",
    "session_id": "32d91a4f00000100000256e470",
    "status": "uploaded"
  }
}
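A client typically needs only the session_id and status from this response. The sketch below extracts them from a response body shaped like the example above. Treating a non-null error field as a failure is an assumption made for illustration, and parse_session is a hypothetical helper name.

```python
import json


def parse_session(response_text):
    """Return (session_id, status) from a session-creation response body.

    Assumption: a non-null "error" field means the request failed.
    """
    payload = json.loads(response_text)
    result = payload["result"]
    if result.get("error"):
        raise RuntimeError(f"session creation failed: {result['error']}")
    return result["session_id"], result["status"]


# Trimmed copy of the documented example response.
example = """
{
  "count": 1,
  "total_results": 1,
  "result": {
    "duration": 117,
    "error": null,
    "media_type": "audio",
    "session_id": "32d91a4f00000100000256e470",
    "status": "uploaded"
  }
}
"""
session_id, status = parse_session(example)
```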

Upload Methods

Decode (Transcription):

PUT /api/v3/speech-to-text/session/${sessionId}

Headers: x-zoom-s2t-key, x-filename, Content-Type: audio/* or video/*
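The upload request can be sketched as follows, using the Python standard library. Sending the raw file bytes as the request body is an assumption inferred from the audio/* or video/* content type; build_upload_request is a hypothetical helper, and the request is built but not sent.

```python
import mimetypes
import urllib.request

SESSION_URL = "https://api.scriptix.io/api/v3/speech-to-text/session"


def build_upload_request(api_key, session_id, filename, media_bytes):
    """Build (but do not send) the PUT request that uploads media for decoding.

    Assumption: the body is the raw media file, not a multipart form.
    """
    content_type, _ = mimetypes.guess_type(filename)
    if content_type is None or not content_type.startswith(("audio/", "video/")):
        raise ValueError(f"{filename!r} does not look like an audio or video file")
    return urllib.request.Request(
        f"{SESSION_URL}/{session_id}",
        data=media_bytes,
        headers={
            "x-zoom-s2t-key": api_key,
            "x-filename": filename,
            "Content-Type": content_type,
        },
        method="PUT",
    )
```

Rejecting non-media file types locally avoids uploading a file the endpoint would not accept under the documented Content-Type constraint.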

Alignment:

POST /api/v3/speech-to-text/session/${sessionId}/align

Send the media file and its transcript as multipart/form-data. Extra costs may apply.

Retrieving Results

GET /api/v3/speech-to-text/session/${sessionId}/result

Header: x-zoom-s2t-key

The response includes duration, filename, status, and a results array with word-level data (word, time_start, time_end, confidence), plus speaker info and channel.
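Because results must be polled, a client needs a loop that repeats the GET until the session is finished. The sketch below keeps the HTTP call and the completion check as caller-supplied functions, since this page does not list the possible status values. The names poll_result, fetch_result and is_finished are assumptions, and the status strings in the demo are invented.

```python
import time


def poll_result(fetch_result, is_finished, interval_s=5.0,
                max_attempts=60, sleep=time.sleep):
    """Call fetch_result() until is_finished(result) is true, then return it.

    fetch_result: callable performing the GET and returning the parsed body.
    is_finished:  callable deciding whether the result is final.
    """
    for attempt in range(1, max_attempts + 1):
        result = fetch_result()
        if is_finished(result):
            return result
        if attempt < max_attempts:
            sleep(interval_s)  # wait before the next poll
    raise TimeoutError(f"no finished result after {max_attempts} attempts")


# Demo with a stand-in fetcher; "processing" and "done" are made-up statuses.
_responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "done", "results": []},
])
final = poll_result(
    fetch_result=lambda: next(_responses),
    is_finished=lambda r: r["status"] == "done",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
```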

Deleting Results

DELETE /api/v3/speech-to-text/session/${sessionId}

Erases the filename and transcript. The request must use the same token that was used to create the session.
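A minimal sketch of the delete call, built with the Python standard library and not sent. Passing the token in the x-zoom-s2t-key header is assumed by analogy with the other endpoints, and build_delete_request is a hypothetical helper name.

```python
import urllib.request

SESSION_URL = "https://api.scriptix.io/api/v3/speech-to-text/session"


def build_delete_request(api_key, session_id):
    """Build (but do not send) the DELETE request for a session.

    api_key should be the token that created the session.
    Assumption: it is sent in the x-zoom-s2t-key header.
    """
    return urllib.request.Request(
        f"{SESSION_URL}/{session_id}",
        headers={"x-zoom-s2t-key": api_key},
        method="DELETE",
    )
```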

Webhook Callback

V1 supported webhook callbacks on completion. See Webhooks for current webhook documentation in V3.

V1 Webhook Limitations:

  • Only successful transcriptions trigger callbacks
  • No results included in callback (must poll endpoint)
  • 20-second timeout with automatic retry
  • HTTPS required (no localhost or IP addresses)

Headers sent:

  • X-Zoom-Session: sessionId
  • X-Scriptix-Session: sessionId
  • Content-Type: application/json
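Since the callback carries no results, a receiver mainly needs the session id from these headers so it can fetch the result afterwards. The sketch below reads the id case-insensitively. Preferring X-Scriptix-Session over X-Zoom-Session is an arbitrary choice made for illustration, and session_id_from_headers is a hypothetical helper name.

```python
def session_id_from_headers(headers):
    """Return the session id from callback headers, or None if absent.

    HTTP header names are case-insensitive, so keys are lowercased first.
    Assumption: both headers carry the same id; X-Scriptix-Session is
    checked first and X-Zoom-Session is the fallback.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-scriptix-session") or lowered.get("x-zoom-session")


incoming = {
    "X-Zoom-Session": "32d91a4f00000100000256e470",
    "X-Scriptix-Session": "32d91a4f00000100000256e470",
    "Content-Type": "application/json",
}
session_id = session_id_from_headers(incoming)
```

The returned id can then be used with the result endpoint described under Retrieving Results.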