Retrieve a document
Use correct API token
Retrieve the document with the API token that is linked to the TranscriptSession. No other token is accepted.
IMPORTANT: Recommended Download Method
Direct streaming of document content is deprecated. For better performance and reliability, use the URL-based download method by setting direct_download=true.
Why switch?
- Reduces API server load
- Uses pre-generated files when available (faster)
- Enables browser-native download features
- Supports larger files more reliably
See URL Download Method below for details.
URL
GET https://api.scriptix.io/api/v3/speech-to-text/session/${sessionId}/document/${documentId}
Request headers
The following headers need to be present:
| Parameter | Value | Description |
|---|---|---|
| X-Zoom-S2T-Key | Scriptix Batch API Token | API key belonging to TranscriptSession |
Request path arguments
| Argument | Description |
|---|---|
| sessionId | Scriptix Transcript Session ID returned from a Batch Session |
| documentId | Scriptix Document ID returned from creating a document |
Request query parameters
| Key | Type | Default | Description |
|---|---|---|---|
| format | string | json | Returns the document in the specified format. See Supported Document Formats. |
| direct_download | boolean | false | RECOMMENDED: Set to true. When true, returns a time-limited download URL instead of embedded content. See URL Download Method. |
Both download methods (direct_download=true and direct_download=false) now return the same unified DocumentDownloadUrlResponse structure. The response always includes document metadata, with conditional fields based on the download method selected.
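The path arguments, query parameters, and header described above can be assembled as in the following minimal sketch. The helper names `buildDocumentUrl` and `buildHeaders` are hypothetical and not part of the Scriptix API. Unlike the API itself, the sketch defaults `direct_download` to true, since that is the recommended method.

```javascript
// Base URL taken from the endpoint documented above.
const BASE_URL = 'https://api.scriptix.io/api/v3/speech-to-text';

// Hypothetical helper: builds the retrieval URL for a document.
// Assumption: directDownload defaults to true here (the API default is false).
function buildDocumentUrl(sessionId, documentId, { format = 'json', directDownload = true } = {}) {
  const path = `${BASE_URL}/session/${encodeURIComponent(sessionId)}` +
    `/document/${encodeURIComponent(documentId)}`;
  const query = new URLSearchParams({
    format,
    direct_download: String(directDownload)
  });
  return `${path}?${query.toString()}`;
}

// Hypothetical helper: returns the header the endpoint requires.
function buildHeaders(apiToken) {
  return { 'X-Zoom-S2T-Key': apiToken };
}
```

The result can be passed straight to `fetch(buildDocumentUrl(sessionId, documentId, { format: 'srt' }), { headers: buildHeaders(token) })`.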
Supported Document Formats
The following formats are supported when retrieving a document using the format query parameter:
| Format | Description |
|---|---|
| json | Scriptix ExtendedDocumentModel JSON response |
| sbv | SBV subtitle format |
| srt | SRT subtitle format |
| ttml | TTML subtitle format |
| vtt | VTT subtitle format |
| docx | Microsoft Word document |
| template | AI generated template (either with a specific format or summarization) |
| stl | Subtitle format for broadcast |
| html | HTML document |
Response Codes
| Status code | Description |
|---|---|
| 200 | Document in requested format |
| 400 | Bad request |
| 401 | Unauthorized — no valid authentication found |
| 403 | Forbidden — access to resource is not allowed |
| 404 | Not found — the transcript session is not found or doesn't match the API token |
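A client would typically check the status code before reading the body. The sketch below maps the documented codes to messages. `describeStatus` is a hypothetical helper, and the fallback text for undocumented codes is an assumption.

```javascript
// Messages for the error codes listed in the table above.
const STATUS_MESSAGES = {
  400: 'Bad request',
  401: 'Unauthorized: no valid authentication found',
  403: 'Forbidden: access to resource is not allowed',
  404: 'Not found: the transcript session is not found or does not match the API token'
};

// Hypothetical helper: returns null on success, otherwise a message.
function describeStatus(status) {
  if (status === 200) {
    return null;
  }
  // Assumption: codes outside the table are reported generically.
  return STATUS_MESSAGES[status] ?? `Unexpected status ${status}`;
}
```

For example, after `const response = await fetch(...)`, a caller could throw `new Error(describeStatus(response.status))` whenever the helper returns a non-null value.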
Response Methods
There are two ways to retrieve document content:
- URL Download (RECOMMENDED) - Returns a time-limited download URL
- Direct Streaming (DEPRECATED) - Streams content directly through API
URL Download Method (RECOMMENDED)
When to Use
- Recommended for all production use cases
- Best for files that will be downloaded by browsers
- Optimal performance with pre-generated files
- Reduced server load and faster response times
Request Example
GET /api/v3/speech-to-text/session/{sessionId}/document/{documentId}?format=pdf&direct_download=true
X-Zoom-S2T-Key: YOUR_BATCH_API_TOKEN
Response: HTTP 200 OK
Content-Type: application/json
Response Format: APIResultResponse<DocumentDownloadUrlResponse>
{
  "result": {
    "download_url": "https://scriptixbox.blob.core.windows.net/documents/abc123/transcript.pdf?sv=2021-08-06&se=2025-12-07T15:30:00Z&sr=b&sp=r&sig=...",
    "expires_in": 3600,
    "filename": "transcript.pdf",
    "content_type": "application/pdf",
    "size_bytes": null,
    "document": null,
    "created": "2024-01-15T10:30:00Z",
    "language": "en",
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "last_modified": "2024-01-15T11:00:00Z",
    "type": "document",
    "timecode_offset": "00:00:00.000",
    "finished": false,
    "use_plain_document": false,
    "plain_document_changed": false
  },
  "count": 0,
  "total_results": 0
}
Response Schema
| Field | Type | Description |
|---|---|---|
| URL Fields (when direct_download=true) | | |
| download_url | string or null | Time-limited Azure SAS URL for downloading the file (null when direct_download=false) |
| expires_in | integer or null | URL expiration time in seconds, typically 3600 = 1 hour (null when direct_download=false) |
| Document Field (when direct_download=false) | | |
| document | object or null | Embedded document content (null when direct_download=true) |
| Common Fields (always present) | | |
| filename | string | Original filename with proper extension |
| content_type | string | MIME type of the document |
| size_bytes | integer or null | File size in bytes |
| Metadata Fields (always present) | | |
| id | string | Document unique identifier |
| created | string | ISO 8601 timestamp of document creation |
| last_modified | string | ISO 8601 timestamp of last modification |
| language | string or null | Document language code (e.g., "en") |
| type | string | Document type: "document" or "caption" |
| timecode_offset | string or null | Timecode offset in format "HH:MM:SS.mmm" |
| finished | boolean or null | Whether document processing is complete |
| use_plain_document | boolean or null | Whether plain document format is used |
| plain_document_changed | boolean or null | Whether plain document has been modified |
Download URLs expire after 1 hour (3600 seconds). If the URL expires, make a new API request to get a fresh URL. The actual document file remains available.
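A client that caches the download URL needs to know when to request a fresh one. The sketch below derives an expiry timestamp from `expires_in` and checks it with a safety margin. The helper names and the 60-second margin are assumptions, not part of the API.

```javascript
// Hypothetical helper: absolute expiry time in epoch milliseconds,
// computed from the expires_in value (seconds) in the response.
function computeExpiry(expiresInSeconds, receivedAtMs = Date.now()) {
  return receivedAtMs + expiresInSeconds * 1000;
}

// Hypothetical helper: true when the URL should be refreshed.
// Assumption: a 60-second margin avoids starting a download
// moments before the URL stops working.
function isExpired(expiresAtMs, nowMs = Date.now(), safetyMarginMs = 60000) {
  return nowMs >= expiresAtMs - safetyMarginMs;
}
```

When `isExpired` returns true, repeat the original API request with direct_download=true to obtain a new URL.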
Implementation Example
Using Direct Download (Recommended)
const response = await fetch(
  'https://api.scriptix.io/api/v3/speech-to-text/session/SESSION_ID/document/DOC_ID?format=pdf&direct_download=true',
  {
    headers: {
      'X-Zoom-S2T-Key': 'YOUR_API_TOKEN'
    }
  }
);
const { result } = await response.json();
// Access download URL (populated when direct_download=true)
if (result.download_url) {
  console.log('Download URL:', result.download_url);
  console.log('Expires in:', result.expires_in, 'seconds');
  // Access metadata (always available in unified response)
  console.log('Document ID:', result.id);
  console.log('Created:', result.created);
  console.log('Language:', result.language);
  console.log('Filename:', result.filename);
  // Option 1: let the browser download the file
  window.location.href = result.download_url;
  // Option 2 (use instead of option 1): fetch and process the file in code
  // const fileResponse = await fetch(result.download_url);
  // const blob = await fileResponse.blob();
  // Process blob...
}
Using Embedded Document (Backward Compatible)
const response = await fetch(
  'https://api.scriptix.io/api/v3/speech-to-text/session/SESSION_ID/document/DOC_ID?format=json',
  {
    headers: {
      'X-Zoom-S2T-Key': 'YOUR_API_TOKEN'
    }
  }
);
const { result } = await response.json();
// Access embedded document (populated when direct_download=false)
if (result.document) {
  console.log('Document content:', result.document);
  // Access metadata (now also available with embedded documents)
  console.log('Document ID:', result.id);
  console.log('Created:', result.created);
  console.log('Language:', result.language);
  console.log('Processing finished:', result.finished);
}
Direct Streaming Method (DEPRECATED)
This method is deprecated and maintained only for backwards compatibility. Use the URL Download Method instead.
When Used
- Only when direct_download=false (default for backwards compatibility)
- Not recommended for new implementations
Request Example
GET /api/v3/speech-to-text/session/{sessionId}/document/{documentId}?format=pdf
X-Zoom-S2T-Key: YOUR_BATCH_API_TOKEN
Response: HTTP 200 OK (Embedded Document)
When direct_download=false, the response uses the unified structure with embedded document content:
Content-Type: application/json
Response Format: APIResultResponse<DocumentDownloadUrlResponse>
{
  "result": {
    "document": [
      {
        "start": 525,
        "stop": 6252,
        "speaker": "M1",
        "text": "This is just an example.\non two lines."
      }
    ],
    "filename": "Example document.json",
    "content_type": "application/json",
    "size_bytes": null,
    "download_url": null,
    "expires_in": null,
    "id": "12780b54-5575-44e4-bbca-0a410b432183",
    "created": "2021-01-25T10:31:46.960411+00:00",
    "last_modified": "2021-01-25T10:31:46.960411+00:00",
    "language": "en",
    "type": "caption",
    "timecode_offset": "00:00:00.000",
    "finished": true,
    "use_plain_document": false,
    "plain_document_changed": false
  },
  "count": 0,
  "total_results": 0
}
Existing code that accesses data.result.document continues to work. The unified response now also includes full metadata fields that were previously unavailable in the embedded mode.
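Because both modes share one response structure, a single function can handle either case by checking which conditional field is populated. The following is a minimal sketch. `classifyResult` and the shape of the object it returns are hypothetical.

```javascript
// Hypothetical helper: inspects the "result" object of the unified
// response and reports which download mode produced it.
function classifyResult(result) {
  if (result.download_url) {
    // direct_download=true: a time-limited URL is provided.
    return {
      mode: 'url',
      url: result.download_url,
      expiresIn: result.expires_in
    };
  }
  if (result.document) {
    // direct_download=false: the content is embedded.
    return { mode: 'embedded', document: result.document };
  }
  // Assumption: neither field set is treated as "no content".
  return { mode: 'none' };
}
```

Metadata fields such as `id`, `created`, and `language` can be read from `result` in both modes.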
Response: HTTP 200 OK (Other Formats)
For all non-JSON formats (pdf, docx, srt, vtt, etc.), the response streams binary content:
Content-Type: Varies by format (e.g., application/pdf, text/plain, application/vnd.openxmlformats-officedocument.wordprocessingml.document)
Response Body: Raw file content (binary stream)
Headers:
Content-Disposition: attachment; filename="document.{ext}"
Content-Type: {mime-type}
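When saving a streamed file, the filename can be read from the Content-Disposition header. The sketch below covers only the plain `filename=` form shown above. `filenameFromContentDisposition` is a hypothetical helper, and the extended `filename*=` form is deliberately not handled.

```javascript
// Hypothetical helper: extracts the filename from a
// Content-Disposition header value, or returns null if absent.
function filenameFromContentDisposition(header) {
  if (!header) {
    return null;
  }
  // Matches filename="quoted value" or filename=unquoted-value.
  const match = /filename=(?:"([^"]*)"|([^;\s]+))/i.exec(header);
  if (!match) {
    return null;
  }
  return match[1] ?? match[2];
}
```

With the Fetch API, the header value is available as `response.headers.get('Content-Disposition')`.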
Streaming large files directly through the API can:
- Increase server load
- Result in slower downloads
- Miss opportunities for CDN caching
- Require keeping HTTP connections open longer
Format Examples
The following examples show the content structure of various document formats. These examples represent the file content you'll receive either:
- Via the URL returned when using direct_download=true (recommended)
- Through direct streaming when using direct_download=false (deprecated)
TTML Example Output
Below is an example of a TTML subtitle document generated by Scriptix:
<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en-us">
<head>
<metadata xmlns:ttm="http://www.w3.org/ns/ttml#metadata">
<ttm:title>Scriptix TTML</ttm:title>
</metadata>
<styling xmlns:tts="http://www.w3.org/ns/ttml#styling">
<style xml:id="s1" tts:textAlign="center" tts:fontFamily="Arial" tts:fontSize="100%"/>
</styling>
<layout xmlns:tts="http://www.w3.org/ns/ttml#layout">
<region xml:id="bottom" tts:displayAlign="after" tts:extent="80% 40%" tts:origin="10% 50%"/>
</layout>
</head>
<body region="bottom" style="s1">
<div>
<p begin="00:00:00.000" end="00:00:02.850" style="s1" region="bottom">the quick brown fox jumps over the</p>
<p begin="00:00:02.850" end="00:00:03.570" style="s1" region="bottom">lazy dog</p>
</div>
</body>
</tt>
Subtitle Format Examples
SBV
00:00:00.000 --> 00:00:02.850
the quick brown fox jumps over the
00:00:02.850 --> 00:00:03.570
lazy dog
SRT
1
00:00:00.000 --> 00:00:02.850
the quick brown fox jumps over the
2
00:00:02.850 --> 00:00:03.570
lazy dog
VTT
WEBVTT
NOTE This file has been generated by Scriptix
00:00.000 --> 00:02.850
the quick brown fox jumps over the
00:02.850 --> 00:03.570
lazy dog
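The subtitle examples above use two timestamp shapes: HH:MM:SS.mmm and the shorter MM:SS.mmm seen in the VTT output. A client that post-processes these files may want both as milliseconds. The following is a minimal sketch with hypothetical helper names. It assumes the dot-separated form shown in these examples.

```javascript
// Hypothetical helper: converts "HH:MM:SS.mmm" or "MM:SS.mmm"
// to milliseconds. Throws on any other shape.
function timestampToMs(timestamp) {
  const match = /^(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(timestamp.trim());
  if (!match) {
    throw new Error(`Unrecognized timestamp: ${timestamp}`);
  }
  // The hours group is optional and defaults to 0 when absent.
  const [, hours = '0', minutes, seconds, millis] = match;
  const totalSeconds = (Number(hours) * 60 + Number(minutes)) * 60 + Number(seconds);
  return totalSeconds * 1000 + Number(millis);
}

// Hypothetical helper: parses a timing line such as
// "00:00:02.850 --> 00:00:03.570" into start and end milliseconds.
function parseCueTiming(line) {
  const [start, end] = line.split('-->');
  if (end === undefined) {
    throw new Error(`Not a timing line: ${line}`);
  }
  return { startMs: timestampToMs(start), endMs: timestampToMs(end) };
}
```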