Retrieve a document

Use correct API token

Retrieve the document with the API token linked to the TranscriptSession. No other token can be used.

IMPORTANT: Recommended Download Method

Streaming Responses Deprecated

Direct streaming of document content is deprecated. For better performance and reliability, use the URL-based download method by setting direct_download=true.

Why switch?

Reduces API server load
Uses pre-generated files when available (faster)
Enables browser-native download features
Supports larger files more reliably

See URL Download Method below for details.

URL

GET https://api.scriptix.io/api/v3/speech-to-text/session/${sessionId}/document/${documentId}

Request headers

The following headers need to be present:

Parameter	Value	Description
X-Zoom-S2T-Key	Scriptix Batch API Token	API key belonging to TranscriptSession

Request path arguments

Argument	Description
sessionId	Scriptix Transcript Session ID returned from a Batch Session
documentId	Scriptix Document ID returned from creating a document

Request query parameters

Key	Type	Default	Description
format	string	`json`	Returns the document in the specified format. See Supported Document Formats.
direct_download	boolean	`false`	RECOMMENDED: set to `true`. When `true`, the response contains a time-limited download URL instead of embedded content. See URL Download Method.
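
As a rough illustration of how the path arguments and query parameters above combine, the following sketch assembles the request URL. The helper name `buildDocumentUrl` and its `options` shape are assumptions made for this example and are not part of the Scriptix API. The sketch defaults `directDownload` to `true` because that is the recommended method, even though the API itself defaults to `false`.

```javascript
// Hypothetical helper (not part of the Scriptix API): builds the retrieval URL
// from the documented path arguments and query parameters.
function buildDocumentUrl(sessionId, documentId, options = {}) {
  // Assumption: default to the recommended URL download method.
  const { format = 'json', directDownload = true } = options;

  const url = new URL(
    'https://api.scriptix.io/api/v3/speech-to-text/session/' +
      `${encodeURIComponent(sessionId)}/document/${encodeURIComponent(documentId)}`
  );
  url.searchParams.set('format', format);
  url.searchParams.set('direct_download', String(directDownload));
  return url.toString();
}

console.log(buildDocumentUrl('SESSION_ID', 'DOC_ID', { format: 'srt' }));
```

The returned string can be passed to `fetch` together with the `X-Zoom-S2T-Key` header.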

Unified Response

Both download methods (direct_download=true and direct_download=false) now return the same unified DocumentDownloadUrlResponse structure. The response always includes document metadata, with conditional fields based on the download method selected.

Supported Document Formats

The following formats are supported when retrieving a document using the format query parameter:

Format	Description
`json`	Scriptix ExtendedDocumentModel JSON response
`sbv`	SBV subtitle format
`srt`	SRT subtitle format
`ttml`	TTML subtitle format
`vtt`	VTT subtitle format
`docx`	Microsoft Word document
`template`	AI generated template (either with a specific format or summarization)
`stl`	Subtitle format for broadcast
`html`	HTML document

Response Codes

Status code	Description
200	Document in requested format
400	Bad request
401	Unauthorized — no valid authentication found
403	Forbidden — access to resource is not allowed
404	Not found — the transcript session is not found or doesn't match the API token
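
One possible way to turn the status codes above into readable errors on the client is sketched below. The function name `checkStatus` and the error wording are illustrative assumptions. Only the status codes and their meanings come from the table.

```javascript
// Messages taken from the Response Codes table.
const STATUS_MESSAGES = {
  400: 'Bad request',
  401: 'Unauthorized: no valid authentication found',
  403: 'Forbidden: access to resource is not allowed',
  404: "Not found: the transcript session is not found or doesn't match the API token",
};

// Hypothetical helper: returns silently on 200, throws a descriptive error otherwise.
function checkStatus(status) {
  if (status === 200) {
    return;
  }
  const message = STATUS_MESSAGES[status] ?? 'Unexpected response';
  throw new Error(`HTTP ${status}: ${message}`);
}
```

A caller would typically run `checkStatus(response.status)` before reading the response body.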

Response Methods

There are two ways to retrieve document content:

URL Download (RECOMMENDED) - Returns a time-limited download URL
Direct Streaming (DEPRECATED) - Streams content directly through API

URL Download Method (RECOMMENDED)

When to Use

Recommended for all production use cases
Best for files that will be downloaded by browsers
Optimal performance with pre-generated files
Reduced server load and faster response times

Request Example

GET /api/v3/speech-to-text/session/{sessionId}/document/{documentId}?format=pdf&direct_download=true
X-Zoom-S2T-Key: YOUR_BATCH_API_TOKEN

Response: HTTP 200 OK

Content-Type: application/json

Response Format: APIResultResponse<DocumentDownloadUrlResponse>

{
  "result": {
    "download_url": "https://scriptixbox.blob.core.windows.net/documents/abc123/transcript.pdf?sv=2021-08-06&se=2025-12-07T15:30:00Z&sr=b&sp=r&sig=...",
    "expires_in": 3600,
    "filename": "transcript.pdf",
    "content_type": "application/pdf",
    "size_bytes": null,
    "document": null,
    "created": "2024-01-15T10:30:00Z",
    "language": "en",
    "id": "123e4567-e89b-12d3-a456-426614174000",
    "last_modified": "2024-01-15T11:00:00Z",
    "type": "document",
    "timecode_offset": "00:00:00.000",
    "finished": false,
    "use_plain_document": false,
    "plain_document_changed": false
  },
  "count": 0,
  "total_results": 0
}

Response Schema

Field	Type	Description
URL Fields (when `direct_download=true`)
download_url	string | null	Time-limited Azure SAS URL for downloading the file (null when `direct_download=false`)
expires_in	integer | null	URL expiration time in seconds, typically 3600 = 1 hour (null when `direct_download=false`)
Document Field (when `direct_download=false`)
document	object | null	Embedded document content (null when `direct_download=true`)
Common Fields (always present)
filename	string	Original filename with proper extension
content_type	string	MIME type of the document
size_bytes	integer | null	File size in bytes
Metadata Fields (always present)
id	string	Document unique identifier
created	string	ISO 8601 timestamp of document creation
last_modified	string	ISO 8601 timestamp of last modification
language	string | null	Document language code (e.g., "en")
type	string	Document type: "document" or "caption"
timecode_offset	string | null	Timecode offset in format "HH:MM:SS.mmm"
finished	boolean | null	Whether document processing is complete
use_plain_document	boolean | null	Whether plain document format is used
plain_document_changed	boolean | null	Whether plain document has been modified
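
Because the same structure is returned for both download methods, client code can branch on which conditional field is populated. The sketch below shows one way to do that. The helper name `readResult` and the shape of its return value are assumptions for illustration only.

```javascript
// Hypothetical helper: reports which download method a unified result represents.
// Relies only on the documented rule that download_url is null when
// direct_download=false and document is null when direct_download=true.
function readResult(result) {
  if (result.download_url != null) {
    return { mode: 'url', url: result.download_url, expiresIn: result.expires_in };
  }
  if (result.document != null) {
    return { mode: 'embedded', document: result.document };
  }
  throw new Error('Response contains neither download_url nor document');
}

const example = readResult({
  download_url: 'https://example.invalid/transcript.srt',
  expires_in: 3600,
  document: null,
});
console.log(example.mode);
```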

URL Expiration

Download URLs expire after 1 hour (3600 seconds). If the URL expires, make a new API request to get a fresh URL. The actual document file remains available.
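
A client that holds on to a download URL may want to know when to request a fresh one. The following sketch derives the expiry time from `expires_in` and the moment the response was received. The function names and the 60-second safety margin are assumptions chosen for this example, not values defined by the API.

```javascript
// Hypothetical helper: expiry time, given when the API response was received
// and the expires_in value (in seconds) from that response.
function downloadUrlExpiry(receivedAt, expiresIn) {
  return new Date(receivedAt.getTime() + expiresIn * 1000);
}

// Hypothetical helper: true when the URL should be treated as expired.
// The safety margin (assumed: 60 seconds) avoids starting a download
// just before the URL stops working.
function isDownloadUrlExpired(receivedAt, expiresIn, now = new Date(), safetyMarginSeconds = 60) {
  const expiry = downloadUrlExpiry(receivedAt, expiresIn).getTime();
  return now.getTime() >= expiry - safetyMarginSeconds * 1000;
}

const receivedAt = new Date('2025-12-07T14:30:00Z');
console.log(downloadUrlExpiry(receivedAt, 3600).toISOString());
```

When `isDownloadUrlExpired` returns `true`, repeating the original API request yields a new URL for the same document.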

Implementation Example

Using Direct Download (Recommended)

const response = await fetch(
  'https://api.scriptix.io/api/v3/speech-to-text/session/SESSION_ID/document/DOC_ID?format=pdf&direct_download=true',
  {
    headers: {
      'X-Zoom-S2T-Key': 'YOUR_API_TOKEN'
    }
  }
);

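// Stop early on 4xx responses (see Response Codes)
if (!response.ok) {
  throw new Error(`Document request failed with HTTP ${response.status}`);
}

const { result } = await response.json();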

// Access download URL (populated when direct_download=true)
if (result.download_url) {
  console.log('Download URL:', result.download_url);
  console.log('Expires in:', result.expires_in, 'seconds');

  // Access metadata (always available in unified response)
  console.log('Document ID:', result.id);
  console.log('Created:', result.created);
  console.log('Language:', result.language);
  console.log('Filename:', result.filename);

  // Option 1: let the browser download the file
  window.location.href = result.download_url;

  // Option 2: fetch the file and process it in code instead of navigating
  // const fileResponse = await fetch(result.download_url);
  // const blob = await fileResponse.blob();
  // Process blob...
}

Using Embedded Document (Backward Compatible)

const response = await fetch(
  'https://api.scriptix.io/api/v3/speech-to-text/session/SESSION_ID/document/DOC_ID?format=json',
  {
    headers: {
      'X-Zoom-S2T-Key': 'YOUR_API_TOKEN'
    }
  }
);

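// Stop early on 4xx responses (see Response Codes)
if (!response.ok) {
  throw new Error(`Document request failed with HTTP ${response.status}`);
}

const { result } = await response.json();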

// Access embedded document (populated when direct_download=false)
if (result.document) {
  console.log('Document content:', result.document);

  // Access metadata (now also available with embedded documents)
  console.log('Document ID:', result.id);
  console.log('Created:', result.created);
  console.log('Language:', result.language);
  console.log('Processing finished:', result.finished);
}

Direct Streaming Method (DEPRECATED)

Deprecated Method

This method is deprecated and maintained only for backwards compatibility. Use the URL Download Method instead.

When Used

Only when direct_download=false (default for backwards compatibility)
Not recommended for new implementations

Request Example

GET /api/v3/speech-to-text/session/{sessionId}/document/{documentId}?format=pdf
X-Zoom-S2T-Key: YOUR_BATCH_API_TOKEN

Response: HTTP 200 OK (Embedded Document)

When direct_download=false, the response uses the unified structure with embedded document content:

Content-Type: application/json

Response Format: APIResultResponse<DocumentDownloadUrlResponse>

{
  "result": {
    "document": [
      {
        "start": 525,
        "stop": 6252,
        "speaker": "M1",
        "text": "This is just an example.\non two lines."
      }
    ],
    "filename": "Example document.json",
    "content_type": "application/json",
    "size_bytes": null,
    "download_url": null,
    "expires_in": null,
    "id": "12780b54-5575-44e4-bbca-0a410b432183",
    "created": "2021-01-25T10:31:46.960411+00:00",
    "last_modified": "2021-01-25T10:31:46.960411+00:00",
    "language": "en",
    "type": "caption",
    "timecode_offset": "00:00:00.000",
    "finished": true,
    "use_plain_document": false,
    "plain_document_changed": false
  },
  "count": 0,
  "total_results": 0
}

Backward Compatibility

Existing code that accesses data.result.document continues to work. The unified response now also includes full metadata fields that were previously unavailable in the embedded mode.
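
As an example of working with the embedded content, the sketch below flattens the `document` array from the sample response into speaker-prefixed text. It assumes every entry has the `speaker` and `text` fields shown in that sample. Other document types may be structured differently, and the helper name `segmentsToText` is hypothetical.

```javascript
// Hypothetical helper: turns embedded segments into "SPEAKER: text" lines.
// Assumes the segment shape from the sample response (speaker, text).
function segmentsToText(segments) {
  return segments
    .map((segment) => `${segment.speaker}: ${segment.text}`)
    .join('\n');
}

const sampleDocument = [
  { start: 525, stop: 6252, speaker: 'M1', text: 'This is just an example.\non two lines.' },
];
console.log(segmentsToText(sampleDocument));
```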

Response: HTTP 200 OK (Other Formats)

For all non-JSON formats (pdf, docx, srt, vtt, etc.), the response streams binary content:

Content-Type: Varies by format (e.g., application/pdf, text/plain, application/vnd.openxmlformats-officedocument.wordprocessingml.document)

Response Body: Raw file content (binary stream)

Headers:

Content-Disposition: attachment; filename="document.{ext}"
Content-Type: {mime-type}
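
When consuming the streamed response, the filename can be recovered from the `Content-Disposition` header. The sketch below handles only the quoted `filename="..."` form shown above. It is not a full parser for that header (for example, it ignores the `filename*` parameter), and the helper name is an assumption.

```javascript
// Hypothetical helper: extracts the quoted filename from a Content-Disposition
// header value, or returns null when no quoted filename is present.
function filenameFromContentDisposition(headerValue) {
  const match = /filename="([^"]*)"/i.exec(headerValue ?? '');
  return match ? match[1] : null;
}

console.log(filenameFromContentDisposition('attachment; filename="document.srt"'));
```

With `fetch`, the header value is available as `response.headers.get('Content-Disposition')`.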

Performance Impact

Streaming large files directly through the API can:

Increase server load
Result in slower downloads
Miss opportunities for CDN caching
Require keeping HTTP connections open longer

Format Examples

The following examples show the content structure of various document formats. These examples represent the file content you'll receive either:

Via the URL returned when using direct_download=true (recommended)
Through direct streaming when using direct_download=false (deprecated)

TTML Example Output

Below is an example of a TTML subtitle document generated by Scriptix:

<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en-us">
  <head>
    <metadata xmlns:ttm="http://www.w3.org/ns/ttml#metadata">
      <ttm:title>Scriptix TTML</ttm:title>
    </metadata>
    <styling xmlns:tts="http://www.w3.org/ns/ttml#styling">
      <style xml:id="s1" tts:textAlign="center" tts:fontFamily="Arial" tts:fontSize="100%"/>
    </styling>
    <layout xmlns:tts="http://www.w3.org/ns/ttml#layout">
      <region xml:id="bottom" tts:displayAlign="after" tts:extent="80% 40%" tts:origin="10% 50%"/>
    </layout>
  </head>
  <body region="bottom" style="s1">
    <div>
      <p begin="00:00:00.000" end="00:00:02.850" style="s1" region="bottom">the quick brown fox jumps over the</p>
      <p begin="00:00:02.850" end="00:00:03.570" style="s1" region="bottom">lazy dog</p>
    </div>
  </body>
</tt>

Subtitle Format Examples

SBV

00:00:00.000 --> 00:00:02.850
the quick brown fox jumps over the

00:00:02.850 --> 00:00:03.570
lazy dog

SRT

1
00:00:00.000 --> 00:00:02.850
the quick brown fox jumps over the

2
00:00:02.850 --> 00:00:03.570
lazy dog

VTT

WEBVTT

NOTE This file has been generated by Scriptix

00:00.000 --> 00:02.850
the quick brown fox jumps over the

00:02.850 --> 00:03.570
lazy dog
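
The examples above write cue times both with an hours field (`00:00:02.850`) and without one (`00:02.850`). A client that post-processes these files could normalize both forms to milliseconds, as in the sketch below. The helper name `timestampToMs` is an assumption and the function is not part of any Scriptix SDK.

```javascript
// Hypothetical helper: converts "HH:MM:SS.mmm" or "MM:SS.mmm" to milliseconds.
// Uses integer arithmetic to avoid floating-point rounding.
function timestampToMs(timestamp) {
  const parts = timestamp.trim().split(':');
  if (parts.length < 2 || parts.length > 3) {
    throw new Error(`Unrecognized timestamp: ${timestamp}`);
  }
  const [secondsPart, millisPart = '0'] = parts.pop().split('.');
  const minutes = Number(parts.pop());
  const hours = parts.length > 0 ? Number(parts.pop()) : 0;
  const millis = Number(millisPart.padEnd(3, '0').slice(0, 3));
  return ((hours * 60 + minutes) * 60 + Number(secondsPart)) * 1000 + millis;
}

console.log(timestampToMs('00:02.850'));    // 2850
console.log(timestampToMs('00:00:03.570')); // 3570
```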
