Retrieve Results

Get a completed transcript or caption document.

Endpoint

GET /api/v3/speech-to-text/session/{session_id}/document/{document_id}

Path Parameters

Parameter    Type    Description
session_id   string  Session ID
document_id  string  Document ID

Query Parameters

Parameter    Type    Description
format       string  Export format (optional)
template_id  string  Export template ID (optional)
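
As a rough illustration of how the path and query parameters combine, the sketch below assembles the request URL. The host in BASE_URL and the helper name build_document_url are hypothetical, and "srt" is only a placeholder value for format, since this page does not list the accepted formats. Authentication is not covered here.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Hypothetical host (assumption); substitute the base URL of your own deployment.
BASE_URL = "https://api.example.com"


def build_document_url(
    session_id: str,
    document_id: str,
    export_format: Optional[str] = None,
    template_id: Optional[str] = None,
    base_url: str = BASE_URL,
) -> str:
    """Return the GET URL for a document, adding only the query parameters that are set."""
    # Percent-encode the IDs so that unusual characters cannot alter the path.
    path = (
        f"/api/v3/speech-to-text/session/{quote(session_id, safe='')}"
        f"/document/{quote(document_id, safe='')}"
    )
    params = {}
    if export_format is not None:
        params["format"] = export_format
    if template_id is not None:
        params["template_id"] = template_id
    query = f"?{urlencode(params)}" if params else ""
    return f"{base_url.rstrip('/')}{path}{query}"


# "srt" is a placeholder format value, not one confirmed by this page.
print(build_document_url("sess_123", "doc_456", export_format="srt"))
```

Both query parameters are optional, so the helper omits the query string entirely when neither is supplied.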

Response Type

Response Format: BaseResponse<DocumentObject>

{
  "count": 1,
  "total_results": 1,
  "result": {
    "id": "document_id",
    "filename": "example.mp4",
    "type": "document",
    "language": "en",
    "created": "2025-01-15T10:30:00Z",
    "last_modified": "2025-01-15T10:45:00Z",
    "finished": true,
    "content_type": "application/json",
    "document": {
      "document_type": "document",
      "version": "1.0",
      "document": []
    }
  }
}

Document Object Fields

Field            Type            Description
id               string          Document ID (readonly)
filename         string          Original filename (readonly)
type             string          Document type
language         string          Language code
created          Date            Creation timestamp (readonly)
last_modified    Date            Last modification timestamp (readonly)
finished         boolean         Whether document is marked as finished
timecode_offset  string          Timecode offset (optional)
document         object          Document content (null when direct_download=true)
download_url     string | null   Direct download URL (when direct_download=true)
expires_in       number | null   URL expiration in seconds (when direct_download=true)
content_type     string          Content type
size_bytes       number | null   File size in bytes
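
Because document is null when direct_download=true and download_url is populated instead, a client has to check which of the two is present. The following is a minimal sketch of that branch, assuming the response body has already been parsed from JSON into a dict; the function name locate_content and the sample URL are illustrative only.

```python
from typing import Any, Dict, Tuple


def locate_content(response: Dict[str, Any]) -> Tuple[str, Any]:
    """Return ("inline", document) or ("download", url) for a parsed response."""
    result = response["result"]
    document = result.get("document")
    if document is not None:
        # Regular case: the document content is embedded in the response.
        return ("inline", document)
    url = result.get("download_url")
    if url:
        # direct_download case: content must be fetched from the URL,
        # which stops working after expires_in seconds.
        return ("download", url)
    raise ValueError("response contains neither a document nor a download_url")


# Sample shaped like the Document Object fields; the URL is made up.
sample = {
    "count": 1,
    "total_results": 1,
    "result": {
        "id": "document_id",
        "document": None,
        "download_url": "https://files.example.com/document_id",
        "expires_in": 3600,
    },
}
kind, value = locate_content(sample)
print(kind, value)
```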

Document Structure

For transcript documents (document_type: "document"):

The document field contains a DocumentTranscriptV1 object with:

  • document_type: "document"
  • version: "1.0"
  • document: Array of utterances

Utterance Structure:

  • type: "utterance"
  • speaker: Speaker identifier (present if diarization is enabled)
  • start: Start time in milliseconds
  • stop: Stop time in milliseconds
  • children: Array of paragraphs

Paragraph Structure:

  • type: "paragraph"
  • start: Start time in milliseconds
  • stop: Stop time in milliseconds
  • children: Array of text objects

Text Object Structure:

  • text: Text content
  • start: Start time in milliseconds
  • stop: Stop time in milliseconds
  • bold: Boolean (optional)
  • italic: Boolean (optional)
  • underlined: Boolean (optional)
  • strike: Boolean (optional)
  • color: String (optional)
  • mark: Boolean (optional)
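
The nesting described above (utterances containing paragraphs containing text objects) can be walked with two loops. This sketch flattens a transcript into one line per paragraph. It assumes that text objects carry their own whitespace, so they are joined without a separator, and that a missing speaker should be shown as "unknown"; both are assumptions, as are the helper names format_ms and transcript_lines.

```python
from typing import Any, Dict, List


def format_ms(ms: int) -> str:
    """Format a millisecond offset as HH:MM:SS.mmm."""
    seconds, millis = divmod(int(ms), 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def transcript_lines(transcript: Dict[str, Any]) -> List[str]:
    """Return one '[start] speaker: text' line per paragraph."""
    lines = []
    for utterance in transcript.get("document", []):
        # speaker is only present when diarization is enabled.
        speaker = utterance.get("speaker") or "unknown"
        for paragraph in utterance.get("children", []):
            # Assumption: text objects include their own spacing.
            text = "".join(t.get("text", "") for t in paragraph.get("children", []))
            lines.append(f"[{format_ms(paragraph['start'])}] {speaker}: {text.strip()}")
    return lines


# Hand-written sample following the structure listed above.
sample_transcript = {
    "document_type": "document",
    "version": "1.0",
    "document": [
        {
            "type": "utterance",
            "speaker": "A",
            "start": 1500,
            "stop": 3200,
            "children": [
                {
                    "type": "paragraph",
                    "start": 1500,
                    "stop": 3200,
                    "children": [
                        {"text": "Hello ", "start": 1500, "stop": 2000},
                        {"text": "world.", "start": 2000, "stop": 3200, "bold": True},
                    ],
                }
            ],
        }
    ],
}
for line in transcript_lines(sample_transcript):
    print(line)
```

Formatting flags such as bold or italic are ignored here; a renderer that needs them can read them from each text object inside the inner join.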

Shared Document Endpoint

For shared documents (token-based access without authentication):

GET /api/v3/speech-to-text/session/{session_id}/shared/document/{document_id}

See Documents API.