
Document Management Overview

Documents provide a flexible way to view, edit, and manage the results of transcription sessions. They are designed to give you more control than static result downloads, especially for use cases like captioning, editing, and reviewing transcripts.


🧾 What Is a Document?

A document is a structured, segment-based representation of a transcription result. Unlike raw session results, documents are designed to support modification and post-processing.

We currently support two types of documents:

Document Type | Description
Transcript | Plain transcript, optimized for readability and review
Caption | Timed segments (e.g. SRT/VTT) optimized for subtitling and video display
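
To make the difference between the two types concrete, here is a minimal sketch of how one list of timed segments could be rendered either as SRT cues or as a plain transcript. The `Segment` class and its field names are hypothetical, chosen for illustration; they are not the service's actual document schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timed block of text (hypothetical shape, not the real schema)."""
    start_ms: int
    end_ms: int
    text: str


def format_srt_timestamp(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT cues: index, time range, text, blank line."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg.start_ms)
        end = format_srt_timestamp(seg.end_ms)
        cues.append(f"{index}\n{start} --> {end}\n{seg.text}\n")
    # Joining with a newline leaves one blank line between consecutive cues.
    return "\n".join(cues)


def to_plain_transcript(segments: list[Segment]) -> str:
    """Drop the timing and join the segment text for reading and review."""
    return " ".join(seg.text.strip() for seg in segments)
```

Calling `to_srt` on a list of segments yields caption-style output, while `to_plain_transcript` on the same list yields transcript-style text. This is why a segment-based representation can plausibly back both document types.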

🛠️ We offer an online caption editor today, with a visual document editor coming soon.


📚 Relation to Transcript Sessions

Documents are always associated with a transcript session. A session may have zero or more documents derived from its results.

  • A document is created after a batch session completes.
  • At this time, real-time transcription sessions do not generate documents, since no data is stored post-session.
  • This behavior may change as we roll out recording and retention support.

Why Use Documents?

Unlike downloaded results, documents are:

  • 🧩 Segmented – Organized into human-editable blocks
  • Modifiable – Designed for editing and collaboration
  • 🗃️ Persistent – Stored and accessible via the API
  • 🧭 Version-aware – Easily integrated into review workflows

Common Use Cases

  • Subtitling (SRT/VTT)
  • Transcription review and editing
  • Translation workflows
  • Platform integrations (e.g. embedding captioned videos)

📦 How to Use

To manage documents via the API, refer to:


🚫 Limitations

  • Documents can only be created from completed batch sessions
  • Not available for real-time sessions
  • Only supported for sessions where transcription results are retained
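
The three conditions above can be sketched as a client-side pre-check before requesting a document. The `Session` class, its field names, and the status strings are assumptions made for illustration; the real session object returned by the API may look different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical session summary; field names are illustrative only."""
    mode: str               # assumed values: "batch" or "realtime"
    status: str             # assumed values: "completed", "processing", "failed"
    results_retained: bool  # whether transcription results are still stored


def document_creation_blocker(session: Session) -> Optional[str]:
    """Return the reason a document cannot be created, or None if it can."""
    if session.mode != "batch":
        return "documents are not available for real-time sessions"
    if session.status != "completed":
        return "the batch session has not completed"
    if not session.results_retained:
        return "transcription results were not retained for this session"
    return None
```

Returning the reason rather than a bare boolean lets a caller show the user which limitation applies.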

🛠️ We plan to expand this functionality with editing, translations, and audio-linked visual editing in upcoming releases.


🔜 Next Steps