
Getting Started with Transcription

Welcome to Scriptix transcription! This guide covers the essentials of creating transcripts from audio and video files.

What is Transcription?

Transcription converts spoken audio into written text. Scriptix uses speech recognition technology to automatically transcribe your media files, providing:

  • Timestamped text - Words linked to their position in the audio
  • Speaker identification - Automatic detection and labeling of different speakers (diarization)
  • Editable transcripts - Powerful editor for reviewing and refining
  • Multiple export formats - Various formats for different use cases
  • Collaboration tools - Sharing and review workflows

Basic Workflow

Creating a transcript follows these steps:

  1. Upload - Upload media files or record directly
  2. Configure - Select language and transcription options
  3. Process - Automatic speech-to-text conversion
  4. Edit - Review and refine in the transcript editor
  5. Export - Download in your preferred format

Quick Start

Step 1: Upload Your File

From the workspace:

  1. Click Transcript or Create
  2. Choose your upload method:
    • Drag and drop a file into the upload area
    • Click to browse and select from your computer
    • Record using webcam, microphone, or screen capture
    • Cloud storage - Import from Google Drive, Dropbox, OneDrive, Box, or Zoom
    • URL - Import from a public media URL

Supported formats:

  • Audio: MP3, WAV, M4A, AAC, FLAC, OGG, WMA, AIFF
  • Video: MP4, MOV, AVI, MKV, WebM, M4V, 3GP, FLV, WMV, TS

See Supported Formats for the complete list.
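
A quick local check before uploading can catch unsupported files early. The sketch below only mirrors the extensions listed above; the helper name `media_kind` is hypothetical, and the Supported Formats page remains the authoritative list.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the format list above (lowercased for comparison).
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "aac", "flac", "ogg", "wma", "aiff"}
VIDEO_EXTENSIONS = {"mp4", "mov", "avi", "mkv", "webm", "m4v", "3gp", "flv", "wmv", "ts"}


def media_kind(filename: str) -> Optional[str]:
    """Return "audio", "video", or None when the extension is not listed."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None
```

This inspects the file name only, so a renamed or corrupted file can still fail during processing.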

Step 2: Configure Settings

Configure your transcription:

Language (Required)

  • Select the spoken language from the dropdown
  • Accurate language selection improves results

Diarization (Optional)

  • Enable to automatically identify different speakers
  • Speakers will be labeled as "Speaker 1", "Speaker 2", etc.
  • You can rename speakers in the editor

Additional Options

  • Keep source - Retain uploaded media file
  • Punctuation - Enable automatic punctuation
  • Multichannel - Process multi-channel audio separately
  • Folder - Organize into a specific folder
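
For readers who script their uploads, the options above can be thought of as one settings object. This is a minimal sketch: the class name, field names, and default values are assumptions chosen for illustration, not the actual Scriptix request schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class TranscriptionSettings:
    """Hypothetical container for the options described above."""
    language: str                  # required, e.g. "en" or "nl"
    diarization: bool = False      # assumed default: speaker labels off
    keep_source: bool = True       # assumed default: keep the uploaded media
    punctuation: bool = True       # assumed default: automatic punctuation on
    multichannel: bool = False     # assumed default: channels not split
    folder: Optional[str] = None   # None means no specific folder

    def __post_init__(self) -> None:
        # Language is the only required option in the list above.
        if not self.language.strip():
            raise ValueError("language is required")


settings = TranscriptionSettings(language="nl", diarization=True)
payload = asdict(settings)  # plain dict, convenient for logging or serializing
```

Validating the required field in `__post_init__` surfaces a missing language before any upload starts.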

Step 3: Start Transcription

  1. Click Upload
  2. Your file enters the processing queue
  3. Monitor progress in the "Workspace" section

Step 4: Review & Edit

Once processing completes:

  1. Navigate to your completed session
  2. Click to open the transcript editor
  3. The editor displays:
    • Audio/video player
    • Editable transcript text
    • Speaker labels (if diarization was enabled)
    • Timestamps
  4. Edit the transcript:
    • Click any text to edit it
    • Click a timestamp to jump to that point in the audio
    • Rename speakers by clicking their labels
    • Changes are saved automatically

See Transcript Editor for detailed editing instructions.

Step 5: Export

When ready:

  1. Click the Export button
  2. Choose your format:
    • DOCX - Microsoft Word
    • TXT - Plain text
    • PDF - Non-editable document
    • JSON - Structured data
    • CSV - Spreadsheet format
    • HTML - Web format
  3. Configure export options (timestamps, speaker labels, etc.)
  4. Download your transcript

See Export Transcripts for detailed export options.
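
To illustrate what the timestamp and speaker-label export options change, here is a small sketch that renders plain text from a list of segments. The segment structure (`start`, `speaker`, `text`), the `[MM:SS]` prefix, and the function name are all assumptions for this example; the real export output may look different.

```python
from typing import Dict, List


def render_txt(segments: List[Dict], timestamps: bool = True, speakers: bool = True) -> str:
    """Render hypothetical transcript segments as plain text, one line each."""
    lines = []
    for seg in segments:
        parts = []
        if timestamps:
            # Whole seconds are enough for a reading copy.
            minutes, seconds = divmod(int(seg["start"]), 60)
            parts.append(f"[{minutes:02d}:{seconds:02d}]")
        if speakers and seg.get("speaker"):
            parts.append(f"{seg['speaker']}:")
        parts.append(seg["text"])
        lines.append(" ".join(parts))
    return "\n".join(lines)


example = [
    {"start": 0.0, "speaker": "Speaker 1", "text": "Welcome to the show."},
    {"start": 65.2, "speaker": "Speaker 2", "text": "Thanks for having me."},
]
```

Calling `render_txt(example, timestamps=False)` keeps the speaker labels and drops the time prefixes.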

Understanding File Uploads

File Size Limits

Maximum file size is configured per environment:

  • Production: 20 GB
  • Staging: 10 GB

Note: Plan-specific limits may apply. Check your subscription details or contact your administrator.
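
The limits above can be checked locally before a long upload starts. This sketch assumes 1 GB means 1024^3 bytes, which the text does not specify, and it ignores plan-specific limits; the names are hypothetical.

```python
GIB = 1024 ** 3  # assumption: the documented "GB" is treated as 2**30 bytes

# Per-environment limits taken from the list above.
MAX_UPLOAD_BYTES = {
    "production": 20 * GIB,
    "staging": 10 * GIB,
}


def within_size_limit(size_bytes: int, environment: str = "production") -> bool:
    """Return True when a non-empty file fits the environment's limit."""
    return 0 < size_bytes <= MAX_UPLOAD_BYTES[environment]
```

For a file on disk, `os.path.getsize(path)` supplies `size_bytes`.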

Upload Protocols

Scriptix uses:

  • Standard HTTP upload - For most files
  • TUS protocol - Resumable uploads for large files
  • Chunked upload - Automatic for large files
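
Resumable and chunked uploads rest on the same idea: send the file in pieces and track the byte offset, so an interrupted transfer can continue from the last confirmed offset instead of starting over. The sketch below shows only that offset bookkeeping. The chunk size and function name are assumptions, and no Scriptix endpoint or real network call is involved.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def iter_chunks(stream: BinaryIO, chunk_size: int, start_offset: int = 0) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs, starting at start_offset to resume."""
    stream.seek(start_offset)
    offset = start_offset
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield offset, chunk
        offset += len(chunk)


data = io.BytesIO(b"abcdefghij")
first_pass = list(iter_chunks(data, chunk_size=4))
# Resume as if the server had already confirmed the first 8 bytes.
resumed = list(iter_chunks(data, chunk_size=4, start_offset=8))
```

In the tus protocol, each chunk is sent with a PATCH request that carries the current offset in an `Upload-Offset` header, which is what makes resuming possible.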

Transcription Settings Explained

Language Selection

Selecting the correct language is crucial for accuracy.

Transcription Languages: Scriptix supports 37+ languages for speech-to-text transcription, including:

European Languages: English (en), Dutch (nl), French (fr), German (de), Spanish (es), Italian (it), Portuguese (pt), Polish (pl), Czech (cs), Danish (da), Finnish (fi), Greek (el), Hungarian (hu), Norwegian (no), Romanian (ro), Slovak (sk), Swedish (sv), Russian (ru), Ukrainian (uk)

Asian Languages: Chinese (zh), Japanese (ja), Korean (ko), Hindi (hi), Thai (th), Indonesian (id), Vietnamese (vi), Malay (ms), Tamil (ta), Telugu (te)

Middle Eastern Languages: Arabic (ar), Hebrew (he), Turkish (tr), Persian/Farsi (fa)

Note: Transcription language support is determined by the backend API. Consult API documentation for the complete list of supported transcription languages.
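
When language codes come from user input or a spreadsheet, a lookup table helps catch typos before submission. The table below contains only the codes listed above, so it is shorter than the full set of supported languages; the helper name is hypothetical.

```python
from typing import Optional

# Codes and names exactly as listed in this guide (not the complete API list).
LISTED_LANGUAGES = {
    # European
    "en": "English", "nl": "Dutch", "fr": "French", "de": "German",
    "es": "Spanish", "it": "Italian", "pt": "Portuguese", "pl": "Polish",
    "cs": "Czech", "da": "Danish", "fi": "Finnish", "el": "Greek",
    "hu": "Hungarian", "no": "Norwegian", "ro": "Romanian", "sk": "Slovak",
    "sv": "Swedish", "ru": "Russian", "uk": "Ukrainian",
    # Asian
    "zh": "Chinese", "ja": "Japanese", "ko": "Korean", "hi": "Hindi",
    "th": "Thai", "id": "Indonesian", "vi": "Vietnamese", "ms": "Malay",
    "ta": "Tamil", "te": "Telugu",
    # Middle Eastern
    "ar": "Arabic", "he": "Hebrew", "tr": "Turkish", "fa": "Persian/Farsi",
}


def language_name(code: str) -> Optional[str]:
    """Return the listed language name for a code, or None if not listed."""
    return LISTED_LANGUAGES.get(code.strip().lower())
```

A `None` result means only that the code is not in this guide's list; the API documentation may still support it.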

Speaker Diarization

What is it? Diarization automatically identifies and separates different speakers in your audio.

How it works:

  1. Enable the "Diarization" checkbox during upload
  2. The system analyzes voice characteristics
  3. Speakers are labeled sequentially (Speaker 1, Speaker 2, etc.)
  4. You can rename speakers in the editor
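
The labeling and renaming steps can be pictured with a small sketch. It assumes the system produces one internal cluster ID per segment and numbers speakers in order of first appearance; both the data shape and that ordering are assumptions made for illustration.

```python
from typing import Dict, List


def label_speakers(cluster_ids: List[str]) -> List[str]:
    """Map raw cluster IDs to "Speaker N" in order of first appearance."""
    labels: Dict[str, str] = {}
    for cid in cluster_ids:
        if cid not in labels:
            labels[cid] = f"Speaker {len(labels) + 1}"
    return [labels[cid] for cid in cluster_ids]


def rename_speakers(segments: List[Dict], names: Dict[str, str]) -> List[Dict]:
    """Return copies of the segments with speaker labels replaced where mapped."""
    return [{**seg, "speaker": names.get(seg["speaker"], seg["speaker"])} for seg in segments]


labels = label_speakers(["b", "a", "b", "c"])
segments = [{"speaker": label, "text": "..."} for label in labels]
renamed = rename_speakers(segments, {"Speaker 1": "Host", "Speaker 2": "Guest"})
```

Labels without an entry in the mapping are left unchanged, so a partial rename is safe.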

When to use:

  • Interviews and conversations
  • Panel discussions
  • Meetings with multiple participants
  • Podcasts with hosts and guests

Best results:

  • Clear audio with distinct voices
  • Minimal speaker overlap
  • Good microphone quality

Document Types

When creating a session, you specify:

  • transcript - For standard transcripts
  • caption - For subtitle/caption files

Understanding Processing

Processing Status

Track your transcription through these stages:

  1. Uploading - File transfer in progress
  2. Queued - Waiting for processing
  3. Processing - Transcription in progress
  4. Completed - Ready for editing
  5. Failed - Error occurred (check error message)
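
Automation that waits for a transcript usually polls until the session reaches one of the two final stages. The sketch below models the stages above as an enum and takes the status lookup as a caller-supplied function, because the real API call is outside the scope of this guide. All names, the polling interval, and the poll limit are assumptions.

```python
import time
from enum import Enum
from typing import Callable


class SessionStatus(Enum):
    UPLOADING = "uploading"
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Stages after which the status no longer changes.
TERMINAL = {SessionStatus.COMPLETED, SessionStatus.FAILED}


def wait_until_done(
    fetch_status: Callable[[], SessionStatus],
    interval_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> SessionStatus:
    """Poll fetch_status until it returns a terminal status or polls run out."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL:
            return status
        sleep(interval_seconds)
    raise TimeoutError("session did not finish within the polling budget")
```

Passing `sleep` as a parameter keeps the function testable without real delays.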

What to Expect

Processing Time: Actual processing time varies based on:

  • File duration
  • Audio quality
  • System load
  • Selected options (diarization adds time)

Accuracy: Transcription accuracy depends on:

  • Audio quality (most important factor)
  • Speaker clarity and accent
  • Background noise levels
  • Technical terminology (consider custom models)

Tips for Better Results

Before Recording

  1. Use quality equipment - A good microphone improves results significantly
  2. Choose a quiet location - Minimize background noise
  3. Test audio levels - Avoid clipping and ensure adequate volume

During Recording

  1. Speak clearly - Moderate pace with clear enunciation
  2. Reduce crosstalk - Minimize overlapping speech
  3. Control environment - Turn off fans, close windows

After Recording

  1. Select the correct language - Match the spoken language
  2. Enable diarization - For multi-speaker content
  3. Review carefully - Always review and edit transcripts

Common Use Cases

Interviews & Podcasts

  • Enable speaker diarization
  • Export as DOCX for editing
  • Use timestamps for show notes

Meetings & Conferences

  • Diarization for multiple speakers
  • JSON export for data analysis
  • Share via magic links for team review

Video Content

  • Upload video files directly (audio extracted)
  • Create captions from transcript
  • Export SRT/VTT for video platforms
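
For reference, an SRT file is a sequence of numbered cues, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line followed by the caption text and a blank line. The sketch below builds that layout from a list of segments; the segment keys (`start`, `end`, `text`, in seconds) are assumptions and do not describe the Scriptix export itself.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Dict]) -> str:
    """Join hypothetical segments into SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_line = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        cues.append(f"{index}\n{time_line}\n{seg['text']}\n")
    return "\n".join(cues)


captions = to_srt([
    {"start": 0.0, "end": 2.5, "text": "Hello and welcome."},
    {"start": 2.5, "end": 5.0, "text": "Let's get started."},
])
```

WebVTT uses a similar cue layout but starts with a `WEBVTT` header line and separates milliseconds with a period instead of a comma.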

Academic & Research

  • High-quality audio recommended
  • Export as DOCX or PDF
  • Use timestamps for citations

Troubleshooting

Upload Issues

  • Verify that the file format is supported
  • Check that the file size is within limits
  • Ensure you have a stable internet connection
  • Try a TUS (resumable) upload for large files

See Troubleshooting > Upload Issues

Processing Issues

  • Check the session status for error messages
  • Verify that the audio quality is adequate
  • Ensure the correct language is selected

Quality Issues

  • Improve audio quality at the source
  • Enable diarization for multi-speaker content
  • Consider custom models for specialized content

See Troubleshooting for comprehensive solutions.

Next Steps


Ready to start? Continue to Creating Transcripts for detailed instructions.