Getting Started with Transcription

This guide covers the essentials of creating transcripts from audio and video files.

What is Transcription?

Transcription converts spoken audio into written text using speech recognition technology, providing:

Timestamped text - Words linked to their position in the audio
Speaker identification - Automatic detection and labeling of different speakers (diarization)
Editable transcripts - Powerful editor for reviewing and refining
Multiple export formats - Various formats for different use cases
Collaboration tools - Sharing and review workflows

Basic Workflow

Upload - Upload media files or record directly
Configure - Select language and transcription options
Process - Automatic speech-to-text conversion
Edit - Review and refine in the transcript editor
Export - Download in your preferred format

Quick Start

Step 1: Upload Your File

Click Transcript or Create
Choose your upload method:
- Drag and drop a file into the upload area
- Click to browse and select from your computer
- Record using webcam, microphone, or screen capture
- OneDrive - Import from Microsoft OneDrive
- URL - Import from a public media URL

Supported formats:

Audio: MP3, WAV, M4A, AAC, FLAC, OGG, WMA, AIFF
Video: MP4, MOV, AVI, MKV, WebM, M4V, 3GP, FLV, WMV, TS
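
If you prepare files in bulk, you can check extensions against the lists above before uploading. The following is a minimal Python sketch, not part of Scriptix: the function name media_kind is hypothetical, and it looks only at the file extension, not at the actual contents of the file.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported-format lists above.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "aac", "flac", "ogg", "wma", "aiff"}
VIDEO_EXTENSIONS = {"mp4", "mov", "avi", "mkv", "webm", "m4v", "3gp", "flv", "wmv", "ts"}


def media_kind(filename: str) -> Optional[str]:
    """Return "audio", "video", or None, judging by the extension only."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None  # unsupported or missing extension
```

A result of None means the file should be converted to a supported format first.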

Step 2: Configure Settings

Language (Required)

Select the spoken language from the dropdown
Accurate language selection improves results

Diarization (Optional)

Enable to automatically identify different speakers
Speakers will be labeled as "Speaker 1", "Speaker 2", etc.
You can rename speakers in the editor

Additional Options

Keep source - Retain uploaded media file
Folder - Organize into a specific folder

Step 3: Start Transcription

Click Upload
Your file enters the processing queue
Monitor progress in the "Workspace" section

Step 4: Review & Edit

Once processing completes:

Navigate to your completed session
Click to open the transcript editor
The editor displays:
- Audio/video player
- Editable transcript text
- Speaker labels (if diarization was enabled)
- Timestamps
Edit the transcript:
- Click any text to edit
- Click timestamps to jump in audio
- Rename speakers by clicking their labels
- Changes auto-save

Step 5: Export

When ready:

Click the Export button
Choose your format:
- DOCX - Microsoft Word
- TXT - Plain text
- PDF - Non-editable document
- JSON - Structured data
- CSV - Spreadsheet format
- HTML - Web format
Configure export options (timestamps, speaker labels, etc.)
Download your transcript

Understanding File Uploads

File Size Limits

Upload files up to 40 GB (audio or video).

Note: Depending on your plan, different limits may apply. Contact your administrator if you need to upload larger files.

How Uploads Work

Scriptix uploads files in resumable chunks:

Large files - Automatically broken into smaller chunks and uploaded piece by piece
Interrupted uploads - Can resume from where they left off if your connection drops
Progress tracking - See real-time progress as your file uploads
Reliable delivery - Automatic retries ensure your files arrive safely
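
The behavior described above (chunking, resuming, retries, and progress reporting) can be pictured with a short Python sketch. This is an illustration of the general technique only, not Scriptix's upload client and not an implementation of the TUS protocol. The send_chunk callback, the chunk size, and the retry settings are all assumptions made for the example.

```python
import time
from typing import BinaryIO, Callable, Optional

# Assumed chunk size for illustration; the size Scriptix actually uses is not stated here.
DEFAULT_CHUNK_SIZE = 5 * 1024 * 1024


def upload_in_chunks(
    stream: BinaryIO,
    send_chunk: Callable[[int, bytes], None],
    start_offset: int = 0,
    chunk_size: int = DEFAULT_CHUNK_SIZE,
    max_retries: int = 3,
    retry_delay: float = 1.0,
    on_progress: Optional[Callable[[int], None]] = None,
) -> int:
    """Send `stream` piece by piece and return the final byte offset.

    `send_chunk(offset, data)` is a hypothetical transport callback that
    raises ConnectionError when a chunk fails to arrive.
    """
    stream.seek(start_offset)  # resume where an earlier attempt stopped
    offset = start_offset
    while True:
        data = stream.read(chunk_size)
        if not data:
            return offset  # everything has been sent
        for attempt in range(1, max_retries + 1):
            try:
                send_chunk(offset, data)
                break
            except ConnectionError:
                if attempt == max_retries:
                    raise  # out of retries; a later call can resume from this offset
                time.sleep(retry_delay * attempt)  # wait a little longer each time
        offset += len(data)
        if on_progress is not None:
            on_progress(offset)  # report bytes confirmed so far
```

Because each chunk is sent with its byte offset, an interrupted upload can continue by calling the function again with start_offset set to the last confirmed offset instead of starting over.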

Transcription Settings Explained

Language Selection

Selecting the correct language is crucial for accuracy.

How to choose:

Open the language dropdown during upload
Choose the language that matches the spoken audio
If you're unsure, check the available languages in the dropdown menu

Tip: A language setting that matches the spoken audio gives the best transcription results.

Speaker Diarization

Diarization automatically identifies and separates different speakers in your audio.

How it works:

Enable the "diarization" checkbox during upload
The system analyzes voice characteristics
Speakers are labeled sequentially (Speaker 1, Speaker 2, etc.)
You can rename speakers in the editor

When to use:

Interviews and conversations
Panel discussions
Meetings with multiple participants
Podcasts with hosts and guests

Best results:

Clear audio with distinct voices
Minimal speaker overlap
Good microphone quality

Document Types

When creating a session, you specify one of two document types:

document - For standard transcripts
caption - For subtitle/caption files
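
The settings covered in this section can be gathered into one structure, which is handy if you keep upload presets in a script. The Python sketch below is illustrative only: the class name, field names, default values, and the "en" language code are assumptions, not Scriptix's actual data model. Only the two document type values come from this guide.

```python
from dataclasses import dataclass
from typing import Optional

# The two document types named in this guide.
DOCUMENT_TYPES = ("document", "caption")


@dataclass(frozen=True)
class SessionSettings:
    """Hypothetical container for the options chosen at upload time."""

    language: str                    # required; "en" below is an assumed code format
    diarization: bool = False        # optional speaker identification
    keep_source: bool = False        # retain the uploaded media file
    folder: Optional[str] = None     # optional folder to organize into
    document_type: str = "document"  # "document" or "caption"

    def __post_init__(self) -> None:
        if not self.language.strip():
            raise ValueError("language is required")
        if self.document_type not in DOCUMENT_TYPES:
            raise ValueError(f"document_type must be one of {DOCUMENT_TYPES}")


# Example preset for a multi-speaker interview.
interview = SessionSettings(language="en", diarization=True, folder="Interviews")
```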

Understanding Processing

Processing Status

Track your transcription through these stages:

Uploading - File transfer in progress
Queued - Waiting for processing
Processing - Transcription in progress
Finished - Ready for editing
Failed - Error occurred (check error message)
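
The stages above form a simple progression that ends in either Finished or Failed. As a rough Python sketch of how a script could wait for one of those two end states: the get_status callback, the polling interval, and the poll limit are all hypothetical, and this guide does not describe a programmatic way to read the status.

```python
import time
from enum import Enum
from typing import Callable


class SessionStatus(Enum):
    UPLOADING = "Uploading"
    QUEUED = "Queued"
    PROCESSING = "Processing"
    FINISHED = "Finished"
    FAILED = "Failed"


# Stages after which the status no longer changes.
TERMINAL_STATUSES = {SessionStatus.FINISHED, SessionStatus.FAILED}


def wait_until_done(
    get_status: Callable[[], SessionStatus],
    poll_seconds: float = 5.0,
    max_polls: int = 720,
) -> SessionStatus:
    """Poll the hypothetical `get_status` callback until a terminal stage is reached."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("session did not reach Finished or Failed in time")
```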

What to Expect

Processing Time: Processing time varies based on:

File duration
Audio quality
System load
Selected options (diarization adds time)

Accuracy: Transcription accuracy depends on:

Audio quality (most important factor)
Speaker clarity and accent
Background noise levels
Technical terminology (consider custom models)

Tips for Better Results

Before Recording

Use quality equipment - Good microphone improves results significantly
Choose quiet location - Minimize background noise
Test audio levels - Avoid clipping and ensure adequate volume

During Recording

Speak clearly - Moderate pace with clear enunciation
Reduce crosstalk - Minimize overlapping speech
Control environment - Turn off fans, close windows

After Recording

Select correct language - Match the spoken language
Enable diarization - For multi-speaker content
Review carefully - Always review and edit transcripts

Common Use Cases

Interviews & Podcasts

Enable speaker diarization
Export as DOCX for editing
Use timestamps for show notes

Meetings & Conferences

Enable diarization for multiple speakers
JSON export for data analysis
Share via magic links for team review
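
As an example of the kind of analysis a JSON export allows, the Python sketch below totals speaking time per speaker. The layout it expects is an assumption made purely for illustration: a top-level array of segments, each with "speaker", "start", and "end" keys, with times in seconds. The real export may be structured differently, so inspect your own file before adapting this.

```python
import json
from collections import defaultdict
from typing import Dict


def speaking_time(export_json: str) -> Dict[str, float]:
    """Sum segment durations per speaker from an ASSUMED segment layout."""
    segments = json.loads(export_json)
    totals: Dict[str, float] = defaultdict(float)
    for segment in segments:
        # "speaker", "start", and "end" are hypothetical key names.
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


# Hand-written sample data in the assumed layout, not real export output.
sample = json.dumps([
    {"speaker": "Speaker 1", "start": 0.0, "end": 4.5},
    {"speaker": "Speaker 2", "start": 4.5, "end": 6.0},
    {"speaker": "Speaker 1", "start": 6.0, "end": 10.0},
])
```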

Video Content

Upload video files directly (audio extracted)
Create captions from transcript
Export SRT/VTT for video platforms
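
For readers unfamiliar with SRT, each caption is a numbered block with a time range written as HH:MM:SS,mmm and one or more lines of text. The Python sketch below builds that layout from timed segments. It only illustrates the file format and is not how Scriptix produces its exports; the (start, end, text) tuple shape and both function names are assumptions for the example.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Build SRT text from hypothetical (start, end, text) segments."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)  # a blank line separates consecutive captions
```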

Academic & Research

High-quality audio recommended
Export as DOCX or PDF
Use timestamps for citations

Troubleshooting

Upload Issues

Verify file format is supported
Check file size is within limits (40 GB default)
Ensure stable internet connection
Large files automatically use the TUS protocol for resumable uploads

Processing Issues

Check session status for error messages
Verify audio quality is adequate
Ensure the correct language is selected

Quality Issues

Improve audio quality at source
Enable diarization for multi-speaker content
Consider custom models for specialized content

See Troubleshooting for comprehensive solutions.

Next Steps

Transcript Editor - Master the editing interface
Speaker Management - Work with speaker diarization
Export Options - Export formats and settings
Supported Formats - Complete format list

Related Documentation

System Requirements - Browser and technical requirements
FAQ - Frequently asked questions
