Getting Started with Transcription
Welcome to Scriptix transcription! This guide covers the essentials of creating transcripts from audio and video files.
What is Transcription?
Transcription converts spoken audio into written text. Scriptix uses speech recognition technology to automatically transcribe your media files, providing:
- Timestamped text - Words linked to their position in the audio
- Speaker identification - Automatic detection and labeling of different speakers (diarization)
- Editable transcripts - Powerful editor for reviewing and refining
- Multiple export formats - Various formats for different use cases
- Collaboration tools - Sharing and review workflows
Basic Workflow
Creating a transcript follows these steps:
- Upload - Upload media files or record directly
- Configure - Select language and transcription options
- Process - Automatic speech-to-text conversion
- Edit - Review and refine in the transcript editor
- Export - Download in your preferred format
Quick Start
Step 1: Upload Your File
From the workspace:
- Click Transcript or Create
- Choose your upload method:
- Drag and drop a file into the upload area
- Click to browse and select from your computer
- Record using webcam, microphone, or screen capture
- Cloud storage - Import from Google Drive, Dropbox, OneDrive, Box, or Zoom
- URL - Import from a public media URL
Supported formats:
- Audio: MP3, WAV, M4A, AAC, FLAC, OGG, WMA, AIFF
- Video: MP4, MOV, AVI, MKV, WebM, M4V, 3GP, FLV, WMV, TS
See Supported Formats for the complete list.
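If you prepare uploads with a script, a quick pre-check against the formats listed above can catch unsupported files before they reach the upload step. The following is a minimal sketch, not part of Scriptix: the function name `media_kind` is hypothetical, and the extension sets only mirror the lists above rather than the complete list in Supported Formats.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the lists above; the complete list may be longer.
AUDIO_EXTENSIONS = {"mp3", "wav", "m4a", "aac", "flac", "ogg", "wma", "aiff"}
VIDEO_EXTENSIONS = {"mp4", "mov", "avi", "mkv", "webm", "m4v", "3gp", "flv", "wmv", "ts"}


def media_kind(filename: str) -> Optional[str]:
    """Return "audio" or "video", or None if the extension is not listed above."""
    # Compare case-insensitively and without the leading dot.
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    if ext in VIDEO_EXTENSIONS:
        return "video"
    return None
```

A file such as `interview.MP3` is classified as audio, while `notes.txt` returns `None` and can be skipped or converted first.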
Step 2: Configure Settings
Configure your transcription:
Language (Required)
- Select the spoken language from the dropdown
- Accurate language selection improves results
Diarization (Optional)
- Enable to automatically identify different speakers
- Speakers will be labeled as "Speaker 1", "Speaker 2", etc.
- You can rename speakers in the editor
Additional Options
- Keep source - Retain uploaded media file
- Punctuation - Enable automatic punctuation
- Multichannel - Process multi-channel audio separately
- Folder - Organize into a specific folder
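For scripted workflows, it can help to collect these options in one place before uploading. The sketch below is illustrative only: the class name, field names, and default values are assumptions chosen to mirror the options above, not a documented Scriptix data structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscriptionSettings:
    """Hypothetical container for the options described above."""

    language: str                  # required, e.g. "en" or "nl"
    diarization: bool = False      # optional speaker identification
    keep_source: bool = False      # placeholder default, not a documented one
    punctuation: bool = False      # placeholder default, not a documented one
    multichannel: bool = False     # placeholder default, not a documented one
    folder: Optional[str] = None   # no folder unless one is chosen

    def __post_init__(self) -> None:
        # Language is the only required setting, so reject empty values early.
        self.language = self.language.strip().lower()
        if not self.language:
            raise ValueError("language is required")


settings = TranscriptionSettings(language="en", diarization=True)
```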
Step 3: Start Transcription
- Click Upload
- Your file enters the processing queue
- Monitor progress in the "Workspace" section
Step 4: Review & Edit
Once processing completes:
- Navigate to your completed session
- Click to open the transcript editor
- The editor displays:
- Audio/video player
- Editable transcript text
- Speaker labels (if diarization was enabled)
- Timestamps
- Edit the transcript:
- Click any text to edit it
- Click a timestamp to jump to that point in the audio
- Rename speakers by clicking their labels
- Changes are saved automatically
See Transcript Editor for detailed editing instructions.
Step 5: Export
When ready:
- Click the Export button
- Choose your format:
- DOCX - Microsoft Word
- TXT - Plain text
- PDF - Non-editable document
- JSON - Structured data
- CSV - Spreadsheet format
- HTML - Web format
- Configure export options (timestamps, speaker labels, etc.)
- Download your transcript
See Export Transcripts for detailed export options.
Understanding File Uploads
File Size Limits
Maximum file size is configured per environment:
- Production: 20 GB
- Staging: 10 GB
Note: Plan-specific limits may apply. Check your subscription details or contact your administrator.
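A local size check before uploading avoids waiting on a transfer that will be rejected. The following is a minimal sketch under two assumptions: that 1 GB means 1024³ bytes (the limits above do not state which convention applies), and that the helper name `within_limit` is purely illustrative. Plan-specific limits are not modeled.

```python
# Assumption: binary gigabytes. Adjust if the limits are defined as 10**9 bytes.
GB = 1024 ** 3

# Per-environment limits from the list above.
MAX_UPLOAD_BYTES = {"production": 20 * GB, "staging": 10 * GB}


def within_limit(size_bytes: int, environment: str = "production") -> bool:
    """Return True if a non-empty file of this size fits the environment limit."""
    return 0 < size_bytes <= MAX_UPLOAD_BYTES[environment]
```

In practice the size would come from something like `os.path.getsize(path)`.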
Upload Protocols
Scriptix uses:
- Standard HTTP upload - For most files
- TUS protocol - Resumable uploads for large files
- Chunked upload - Automatic for large files
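To illustrate the idea behind chunked and resumable uploads, the sketch below splits a file into fixed-size pieces and tracks the byte offset of each one. In a resumable scheme such as TUS, the client learns the offset the server has already received and continues from there. This is not Scriptix's upload code: the chunk size and the function name `iter_chunks` are assumptions, and no network requests are made.

```python
import io
from typing import BinaryIO, Iterator, Tuple

CHUNK_SIZE = 5 * 1024 * 1024  # hypothetical chunk size (5 MiB)


def iter_chunks(
    stream: BinaryIO, start_offset: int = 0, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, chunk) pairs, starting at start_offset to support resuming."""
    stream.seek(start_offset)
    offset = start_offset
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file reached
        yield offset, chunk
        offset += len(chunk)


# Example: resume a 10-byte "file" from byte 4 using 4-byte chunks.
data = io.BytesIO(b"abcdefghij")
for offset, chunk in iter_chunks(data, start_offset=4, chunk_size=4):
    print(offset, len(chunk))
```

Because each chunk carries its own offset, an interrupted transfer only needs to resend data from the last confirmed position instead of restarting from zero.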
Transcription Settings Explained
Language Selection
Selecting the correct language is crucial for accuracy.
Transcription Languages: Scriptix supports 37+ languages for speech-to-text transcription, including:
European Languages: English (en), Dutch (nl), French (fr), German (de), Spanish (es), Italian (it), Portuguese (pt), Polish (pl), Czech (cs), Danish (da), Finnish (fi), Greek (el), Hungarian (hu), Norwegian (no), Romanian (ro), Slovak (sk), Swedish (sv), Russian (ru), Ukrainian (uk)
Asian Languages: Chinese (zh), Japanese (ja), Korean (ko), Hindi (hi), Thai (th), Indonesian (id), Vietnamese (vi), Malay (ms), Tamil (ta), Telugu (te)
Middle Eastern Languages: Arabic (ar), Hebrew (he), Turkish (tr), Persian/Farsi (fa)
Note: Transcription language support is determined by the backend API. Consult API documentation for the complete list of supported transcription languages.
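When languages are selected programmatically, validating the code up front gives a clearer error than a failed transcription. The sketch below contains only the codes named above, so it is deliberately incomplete; the lookup table and the function name `language_name` are illustrative and not part of any Scriptix API.

```python
# Codes and names copied from the lists above; the backend may support more.
LISTED_LANGUAGES = {
    # European
    "en": "English", "nl": "Dutch", "fr": "French", "de": "German",
    "es": "Spanish", "it": "Italian", "pt": "Portuguese", "pl": "Polish",
    "cs": "Czech", "da": "Danish", "fi": "Finnish", "el": "Greek",
    "hu": "Hungarian", "no": "Norwegian", "ro": "Romanian", "sk": "Slovak",
    "sv": "Swedish", "ru": "Russian", "uk": "Ukrainian",
    # Asian
    "zh": "Chinese", "ja": "Japanese", "ko": "Korean", "hi": "Hindi",
    "th": "Thai", "id": "Indonesian", "vi": "Vietnamese", "ms": "Malay",
    "ta": "Tamil", "te": "Telugu",
    # Middle Eastern
    "ar": "Arabic", "he": "Hebrew", "tr": "Turkish", "fa": "Persian/Farsi",
}


def language_name(code: str) -> str:
    """Return the language name for a listed code, or raise ValueError."""
    normalized = code.strip().lower()
    if normalized not in LISTED_LANGUAGES:
        raise ValueError(f"language code not in the documented list: {code!r}")
    return LISTED_LANGUAGES[normalized]
```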
Speaker Diarization
What is it? Diarization automatically identifies and separates different speakers in your audio.
How it works:
- Enable the "Diarization" checkbox during upload
- The system analyzes voice characteristics
- Speakers are labeled sequentially (Speaker 1, Speaker 2, etc.)
- You can rename speakers in the editor
When to use:
- Interviews and conversations
- Panel discussions
- Meetings with multiple participants
- Podcasts with hosts and guests
Best results:
- Clear audio with distinct voices
- Minimal speaker overlap
- Good microphone quality
Document Types
When creating a session, you specify one of two document types:
- transcript - For standard transcripts
- caption - For subtitle/caption files
Understanding Processing
Processing Status
Track your transcription through these stages:
- Uploading - File transfer in progress
- Queued - Waiting for processing
- Processing - Transcription in progress
- Completed - Ready for editing
- Failed - Error occurred (check error message)
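The stages above behave like a small state machine in which Completed and Failed are the only end states. The following sketch shows one way a script might wait for a result. It is an assumption-heavy illustration: the enum string values are invented, and `fetch_status` is a hypothetical callable you would supply yourself, since this guide does not describe a status API.

```python
import time
from enum import Enum
from typing import Callable


class SessionStatus(Enum):
    # Stage names from the list above; the string values are hypothetical.
    UPLOADING = "uploading"
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Only these two stages end the wait.
TERMINAL_STATUSES = {SessionStatus.COMPLETED, SessionStatus.FAILED}


def wait_for_result(
    fetch_status: Callable[[], SessionStatus],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> SessionStatus:
    """Poll until the session is Completed or Failed, or give up after max_polls."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("session did not reach a terminal status in time")
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. A caller should still inspect the returned status, because Failed is a normal return value rather than an exception.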
What to Expect
Processing Time: Actual processing time varies based on:
- File duration
- Audio quality
- System load
- Selected options (diarization adds time)
Accuracy: Transcription accuracy depends on:
- Audio quality (most important factor)
- Speaker clarity and accent
- Background noise levels
- Technical terminology (consider custom models)
Tips for Better Results
Before Recording
- Use quality equipment - Good microphone improves results significantly
- Choose quiet location - Minimize background noise
- Test audio levels - Avoid clipping and ensure adequate volume
During Recording
- Speak clearly - Moderate pace with clear enunciation
- Reduce crosstalk - Minimize overlapping speech
- Control environment - Turn off fans, close windows
After Recording
- Select correct language - Match the spoken language
- Enable diarization - For multi-speaker content
- Review carefully - Always review and edit transcripts
Common Use Cases
Interviews & Podcasts
- Enable speaker diarization
- Export as DOCX for editing
- Use timestamps for show notes
Meetings & Conferences
- Enable diarization for multiple speakers
- JSON export for data analysis
- Share via magic links for team review
Video Content
- Upload video files directly (audio extracted)
- Create captions from transcript
- Export SRT/VTT for video platforms
Academic & Research
- High-quality audio recommended
- Export as DOCX or PDF
- Use timestamps for citations
Troubleshooting
Upload Issues
- Verify file format is supported
- Check that the file size is within limits
- Ensure stable internet connection
- Try TUS upload for large files
See Troubleshooting > Upload Issues
Processing Issues
- Check session status for error messages
- Verify audio quality is adequate
- Ensure the correct language is selected
Quality Issues
- Improve audio quality at source
- Enable diarization for multi-speaker content
- Consider custom models for specialized content
See Troubleshooting for comprehensive solutions.
Next Steps
Learn more about specific topics:
- Creating Transcripts - Detailed upload and configuration
- Transcript Editor - Master the editing interface
- Speaker Management - Work with speaker diarization
- Export Options - Export formats and settings
- Supported Formats - Complete format list
Related Documentation
- System Requirements - Browser and technical requirements
- FAQ - Frequently asked questions
Ready to start? Continue to Creating Transcripts for detailed instructions.