When to Use Custom Models
Custom models in Scriptix let you train specialized speech recognition models on your own data.
What Are Custom Models?
Custom models are speech recognition models trained on your specific audio and transcript data. They are built on top of base language models.
Custom Models vs Glossaries
Scriptix provides two features for improving transcription accuracy:
Custom Models
What They Are:
- Trained language models using your audio and transcripts
- Built on a base language model
- Require training process
- Use training credits
Requirements:
- Audio and/or transcript files for training
- Training credits in organization
- Datasets uploaded (1-10 files per upload, max 10GB per file)
- Training time
How to Create:
- Navigate to Custom Models page
- Click "New Custom Model"
- Enter name and select base language
- Upload datasets (audio/transcripts)
- Train the model
- Use after training succeeds
Glossaries
What They Are:
- Term pair definitions (source → target language)
- Direct term replacement in transcripts
- No training required
Requirements:
- Source and target language selection
- Term pairs list (CSV format)
- Immediate availability
How to Create:
- Navigate to Glossaries page
- Click "Create Glossary"
- Enter name and description
- Select source and target language
- Add term pairs
- Use immediately in transcriptions
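Term pairs are supplied as a CSV list. As a minimal sketch, the Python below serializes term pairs to CSV text. The two-column layout (source term first, then target term), the missing header row, and the sample terms are assumptions for illustration; they are not a confirmed Scriptix import format.

```python
import csv
import io


def term_pairs_to_csv(pairs):
    """Serialize (source_term, target_term) pairs as CSV text.

    Assumed layout: one pair per row, source term first, no header row.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for source_term, target_term in pairs:
        # Strip stray whitespace so replacements match exactly.
        writer.writerow([source_term.strip(), target_term.strip()])
    return buffer.getvalue()


# Hypothetical sample terms, for illustration only.
sample_pairs = [("script ix", "Scriptix"), ("force alignment", "Force Alignment")]
glossary_csv = term_pairs_to_csv(sample_pairs)
```

Using the csv module rather than joining strings by hand keeps terms that contain commas or quotes correctly escaped.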
Using Both Together
Custom models and glossaries can be used in the same transcription:
Custom Models:
- Improve base speech recognition
- Train on audio patterns and vocabulary
- Apply at transcription level
Glossaries:
- Replace specific terms
- Handle exact term mappings
- Apply during or after transcription
Combined Benefit:
- Custom model improves general recognition
- Glossary handles specific term replacements
- Both work together for best accuracy
Base Language Models
Custom models build on base language models:
Language Selection:
- Choose from trainable languages
- Each language has a base model
- Custom model extends base model
Language Properties:
- Language has ID, key, name
- BCP47 code
- Trainable flag (is_trainable: true)
- Public or private
Trainable Languages:
- Only languages with is_trainable: true are available
- Selected during custom model creation
- Cannot change after creation
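The language properties and the trainable filter described above can be pictured with a small Python sketch. Only is_trainable appears in the text as a literal field name; the other field names, the record shape, and the sample records are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    # Mirrors the listed properties; names other than is_trainable are assumed.
    id: int
    key: str
    name: str
    bcp47: str
    is_trainable: bool
    is_public: bool


def trainable_languages(languages):
    """Return only the languages that can serve as a custom model base."""
    return [lang for lang in languages if lang.is_trainable]


# Hypothetical records, for illustration only.
languages = [
    Language(1, "en", "Language A", "en-US", True, True),
    Language(2, "nl", "Language B", "nl-NL", False, True),
]
selectable = trainable_languages(languages)
```

The dataclass is frozen to reflect that the base language cannot be changed once a custom model has been created.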
When to Create Custom Models
Create custom models when:
- You have audio and transcript training data
- You want to improve recognition for specific vocabulary
- You have training credits available
- You need ongoing transcription with specialized terms
When to Use Glossaries
Use glossaries when:
- You have specific term pairs to replace
- You need an immediate solution
- You don't have audio training data
- You want simple term replacements
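The two checklists above can be condensed into a rough decision helper. This is a sketch of the criteria as stated, not a Scriptix feature; the function and parameter names are hypothetical.

```python
def recommend_features(has_training_data, has_training_credits, has_term_pairs):
    """Suggest which feature(s) fit, based on the criteria listed above."""
    recommendations = []
    # A custom model needs both training data and training credits.
    if has_training_data and has_training_credits:
        recommendations.append("custom model")
    # A glossary only needs a list of term pairs.
    if has_term_pairs:
        recommendations.append("glossary")
    return recommendations
```

Both features can be recommended at once, which matches the guidance that custom models and glossaries can be used in the same transcription.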
Training Requirements
To train a custom model:
Data Requirements:
- Audio files (.wav, .mp3, .m4a, .flac) OR
- Transcript files (.vtt, .srt, .txt) OR
- Manifest files (.jsonl) OR
- Combination of above
- 1-10 files per upload
- Maximum 10GB per file
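Before uploading, a batch can be checked locally against the data requirements above. The following Python sketch is a hypothetical pre-flight check, not the validation that Scriptix itself runs. Reading "10GB" as 10 * 1024**3 bytes is an assumption.

```python
import os

AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}
TRANSCRIPT_EXTENSIONS = {".vtt", ".srt", ".txt"}
MANIFEST_EXTENSIONS = {".jsonl"}
ALLOWED_EXTENSIONS = AUDIO_EXTENSIONS | TRANSCRIPT_EXTENSIONS | MANIFEST_EXTENSIONS

MIN_FILES_PER_UPLOAD = 1
MAX_FILES_PER_UPLOAD = 10
# Assumption: "10GB" means 10 * 1024**3 bytes; the platform may count differently.
MAX_FILE_SIZE_BYTES = 10 * 1024**3


def validate_upload(files):
    """Check a batch of (filename, size_in_bytes) tuples; return a list of problems."""
    problems = []
    if not MIN_FILES_PER_UPLOAD <= len(files) <= MAX_FILES_PER_UPLOAD:
        problems.append(f"expected 1-10 files per upload, got {len(files)}")
    for filename, size in files:
        # Compare extensions case-insensitively, so "clip.WAV" is accepted.
        extension = os.path.splitext(filename)[1].lower()
        if extension not in ALLOWED_EXTENSIONS:
            problems.append(f"{filename}: unsupported file type")
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"{filename}: larger than 10GB")
    return problems
```

An empty list means the batch meets the stated limits. Returning every problem at once, rather than stopping at the first, lets you fix a whole batch in one pass.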
Organization Requirements:
- Training credits available
- Custom model feature access
Process:
- Create custom model
- Upload datasets
- Validation runs
- Training starts
- Wait for completion (status changes to "Success")
- Use model in transcriptions
Force Alignment
Use Force Alignment to prepare training data:
What It Does:
- Adds timestamps to plain text transcripts
- Uses audio + existing transcript
- Creates properly formatted training files
When to Use:
- You have audio files
- You have plain text transcripts without timestamps
- You need timestamped data for training
How to Use:
- Navigate to workspace (Home)
- Click "STT Session"
- Select Force Alignment option
- Upload audio file
- Provide transcript text
- Process alignment
- Download result
- Upload to custom model as training dataset
Accessing Custom Models
Custom Models Page:
- Navigate from main menu
- View all of your organization's models
- Create new models
- Edit existing models
Permissions:
- All authenticated users can create and edit
- Only ORGADMIN and SYSOP can delete models
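The permission rules above amount to a simple role check. The Python sketch below only illustrates that logic. The function names are hypothetical, and only the ORGADMIN and SYSOP role names come from the text.

```python
# Roles allowed to delete custom models, as stated above.
DELETE_ROLES = {"ORGADMIN", "SYSOP"}


def can_create_or_edit(is_authenticated):
    """Any authenticated user may create and edit custom models."""
    return bool(is_authenticated)


def can_delete(is_authenticated, role):
    """Only authenticated ORGADMIN or SYSOP users may delete custom models."""
    return bool(is_authenticated) and role in DELETE_ROLES
```

For example, can_delete(True, "ORGADMIN") is True, while an authenticated user with any other role can still create and edit models but cannot delete them.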
Model Management:
- Create models
- Upload datasets
- Train models
- View training status
- Use trained models in transcriptions
Next Steps
- Create a custom model on Custom Models page
- Upload training datasets
- Train the model
- Use trained model in STT Session
- Or create glossary for term replacements
Improve accuracy! Use custom models and glossaries together for best transcription results.
Note: Custom model features and availability vary based on your subscription plan. Contact your organization administrator for details about custom model access and capabilities.
Related Topics
- Using Custom Models - Learn about deployment
- Glossaries - Alternative for terminology
- Contact your administrator for custom model access