What are Custom Models?
Custom language models are speech recognition models trained on your own data and tailored to your domain, which improves transcription accuracy for your content.
Overview
A custom model is a specialized version of one of Scriptix's base language models that has been trained with your own audio and transcripts. This training helps the model better recognize:
- Domain-specific terminology - Medical, legal, technical, or industry-specific terms
- Product and company names - Your organization's specific vocabulary
- Accents and speaking styles - Particular speakers or regional accents
- Audio environments - Specific recording conditions or setups
How Custom Models Work
Base Models vs. Custom Models
Base Models:
- Pre-trained on general speech data
- Work well for standard conversational speech
- Available immediately for all users
- Cover common vocabulary and patterns
Custom Models:
- Built on top of a base language model
- Trained with your specific audio and transcripts
- Optimized for your domain and use case
- Require training time and data preparation
The Training Process
- Select Base Language - Choose which language to build upon (English, Dutch, French, etc.)
- Upload Training Data - Provide audio files with corresponding transcripts
- Train the Model - The system learns from your data (typically takes several hours or longer)
- Use in Production - Select your custom model when transcribing new audio
Benefits of Custom Models
Improved Accuracy
- 15-30% better accuracy on specialized terminology
- Fewer errors on domain-specific words
- Better recognition of uncommon names and terms
- Reduced post-transcription editing time
Consistency
- Same terminology spelled consistently
- Product names recognized correctly every time
- Standard formatting for specialized terms
Time Savings
- Less manual correction needed
- Faster transcript finalization
- More productive workflow
Domain Expertise
- Models understand your industry's language
- Better handling of technical jargon
- Improved recognition of field-specific patterns
Use Cases
Medical & Healthcare
What it helps with:
- Medical terminology (diagnoses, procedures, medications)
- Anatomical terms
- Clinical abbreviations
- Healthcare provider names
Example:
- Before: "The patient has a my o cardial infarction"
- After: "The patient has a myocardial infarction"
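One way to quantify the difference between the two transcripts above is word error rate (WER): the number of word-level edits needed to turn a transcript into the reference text, divided by the number of words in the reference. This page does not say which metric is behind the accuracy figures, so treat WER here as an assumption. The helper `word_error_rate` is illustrative and not part of any Scriptix API. A minimal Python sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the transcript
                curr[j - 1] + 1,     # extra word in the transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


reference = "The patient has a myocardial infarction"
before = "The patient has a my o cardial infarction"
after = "The patient has a myocardial infarction"

print(word_error_rate(reference, before))  # 0.5
print(word_error_rate(reference, after))   # 0.0
```

The split term counts as one substitution plus two extra words against a six-word reference, giving 0.5, while the corrected transcript scores 0.0.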
Legal
What it helps with:
- Legal terminology and procedures
- Case names and citations
- Party names and titles
- Court-specific vocabulary
Example:
- Before: "The plaintiff filed a motion for summary judgment"
- After: "The plaintiff filed a Motion for Summary Judgment"
Corporate & Business
What it helps with:
- Company and product names
- Internal terminology and acronyms
- Department and project names
- Industry-specific jargon
Example:
- Before: "We're using post gress sequel database"
- After: "We're using PostgreSQL database"
Technical & IT
What it helps with:
- Software and technology names
- Programming terminology
- Technical specifications
- System and tool names
Requirements
Training Data
To create a custom model, you need:
Audio Files:
- High-quality recordings
- Representative of your target content
- Ideally 2-20 hours of audio
- WAV, MP3, or other standard formats
Transcripts:
- Accurate text transcriptions of the audio
- Properly formatted and edited
- Matching the audio exactly
- Plain text format
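Before uploading, it can help to sanity-check a training set against the requirements above. The sketch below is one possible local check, under several assumptions that are not from Scriptix: audio and transcripts sit in one folder, each `.wav` file has a transcript with the same name and a `.txt` extension, and only WAV files are measured (Python's standard `wave` module cannot read MP3). The names `check_training_set` and `wav_duration_seconds` are hypothetical.

```python
import wave
from pathlib import Path

# "Ideally 2-20 hours of audio", taken from the requirements above.
MIN_HOURS, MAX_HOURS = 2.0, 20.0


def wav_duration_seconds(path: Path) -> float:
    """Duration of a WAV file, computed from its frame count and sample rate."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_training_set(folder: Path) -> dict:
    """Pair each .wav with a same-named .txt file and summarize the set."""
    missing, empty, total_seconds = [], [], 0.0
    for audio in sorted(folder.glob("*.wav")):
        transcript = audio.with_suffix(".txt")
        if not transcript.exists():
            missing.append(audio.name)
            continue
        if not transcript.read_text(encoding="utf-8").strip():
            empty.append(transcript.name)
            continue
        # Only audio with a usable transcript counts toward the total.
        total_seconds += wav_duration_seconds(audio)

    hours = total_seconds / 3600
    return {
        "hours": hours,
        "missing_transcripts": missing,
        "empty_transcripts": empty,
        "within_recommended_range": MIN_HOURS <= hours <= MAX_HOURS,
    }
```

Calling `check_training_set(Path("training_data"))` returns a small report. It cannot confirm that a transcript matches its audio exactly, so that still needs a manual review.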
Subscription Plan
Custom models may require:
- A specific subscription tier
- A training capacity allocation
- Storage for model data
Contact your organization administrator for details about custom model availability on your plan.
Comparison with Glossaries
Custom models and glossaries both improve accuracy but work differently:
Custom Models
Best for:
- Large volumes of specialized content
- Comprehensive domain coverage
- Ongoing, regular use
- Multiple types of terminology
Requires:
- Training data (audio + transcripts)
- Training time (hours to days)
- Higher-tier plan
Glossaries
Best for:
- Specific terms and names
- Quick implementation
- Limited vocabulary (fewer than 500 terms)
- Occasional use
Requires:
- List of terms only
- No training time (available immediately)
- A plan that includes glossaries (most plans do)
Recommendation: Use both together for maximum accuracy. Train a custom model for your domain, and add a glossary for specific product names or recent terminology.
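The comparison above can be condensed into a rough rule of thumb. The sketch below is only an illustration of that reasoning, not an official decision procedure: the 500-term limit comes from the glossary guidance and the 2-hour minimum from the training data guidance, while the function name `recommend_approach` and its parameters are hypothetical.

```python
GLOSSARY_TERM_LIMIT = 500   # glossaries suit "fewer than 500 terms"
MIN_TRAINING_HOURS = 2.0    # lower end of the suggested 2-20 hours of audio


def recommend_approach(term_count: int, training_hours: float, regular_use: bool) -> str:
    """Suggest a starting point based on vocabulary size, data, and usage."""
    can_train = training_hours >= MIN_TRAINING_HOURS
    needs_model = term_count >= GLOSSARY_TERM_LIMIT or regular_use

    if needs_model and can_train:
        # Matches the recommendation to combine both for maximum accuracy.
        return "custom model + glossary"
    if needs_model:
        return "glossary now; gather training data for a custom model"
    return "glossary"


print(recommend_approach(term_count=50, training_hours=0, regular_use=False))
# glossary
print(recommend_approach(term_count=1200, training_hours=10, regular_use=True))
# custom model + glossary
```

Plan availability is not modeled here, since it depends on your subscription.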
Getting Started
Ready to create a custom model?
- Assess Your Needs - Review When to Use Custom Models
- Prepare Data - Gather audio and transcripts
- Create Model - Follow the Create Custom Model guide
- Train - Learn about the Training Process
- Use - Start Using Your Model in transcriptions
Limitations
What Custom Models Cannot Do:
- Transcribe languages other than the one they were trained on (each language needs its own model)
- Handle completely new vocabulary without retraining
- Compensate for poor audio quality
Training Limitations:
- Training requires a minimum amount of quality training data
- Training takes time (several hours to complete)
- Models may need periodic retraining as vocabulary evolves
Next Steps
- When to Use Custom Models - Determine if they're right for you
- Create Custom Model - Step-by-step creation guide
- Glossaries - Alternative accuracy option
Transform your transcription accuracy! Custom models bring domain expertise to speech recognition.