Training Process

Learn how to train your custom language model with your audio and transcript data.

Training Overview

Training is the process of teaching your custom model to recognize speech patterns, vocabulary, and characteristics specific to your domain. This requires:

  1. Training Data - Audio files with matching transcripts
  2. Data Upload - Adding your files to the model
  3. Training Execution - Running the training process
  4. Validation - Confirming successful training

Before Training

Data Requirements

Audio Files:

  • Format: WAV, MP3, M4A, or other standard audio formats
  • Quality: Clear recordings with minimal background noise
  • Duration: Ideally 2-20 hours of audio in total (see Data Quantity under Data Preparation Tips)
  • Content: Representative of your target transcriptions

Transcript Files:

  • Format: Plain text (.txt)
  • Content: Exact transcription of the audio
  • Accuracy: Must match audio precisely
  • Encoding: UTF-8 recommended
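
The encoding recommendation can be checked before upload. The following is a minimal sketch, assuming your transcripts sit in one local folder; the function name and folder layout are illustrative and not part of the product.

```python
from pathlib import Path


def find_non_utf8(transcript_dir: str) -> list[str]:
    """Return the names of .txt files that do not decode as UTF-8."""
    bad = []
    for path in sorted(Path(transcript_dir).glob("*.txt")):
        try:
            # Decoding the raw bytes raises if the file is not valid UTF-8.
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path.name)
    return bad
```

Any file the function reports can be re-saved as UTF-8 in a text editor before you upload it.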

Audio-Transcript Pairing:

  • Each audio file must have a corresponding transcript
  • Filenames should match (e.g., recording1.mp3 + recording1.txt)
  • The transcript must contain exactly what is spoken in its paired audio file
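
The filename pairing rule above can be verified locally before upload. This is a sketch under assumptions: the audio and transcript files share one folder, and the extension set below is illustrative, so extend it to match the formats you actually use.

```python
from pathlib import Path

# Assumption: adjust this set to the audio formats you upload.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a"}


def find_unpaired(data_dir: str) -> tuple[list[str], list[str]]:
    """Return (audio files without a transcript, transcripts without audio)."""
    audio = {}
    transcripts = {}
    for path in Path(data_dir).iterdir():
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in AUDIO_EXTENSIONS:
            audio[path.stem] = path.name
        elif suffix == ".txt":
            transcripts[path.stem] = path.name
    # Files are paired when they share the same name without the extension.
    missing_transcript = sorted(audio[s] for s in audio.keys() - transcripts.keys())
    missing_audio = sorted(transcripts[s] for s in transcripts.keys() - audio.keys())
    return missing_transcript, missing_audio
```

Both returned lists should be empty before you upload. The check compares filenames only; it cannot tell whether a transcript's content matches its audio.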

Data Preparation Tips

Audio Quality:

  • Use high-quality recordings when possible
  • Keep the recording environment consistent
  • Prefer clear speech with minimal overlapping speakers
  • Match the audio characteristics of your target use case

Transcript Accuracy:

  • Keep transcripts highly accurate (at least 95% accuracy)
  • Include all spoken words
  • Use proper punctuation and capitalization
  • Match the audio exactly, word for word

Data Quantity:

  • More data generally produces better results
  • Minimum: Usually 1-2 hours of audio
  • Recommended: 5-10 hours for good results
  • Optimal: 10-20+ hours for best results
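
To see where a dataset falls against these figures, you can total the audio duration locally. The sketch below makes two assumptions: it reads only uncompressed WAV files (Python's standard `wave` module does not read MP3 or M4A), and it assigns the ranges this page leaves undefined (for example, 2-5 hours) to the lower tier. The function names are illustrative.

```python
import wave
from pathlib import Path


def wav_hours(data_dir: str) -> float:
    """Total duration, in hours, of the .wav files in data_dir."""
    total_seconds = 0.0
    for path in Path(data_dir).glob("*.wav"):
        with wave.open(str(path), "rb") as wav:
            # Duration in seconds = frame count / frames per second.
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600


def data_tier(hours: float) -> str:
    """Map total audio hours to the tiers listed above (boundaries assumed)."""
    if hours < 1:
        return "below minimum"
    if hours < 5:
        return "minimum"
    if hours < 10:
        return "recommended"
    return "optimal"
```

For example, `data_tier(wav_hours("training_data"))` reports the tier for a folder named `training_data` (a hypothetical path).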

Accessing Training Interface

  1. Navigate to the Custom Models page
  2. Click your model name or the Edit (pencil) icon
  3. The model detail page opens
  4. The training interface is displayed

Training Status States

Your model progresses through these states:

1. Not Running (Gray Badge)

What it means:

  • Model created but no training started
  • No training data uploaded yet

What to do:

  • Upload training data (audio + transcripts)
  • Prepare to start training

2. Ready to Run (Amber Badge)

What it means:

  • Training data has been uploaded
  • Model is ready to begin training
  • Waiting for you to start the process

What to do:

  • Review uploaded data
  • Click "Start Training" or "Train Model"
  • Training will begin

3. Running (Blue Badge)

What it means:

  • Training is actively in progress
  • System is processing your data
  • Model is learning from your audio and transcripts

What to do:

  • Wait for training to complete
  • Training typically takes several hours
  • Monitor progress if status updates are available
  • Do not interrupt the process (for example, do not delete the model while it is training)

Duration:

  • Depends on data volume
  • Typically 2-12 hours
  • Larger datasets take longer

4. Success (Green Badge)

What it means:

  • Training completed successfully
  • Model is ready to use
  • Can be selected for transcription

What to do:

  • Test accuracy with sample audio
  • Compare results against the base model
  • Start using the model for production transcriptions

5. Failed (Red Badge)

What it means:

  • Training encountered an error
  • Model is not ready to use
  • Issue needs investigation

What to do:

  • Check error messages or logs (if available)
  • Review training data quality
  • Retry training after fixing any issues you find
  • Contact support if the problem persists
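
The five states above can be summarized as a small lookup table. This is only an illustrative sketch for scripts or internal notes, not the product's API; the names are hypothetical, and it records only the transitions this page describes.

```python
from enum import Enum


class TrainingStatus(Enum):
    NOT_RUNNING = "Not Running"
    READY_TO_RUN = "Ready to Run"
    RUNNING = "Running"
    SUCCESS = "Success"
    FAILED = "Failed"


# Badge colors as listed in the state descriptions above.
BADGE_COLOR = {
    TrainingStatus.NOT_RUNNING: "gray",
    TrainingStatus.READY_TO_RUN: "amber",
    TrainingStatus.RUNNING: "blue",
    TrainingStatus.SUCCESS: "green",
    TrainingStatus.FAILED: "red",
}

# Condensed "What to do" guidance for each state.
NEXT_ACTION = {
    TrainingStatus.NOT_RUNNING: "upload audio and transcripts",
    TrainingStatus.READY_TO_RUN: "review the data and start training",
    TrainingStatus.RUNNING: "wait for training to complete",
    TrainingStatus.SUCCESS: "test the model, then use it for transcription",
    TrainingStatus.FAILED: "check errors and data quality, then retry",
}

# Only the transitions this page describes. What the status shows between
# uploading new data and retraining is not documented, so it is left out.
DOCUMENTED_TRANSITIONS = {
    TrainingStatus.NOT_RUNNING: {TrainingStatus.READY_TO_RUN},
    TrainingStatus.READY_TO_RUN: {TrainingStatus.RUNNING},
    TrainingStatus.RUNNING: {TrainingStatus.SUCCESS, TrainingStatus.FAILED},
    TrainingStatus.SUCCESS: set(),
    TrainingStatus.FAILED: set(),
}


def describe(label: str) -> str:
    """Turn a status label as shown in the interface into a one-line summary."""
    status = TrainingStatus(label)
    return f"{status.value} ({BADGE_COLOR[status]} badge): {NEXT_ACTION[status]}"
```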

Uploading Training Data

Via Model Detail Page

  1. Open your model (click Edit or model name)
  2. Look for training data upload section
  3. Upload your files:
    • Audio files
    • Corresponding transcript files
  4. The system validates and processes the files
  5. The status changes to "Ready to Run" when the upload is complete

Data Organization

Best Practice:

  • Organize files in pairs
  • Use clear, matching filenames
  • Example:
    • session01.mp3 + session01.txt
    • meeting_02.wav + meeting_02.txt
    • call_003.mp3 + call_003.txt

Starting Training

Once data is uploaded and status is "Ready to Run":

  1. Review uploaded data
  2. Click "Start Training" or "Train Model" button
  3. Confirm training start (if prompted)
  4. Status changes to "Running" (blue)
  5. Training begins processing

Note: Training cannot be stopped once it has started. Make sure your data is complete and correct before you begin.

During Training

What Happens

The system:

  1. Processes your audio files
  2. Analyzes the matching transcripts
  3. Learns patterns and vocabulary
  4. Optimizes the model for your domain
  5. Validates the training results

Monitoring Progress

Status Updates:

  • Check the Training Status badge
  • A progress percentage may be shown (if available)
  • An estimated completion time may be shown (if available)

What You Can Do:

  • Monitor status periodically
  • Continue other work
  • Do not delete the model while training
  • Wait for completion
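
If you have some way to read the status from a script, periodic monitoring can be sketched as a polling loop. This page documents only the web interface, so `get_status` below is a hypothetical callable that you would supply; it is assumed to return the status label as text. The 24-hour default timeout mirrors the point at which the Troubleshooting section suggests contacting support.

```python
import time
from typing import Callable

# Labels at which training has finished, per the status list on this page.
TERMINAL_STATES = {"Success", "Failed"}


def wait_for_training(
    get_status: Callable[[], str],  # hypothetical: returns the current status label
    poll_seconds: float = 300,
    timeout_seconds: float = 24 * 3600,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status is Success or Failed, or raise on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"status is still {status!r} after the timeout")
        sleep(poll_seconds)
```

Polling every few minutes is enough, since training runs for hours.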

Training Duration

Typical Timelines:

  • Small dataset (1-3 hours of audio): 2-4 hours of training
  • Medium dataset (5-10 hours of audio): 4-8 hours of training
  • Large dataset (10-20 hours of audio): 8-12+ hours of training

Factors Affecting Duration:

  • Amount of training data
  • Audio file sizes
  • System load
  • Model complexity

After Training

Successful Training

When status shows "Success" (green):

  1. The model is ready to use
  2. It is available for selection when you transcribe
  3. It should improve accuracy for your domain

Next Steps:

  • Test the model with sample audio
  • Compare accuracy to base model
  • Start using in production transcriptions

Failed Training

If status shows "Failed" (red):

  1. Check for error messages in model details
  2. Review training data quality
  3. Common issues:
    • Audio-transcript mismatch
    • Poor audio quality
    • Insufficient data
    • File format issues

Troubleshooting:

  • Verify audio and transcript pairs match
  • Check file formats are supported
  • Ensure transcripts are accurate
  • Retry with different or higher-quality data
  • Contact support with error details

Retraining

When to Retrain

Consider retraining your model when:

  • Vocabulary has evolved significantly
  • New terminology has emerged
  • Accuracy has decreased
  • You have additional high-quality training data
  • Scheduled maintenance is due (for example, quarterly or twice a year)

How to Retrain

  1. Gather new or additional training data
  2. Open your existing model
  3. Upload new data
  4. Start training again
  5. The new training run replaces the previous version

Note: Retraining overwrites the previous version of the model. Test the retrained model before using it in production.

Best Practices

Data Quality

Ensure High Quality:

  • Accurate transcripts (95%+ accuracy)
  • Clear audio recordings
  • Consistent audio quality
  • Representative of target use case

Incremental Approach

Start Small, Expand:

  1. Begin with 2-5 hours of your best-quality data
  2. Train and test the model
  3. Evaluate accuracy improvements
  4. Add more data if needed
  5. Retrain with expanded dataset

Testing

Validate Training Success:

  1. Run the model on test audio
  2. Compare to base model results
  3. Measure accuracy improvement
  4. Verify that domain-specific terms are recognized
  5. Gather user feedback
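
One common way to measure the improvement in steps 2 and 3 is word error rate (WER): the word-level edit distance between a trusted reference transcript and the model output, divided by the number of reference words. This page does not say how accuracy is measured, so treat the sketch below as one reasonable choice. It lowercases text and splits on whitespace without stripping punctuation, and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative comparison on one made-up sentence.
reference = "the patient has acute myocardial infarction"
base_output = "the patient has a cute myocardial infraction"
custom_output = "the patient has acute myocardial infarction"

print(f"base WER:   {word_error_rate(reference, base_output):.2f}")
print(f"custom WER: {word_error_rate(reference, custom_output):.2f}")
```

A lower WER for the custom model on the same test audio indicates that training helped. Use test audio that was not part of the training data so the comparison is fair.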

Maintenance

Regular Updates:

  • Review model performance quarterly
  • Add new terminology as needed
  • Retrain with fresh data periodically
  • Keep training data organized and backed up

Troubleshooting

Training Won't Start

Problem: Training cannot be started even though data has been uploaded

Solutions:

  • Verify status is "Ready to Run"
  • Check that all data uploaded correctly
  • Ensure every audio file has a matching transcript
  • Try refreshing the page

Training Stuck at "Running"

Problem: The status stays at "Running" and does not progress

Solutions:

  • Wait longer (training typically takes 2-12 hours)
  • Refresh the page to check for updates
  • Contact support if the status has not changed after 24 hours

Training Failed

Problem: Status shows "Failed"

Solutions:

  • Review error messages
  • Check training data quality
  • Verify that the file formats are supported
  • Ensure each transcript matches its audio file
  • Try with different data
  • Contact support with details

Model Not Available After Success

Problem: Training succeeded but the model cannot be selected for transcription

Solutions:

  • Refresh the page
  • Check model list
  • Verify status is green "Success"
  • Contact support if issue persists

Next Steps

After successful training:

  • Use in Transcription - Apply your trained model
  • Test with sample audio
  • Compare accuracy improvements
  • Deploy to production use

Training complete! Your custom model is ready to improve transcription accuracy.