
Create Custom Model

Custom models improve transcription accuracy for specialized vocabularies, industry terms, or unique audio conditions.

What Are Custom Models?

Custom models are trained language models that enhance speech recognition:

  • Built on base language models
  • Trained with your specific audio and transcripts
  • Improve accuracy for domain-specific terminology
  • Require training data (audio files and/or transcripts)
  • Must be trained before use

Requirements

To Create a Custom Model:

  • Name for the model
  • Base language selection (from trainable languages)
  • Organization membership

To Train a Custom Model:

  • Dataset files uploaded (audio and/or transcripts)
  • Training credits available
  • Model validation passes

Creating a New Custom Model

From Custom Models Page:

  1. Navigate to Custom Models page
  2. Click "New Custom Model" button
  3. Modal opens with creation form
  4. Enter model name (required)
  5. Select base language from dropdown (required)
    • Only trainable languages shown
    • Language determines base model used
  6. Click "Create"
  7. Model created and appears in list

After Creation:

  • Model shows in custom models list
  • Training status: "Not Running"
  • Model cannot be used until trained
  • Must upload datasets and train

Custom Model List

Columns Displayed:

Column            Description                         Info Tooltip
Name              Model name                          -
Training Status   Current training state with badge   Training progress indicator
Base Language     Language name (e.g., "English")     Base language model used
Last Modified     Last update date                    -

Training Status Values:

  • Not Running (gray badge) - Not started or failed
  • Ready to Run (amber badge) - Datasets uploaded, ready to train
  • Running (blue badge) - Currently training
  • Success (green badge) - Training completed successfully
  • Failed (red badge) - Training failed
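The status-to-badge mapping above can be sketched as a small lookup. This is an illustrative sketch only: the `TrainingStatus` enum and the `badge_for` helper are hypothetical names, not part of the product; only the labels and colors come from the list above.

```python
from enum import Enum


class TrainingStatus(Enum):
    """Training states and badge colors as listed above (member names are illustrative)."""

    NOT_RUNNING = ("Not Running", "gray")
    READY_TO_RUN = ("Ready to Run", "amber")
    RUNNING = ("Running", "blue")
    SUCCESS = ("Success", "green")
    FAILED = ("Failed", "red")

    def __init__(self, label: str, badge: str) -> None:
        # Unpack the tuple value into two readable attributes.
        self.label = label
        self.badge = badge


def badge_for(label: str) -> str:
    """Return the badge color for a status label shown in the list."""
    for status in TrainingStatus:
        if status.label == label:
            return status.badge
    raise ValueError(f"unknown training status: {label!r}")
```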

Actions Available:

  • Edit - Opens model details page
  • Delete - Remove model (ORGADMIN or SYSOP only)

Pagination:

  • 25 items per page (default)
  • Options: 25, 50, 75, 100 per page
  • Manual pagination controls at bottom

Model Details Page

Open the model details page by clicking the model name or choosing the Edit action.

Page Sections:

Model Information Cards:

  • Name - Editable input field
  • Training Status - Badge with status
  • Base Language Model - Language name (read-only)
  • Last Modified - Date (read-only)

Header Actions:

  • Train the Language - Starts training (unavailable while training is running or after it has succeeded)
  • Update - Saves name changes

Uploading Datasets

Datasets are training files used to improve the model.

Dataset Types:

  1. TRANSCRIPT - Text transcripts (.vtt, .srt, .txt)
  2. TEST - Test transcripts for validation (.vtt, .srt, .txt)
  3. AUDIO - Audio files (.wav, .mp3, .m4a, .flac)
  4. MANIFEST - Manifest files (.jsonl)

Supported File Formats:

  • Audio: .wav, .mp3, .m4a, .flac
  • Transcripts: .vtt, .srt, .txt
  • Manifest: .jsonl

Upload New Dataset:

  1. Click "New Dataset" button on model details page
  2. Modal opens
  3. Select dataset type:
    • Auto-detect (default)
    • TRANSCRIPT
    • AUDIO
  4. Upload files (1 to 10 per upload):
    • Click upload area or drag-and-drop
    • Files validated against selected type
    • Maximum 10GB per file
  5. Click "Create"
  6. Files uploaded and added to datasets list

File Validation:

  • Minimum: 1 file required
  • Maximum: 10 files per upload
  • File size limit: 10GB per file
  • Format validation based on selected type
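The validation rules above could be checked before upload with something like the following. This is a minimal sketch, not the product's implementation: `validate_upload` and its constants are hypothetical names, and it assumes "10GB" means 10 GiB, which the documentation does not specify.

```python
from pathlib import PurePath

MAX_FILES = 10
MAX_FILE_BYTES = 10 * 1024**3  # assumption: "10GB" interpreted as 10 GiB

# Extensions per dataset type, taken from the Dataset Types list.
ALLOWED_EXTENSIONS = {
    "TRANSCRIPT": {".vtt", ".srt", ".txt"},
    "TEST": {".vtt", ".srt", ".txt"},
    "AUDIO": {".wav", ".mp3", ".m4a", ".flac"},
    "MANIFEST": {".jsonl"},
}


def validate_upload(files: list[tuple[str, int]], dataset_type: str) -> list[str]:
    """Return a list of problems; an empty list means the upload passes.

    `files` holds (file name, size in bytes) pairs.
    """
    problems = []
    if not 1 <= len(files) <= MAX_FILES:
        problems.append(f"expected 1 to {MAX_FILES} files, got {len(files)}")
    allowed = ALLOWED_EXTENSIONS.get(dataset_type)
    if allowed is None:
        problems.append(f"unknown dataset type: {dataset_type}")
        return problems
    for name, size in files:
        if PurePath(name).suffix.lower() not in allowed:
            problems.append(f"{name}: extension not valid for {dataset_type}")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 10GB")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a form show all issues in a single pass.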

Auto-Detection:

  • Detects type from file extension
  • Uses first file's extension
  • Falls back to TRANSCRIPT for unknown types
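The auto-detection rules above can be sketched as follows. The function name `detect_dataset_type` and the extension table are assumptions for illustration. In particular, mapping `.jsonl` to MANIFEST is inferred from the format list, and because TEST shares its extensions with TRANSCRIPT, this sketch never returns TEST.

```python
from pathlib import PurePath

# Assumed mapping, built from the Supported File Formats list.
EXTENSION_TO_TYPE = {
    ".wav": "AUDIO",
    ".mp3": "AUDIO",
    ".m4a": "AUDIO",
    ".flac": "AUDIO",
    ".vtt": "TRANSCRIPT",
    ".srt": "TRANSCRIPT",
    ".txt": "TRANSCRIPT",
    ".jsonl": "MANIFEST",
}


def detect_dataset_type(file_names: list[str]) -> str:
    """Detect the dataset type from the first file's extension.

    Unknown extensions fall back to TRANSCRIPT, as described above.
    """
    if not file_names:
        raise ValueError("at least one file is required")
    extension = PurePath(file_names[0]).suffix.lower()
    return EXTENSION_TO_TYPE.get(extension, "TRANSCRIPT")
```

Because only the first file is inspected, a mixed selection such as `["clip.mp3", "notes.txt"]` would be treated as AUDIO, so it is safer to upload one file type per dataset.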

Dataset Management

Dataset List Columns:

  • Name
  • Type (TRANSCRIPT, TEST, AUDIO, MANIFEST)
  • Duration (if available)
  • URL
  • Start Time
  • End Time

Dataset Actions:

  • Delete - Remove dataset file (via three-dot menu)

Pagination:

  • 25 items per page (default)
  • Options: 25, 50, 75, 100 per page

Training the Model

Training Process:

  1. Upload datasets (audio and/or transcripts)
  2. Check status - Model shows "Ready to Run" when datasets uploaded
  3. Click "Train the Language" button
  4. Confirmation dialog appears
  5. Click "Train" to confirm
  6. Validation runs - Model validates datasets
  7. Training starts - Status changes to "Running"
  8. Wait for completion - Status changes to "Success" or "Failed"

Training Status Messages:

Not Running:

  • Upload datasets to get started
  • Add audio files and/or transcripts
  • Status will change to "Ready to Run"

Ready to Run:

  • Datasets uploaded
  • Click "Train the Language" to start
  • Training credits required

Running:

  • Training in progress
  • Wait for completion
  • Duration depends on dataset size

Success:

  • Training completed successfully
  • Model ready to use
  • Select model when creating transcripts

Failed:

  • Training failed
  • Check datasets and try again
  • Upload new datasets if needed

Training Requirements:

  • Training credits available in organization
  • Valid datasets uploaded
  • Datasets pass validation
  • Cannot train while already training
  • Cannot train if training succeeded (status 4)
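The requirements above amount to a precondition check before training starts. The sketch below is hypothetical: `can_start_training` is not a documented function. It uses the numeric status codes this page lists (3 for Running, 4 for Success) and a single `datasets_valid` flag standing in for "uploaded and passing validation".

```python
RUNNING = 3  # status code for "Running"
SUCCESS = 4  # status code for "Success"


def can_start_training(
    status: int, credits_available: int, datasets_valid: bool
) -> tuple[bool, str]:
    """Return (allowed, reason) for a training request."""
    if status == RUNNING:
        return False, "training is already running"
    if status == SUCCESS:
        return False, "training already succeeded"
    if credits_available <= 0:
        return False, "no training credits available"
    if not datasets_valid:
        return False, "datasets missing or failed validation"
    return True, "ok"
```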

Training Errors:

  • validation_failed - Datasets failed validation
  • invalid_audio_format - Audio files invalid
  • invalid_transcript_format - Transcript files invalid
  • no-claims-available - No training credits available
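A client that receives these error codes might translate them into user-facing messages with a simple lookup. This is an illustrative sketch: `describe_training_error` is a hypothetical helper, and the codes are copied exactly as listed above, including the hyphenated `no-claims-available`.

```python
TRAINING_ERRORS = {
    "validation_failed": "Datasets failed validation",
    "invalid_audio_format": "Audio files invalid",
    "invalid_transcript_format": "Transcript files invalid",
    "no-claims-available": "No training credits available",
}


def describe_training_error(code: str) -> str:
    """Return a readable message for a training error code."""
    # Unrecognized codes are surfaced rather than hidden.
    return TRAINING_ERRORS.get(code, f"Unknown training error: {code}")
```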

Using Trained Models

After training succeeds:

  1. Navigate to Home (workspace) page
  2. Click "STT Session" to create new transcript
  3. Select language from dropdown
  4. Select custom model from model dropdown
    • Trained custom models appear in list
    • Model name shown with base language
  5. Upload or select audio file
  6. Start transcription
  7. Model used for improved accuracy

Force Alignment:

  • Use Force Alignment mode in STT Session
  • Provide audio and existing transcript
  • Custom model improves alignment accuracy
  • Creates time-synced transcript

Updating Models

Update Model Name:

  1. Open model details page
  2. Edit name in Name field
  3. Click "Update" button
  4. Name saved

Language and Organization:

  • Base language cannot be changed after creation
  • Organization ID fixed at creation
  • To change language, create new model

Deleting Models

Delete Custom Model:

  1. Locate model in custom models list
  2. Click three-dot menu
  3. Select "Delete"
  4. Confirmation dialog appears
  5. Confirm deletion
  6. Model permanently removed

Delete Restrictions:

  • Only ORGADMIN or SYSOP can delete
  • Delete action not available for other roles
  • Deletion is permanent
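The role restriction above can be expressed as a simple membership check that decides which row actions to show. The helper names below are hypothetical, and "MEMBER" in the usage note is a made-up role used only for illustration; only the ORGADMIN and SYSOP role names and the Edit and Delete actions come from this page.

```python
# Roles allowed to delete custom models, per the restrictions above.
DELETE_ROLES = frozenset({"ORGADMIN", "SYSOP"})


def can_delete_model(role: str) -> bool:
    """Return True if the role may delete custom models."""
    return role in DELETE_ROLES


def row_actions(role: str) -> list[str]:
    """Return the actions to show for a model row."""
    actions = ["Edit"]
    if can_delete_model(role):
        actions.append("Delete")
    return actions
```

For example, `row_actions("MEMBER")` would yield only `["Edit"]`.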

Delete Datasets:

  1. Open model details page
  2. Locate dataset in datasets list
  3. Click three-dot menu on dataset row
  4. Select "Delete"
  5. Confirmation dialog appears
  6. Confirm deletion
  7. Dataset removed from model

Info Banner

Information banner on custom models page explains:

  • What custom models are
  • How they work with training data
  • Link to Force Alignment feature for creating transcripts
  • How they improve accuracy
  • Instructions to select model after training

Info Banner Sections:

  • What are custom models
  • Easy setup with training data
  • Use Force Alignment feature (link to Home page)
  • Improves accuracy over time
  • Select after training in STT Session

Getting Started Guide

A getting-started modal appears on the first visit to the custom models page.

Guide Sections:

  1. What are custom models - Explanation of functionality
  2. What you need - Requirements for training
  3. Only have audio files - Information about Force Alignment
  4. After training - How to use trained model

Access Guide:

  • Click "Getting Started" button in header
  • Modal opens with guide information
  • Click "Got It" to close

Guide Tracking:

  • Shown automatically on first visit
  • Stored in localStorage: customModelsGuideShown
  • Can reopen manually via button

Training Status Details

Banner on model details page shows current status and help:

Banner Contents:

  • Title with current training status
  • Help message appropriate for status
  • Link to Force Alignment feature
  • File requirements note

Banner Behavior:

  • Always visible on model details page
  • Expanded by default when status is "Success" (4)
  • Collapsed by default for other statuses
  • Click to expand/collapse

Status-Specific Help:

  • Status 1 (Not Running): Upload datasets to start
  • Status 2 (Ready to Run): Click Train button
  • Status 3 (Running): Wait for completion
  • Status 4 (Success): Model ready to use
  • Status 5 (Failed): Try again with new datasets
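The status codes, help text, and default banner state described above can be combined into one lookup. This is a sketch under assumptions: `banner_state` and the dictionary layout are hypothetical, while the codes, labels, and messages are taken from this section.

```python
# Status code -> (label, help message), as listed above.
STATUS_HELP = {
    1: ("Not Running", "Upload datasets to start"),
    2: ("Ready to Run", "Click Train button"),
    3: ("Running", "Wait for completion"),
    4: ("Success", "Model ready to use"),
    5: ("Failed", "Try again with new datasets"),
}


def banner_state(status: int) -> dict:
    """Build the initial banner state for a numeric training status."""
    label, help_text = STATUS_HELP[status]
    return {
        "title": label,
        "help": help_text,
        # The banner starts expanded only for Success (4).
        "expanded": status == 4,
    }
```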

Best Practices

Creating Models:

  • Use descriptive names
  • Select correct base language
  • One model per use case or domain

Uploading Datasets:

  • Upload quality audio files
  • Include accurate transcripts
  • Use domain-specific content
  • Upload multiple files for better training
  • Stay within 10GB per file limit

Training:

  • Upload all datasets before training
  • Verify datasets are correct
  • Ensure training credits available
  • Wait for training to complete before using

Using Models:

  • Select trained model in STT Session
  • Use for appropriate language only
  • Test accuracy improvements
  • Retrain if needed with more data

Model created! Now proceed to upload training data and start the training process.
