Train Custom Model

Start the training process and monitor progress for your custom model.

Start Training

Endpoint

POST /api/v3/custom_models/{id}/run

Authentication

Requires API key with custom_models:write scope.

Request

Path Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| id | integer | Yes | Model ID |

Headers

| Header | Value | Required |
| --- | --- | --- |
| Authorization | Bearer YOUR_API_KEY | Yes |
| Content-Type | application/json | Yes |

Request Body

{}

Note: An empty JSON object ({}) must be sent as the request body.

Prerequisites

Before starting training, the model must:

  • Have training_status = 2 (Ready to Run)
  • Have at least 5 hours of training audio uploaded
  • Have at least one test data file uploaded (recommended)
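
A quick way to confirm readiness is to fetch the model and check its training_status before calling the run endpoint. A minimal pre-flight sketch (the uploaded-hours requirement is assumed to be enforced server-side, since this guide does not document a field for it):

import requests

BASE_URL = "https://api.scriptix.io/api/v3/custom_models"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def is_ready_to_train(model_id):
    """Return True if the model reports training_status 2 (Ready to Run)."""
    response = requests.get(f"{BASE_URL}/{model_id}", headers=HEADERS)
    response.raise_for_status()
    return response.json()["training_status"] == 2

# Usage
if is_ready_to_train(123):
    print("Model 123 is ready to train")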

Response

Success Response

Status Code: 200 OK

{
  "id": 123,
  "model_key": "custom_model_123",
  "name": "Medical Cardiology EN",
  "training_status": 3,
  "status_message": "Training started successfully",
  "training_progress": 0,
  "started_training_at": "2025-01-16T08:00:00Z",
  "updated_at": "2025-01-16T08:00:00Z"
}

Training status changes from 2 (Ready to Run) to 3 (Running).

Examples

cURL

curl -X POST https://api.scriptix.io/api/v3/custom_models/123/run \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{}'

Python

import requests

def start_training(model_id):
    """Start training for a custom model."""
    url = f"https://api.scriptix.io/api/v3/custom_models/{model_id}/run"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    }

    response = requests.post(url, headers=headers, json={})
    model = response.json()

    print(f"Training started for model {model_id}")
    print(f"Status: {model['status_message']}")
    print(f"Started at: {model['started_training_at']}")

    return model

# Usage
model = start_training(123)

JavaScript

import axios from 'axios';

async function startTraining(modelId) {
  const response = await axios.post(
    `https://api.scriptix.io/api/v3/custom_models/${modelId}/run`,
    {},
    {
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      }
    }
  );

  const model = response.data;
  console.log(`Training started: ${model.status_message}`);
  return model;
}

// Usage
await startTraining(123);

Error Responses

409 Conflict - Model Not Ready

{
  "error": "Conflict",
  "message": "Cannot start training: model not ready",
  "error_code": "MODEL_NOT_READY",
  "details": {
    "current_status": 1,
    "required_status": 2,
    "message": "Upload training data before starting training"
  }
}

Solution: Upload training data first.

409 Conflict - Training Already Running

{
  "error": "Conflict",
  "message": "Training already in progress",
  "error_code": "TRAINING_IN_PROGRESS",
  "details": {
    "training_progress": 45
  }
}

Solution: Wait for current training to complete.

400 Bad Request - Insufficient Data

{
  "error": "Bad Request",
  "message": "Insufficient training data",
  "error_code": "INSUFFICIENT_DATA",
  "details": {
    "current_hours": 2.5,
    "required_hours": 5.0
  }
}

Solution: Upload more training data (minimum 5 hours).
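
These responses can be handled programmatically by branching on error_code. A minimal sketch, assuming the error body is JSON in the shape shown above:

import requests

def start_training_safe(model_id):
    """Start training and print a human-readable reason if the API refuses."""
    url = f"https://api.scriptix.io/api/v3/custom_models/{model_id}/run"
    headers = {
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    }

    response = requests.post(url, headers=headers, json={})
    if response.status_code == 200:
        return response.json()

    error = response.json()
    code = error.get("error_code")

    if code == "MODEL_NOT_READY":
        print("Upload training data before starting training.")
    elif code == "TRAINING_IN_PROGRESS":
        progress = error.get("details", {}).get("training_progress")
        print(f"Training already running at {progress}%. Wait for it to finish.")
    elif code == "INSUFFICIENT_DATA":
        details = error.get("details", {})
        print(f"Only {details.get('current_hours')} h uploaded; "
              f"{details.get('required_hours')} h required.")
    else:
        print(f"Unexpected error: {error.get('message')}")
    return None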

Monitor Training Progress

Endpoint

GET /api/v3/custom_models/{id}

Poll Training Status

import time
import requests

def poll_training_status(model_id, poll_interval=60):
    """
    Poll training status until completion or failure.

    Args:
        model_id: Model ID to monitor
        poll_interval: Seconds between polls (default 60)

    Returns:
        Final model object when training completes or fails
    """
    url = f"https://api.scriptix.io/api/v3/custom_models/{model_id}"
    headers = {"Authorization": "Bearer YOUR_API_KEY"}

    print(f"Monitoring training for model {model_id}...")
    print("This typically takes 6-12 hours\n")

    while True:
        response = requests.get(url, headers=headers)
        model = response.json()

        status = model['training_status']
        message = model['status_message']

        if status == 3:  # Running
            progress = model.get('training_progress', 0)
            print(f"[{time.strftime('%H:%M:%S')}] Training: {progress}% - {message}")

        elif status == 4:  # Success
            print("\n✓ Training completed successfully!")
            metrics = model['training_metrics']
            print("\nPerformance Metrics:")
            print(f"  Word Error Rate: {metrics['word_error_rate']}%")
            print(f"  Accuracy Improvement: {metrics['accuracy_improvement']}%")
            print(f"  Training Duration: {metrics.get('training_hours', 'N/A')} hours")

            return model

        elif status == 5:  # Failed
            print("\n✗ Training failed!")
            print(f"Error: {model.get('error', 'Unknown error')}")

            return model

        else:
            print(f"Status: {message}")

        # Wait before next poll
        time.sleep(poll_interval)

# Usage
final_model = poll_training_status(123, poll_interval=60)

Advanced Polling with Timeout

import time
from datetime import datetime, timedelta

import requests

def poll_with_timeout(model_id, poll_interval=60, timeout_hours=24):
    """Poll with a timeout and simple retries on network errors."""
    url = f"https://api.scriptix.io/api/v3/custom_models/{model_id}"
    headers = {"Authorization": "Bearer YOUR_API_KEY"}

    start_time = datetime.now()
    timeout = timedelta(hours=timeout_hours)

    while datetime.now() - start_time < timeout:
        try:
            response = requests.get(url, headers=headers)
            model = response.json()

            status = model['training_status']

            if status == 3:  # Running
                progress = model.get('training_progress', 0)
                elapsed = datetime.now() - start_time
                print(f"[{elapsed}] Training: {progress}%")

                # Estimate time remaining
                if progress > 0:
                    total_time = (elapsed.total_seconds() / progress) * 100
                    remaining = total_time - elapsed.total_seconds()
                    print(f"  Estimated time remaining: {remaining/3600:.1f} hours")

            elif status in [4, 5]:  # Completed or failed
                return model

            time.sleep(poll_interval)

        except requests.exceptions.RequestException as e:
            print(f"Network error: {e}. Retrying...")
            time.sleep(poll_interval)

    raise TimeoutError(f"Training did not complete within {timeout_hours} hours")

# Usage
try:
    model = poll_with_timeout(123, poll_interval=60, timeout_hours=24)
except TimeoutError as e:
    print(f"Error: {e}")

Training Progress States

training_status = 3 (Running)
├─ training_progress: 0-10% → Data preprocessing
├─ training_progress: 10-30% → Initial training epochs
├─ training_progress: 30-70% → Main training
├─ training_progress: 70-90% → Fine-tuning
└─ training_progress: 90-100% → Final evaluation

training_status = 4 (Success)
└─ training_metrics available

training_status = 5 (Failed)
└─ error message available
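
For logging and dashboards it can help to map the numeric codes to labels. A small helper, sketched from the statuses used in this guide (it only covers the codes shown here):

# Status codes as used throughout this guide (not necessarily exhaustive).
TRAINING_STATUS_LABELS = {
    2: "Ready to Run",
    3: "Running",
    4: "Success",
    5: "Failed",
}

def describe_status(model):
    """Return a short human-readable status string for a model object."""
    status = model["training_status"]
    label = TRAINING_STATUS_LABELS.get(status, f"Unknown ({status})")
    if status == 3:
        return f"{label} - {model.get('training_progress', 0)}%"
    return label

# Usage: describe_status({"training_status": 3, "training_progress": 45}) -> "Running - 45%"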

Get Training Log

View detailed training logs for debugging and monitoring.

Endpoint

GET /api/v3/custom_models/{id}/log

Response

Status Code: 200 OK

{
  "model_id": 123,
  "log_entries": [
    {
      "timestamp": "2025-01-16T08:00:00Z",
      "level": "info",
      "message": "Training started"
    },
    {
      "timestamp": "2025-01-16T08:05:00Z",
      "level": "info",
      "message": "Preprocessing audio files..."
    },
    {
      "timestamp": "2025-01-16T08:15:00Z",
      "level": "info",
      "message": "Training epoch 1/20 - WER: 45.2%"
    },
    {
      "timestamp": "2025-01-16T09:00:00Z",
      "level": "info",
      "message": "Training epoch 5/20 - WER: 28.3%"
    },
    {
      "timestamp": "2025-01-16T15:30:00Z",
      "level": "info",
      "message": "Training completed - Final WER: 12.5%"
    }
  ],
  "total_entries": 156
}

Example

import requests

def get_training_log(model_id):
    """Fetch training logs."""
    url = f"https://api.scriptix.io/api/v3/custom_models/{model_id}/log"
    headers = {"Authorization": "Bearer YOUR_API_KEY"}

    response = requests.get(url, headers=headers)
    log_data = response.json()

    print(f"Training log for model {model_id}")
    print(f"Total entries: {log_data['total_entries']}\n")

    for entry in log_data['log_entries']:
        timestamp = entry['timestamp']
        level = entry['level'].upper()
        message = entry['message']
        print(f"[{timestamp}] {level}: {message}")

# Usage
get_training_log(123)
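
When debugging a failed run, it is often enough to look only at warnings and errors. A small variation on the example above (a sketch; the log entries in this guide all use level "info", so the "warning" and "error" levels are assumptions):

import requests

def get_log_problems(model_id, levels=("warning", "error")):
    """Fetch the training log and keep only entries at the given levels."""
    url = f"https://api.scriptix.io/api/v3/custom_models/{model_id}/log"
    headers = {"Authorization": "Bearer YOUR_API_KEY"}

    response = requests.get(url, headers=headers)
    entries = response.json().get("log_entries", [])

    # The "warning"/"error" level names are assumed for illustration.
    problems = [e for e in entries if e["level"] in levels]
    for entry in problems:
        print(f"[{entry['timestamp']}] {entry['level'].upper()}: {entry['message']}")
    return problems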

Training Timeline

Typical training timeline for a 20-hour dataset:

Hour 0:00 → Training started (status=3, progress=0%)
Hour 0:15 → Data preprocessing (progress=5%)
Hour 0:30 → Initial epochs begin (progress=10%)
Hour 2:00 → Epoch 5/20 (progress=30%)
Hour 4:00 → Epoch 10/20 (progress=50%)
Hour 6:00 → Epoch 15/20 (progress=75%)
Hour 7:30 → Final evaluation (progress=95%)
Hour 8:00 → Training completed (status=4, progress=100%)

Factors affecting duration:

  • Data size: More hours = longer training
  • Audio quality: Poor quality may require more epochs
  • System load: Training queue and resource availability

Complete Training Workflow

import json

def complete_training_workflow(model_id):
    """Complete workflow: start training → monitor → use model.

    Relies on start_training, poll_training_status and get_training_log
    defined above, plus a get_model helper wrapping
    GET /api/v3/custom_models/{id}.
    """

    # 1. Verify model is ready
    model = get_model(model_id)
    if model['training_status'] != 2:
        print(f"Error: Model not ready. Status: {model['status_message']}")
        return None

    print(f"Model ready: {model['name']}")

    # 2. Start training
    print("\nStarting training...")
    start_training(model_id)

    # 3. Monitor progress
    print("\nMonitoring progress...")
    final_model = poll_training_status(model_id, poll_interval=60)

    # 4. Check results
    if final_model['training_status'] == 4:
        print("\n✓ Training successful!")
        metrics = final_model['training_metrics']

        # Save metrics
        with open(f"model_{model_id}_metrics.json", 'w') as f:
            json.dump(metrics, f, indent=2)
        print(f"Metrics saved to model_{model_id}_metrics.json")

        # Model is now ready for use
        print("\nModel ready for production use:")
        print(f"  model_key: {final_model['model_key']}")
        print(f"  WER: {metrics['word_error_rate']}%")
        print(f"  Improvement: {metrics['accuracy_improvement']}%")

        return final_model

    else:
        print("\n✗ Training failed")
        print(f"Error: {final_model.get('error')}")

        # Get logs for debugging
        get_training_log(model_id)

        return None

# Usage
model = complete_training_workflow(123)

Training Metrics Explained

After successful training (training_status = 4), review these metrics:

{
  "training_metrics": {
    "word_error_rate": 12.5,
    "character_error_rate": 8.3,
    "accuracy_improvement": 18.7,
    "training_hours": 7.5,
    "test_word_error_rate": 11.2,
    "training_word_error_rate": 10.8
  }
}

Metric Definitions

| Metric | Description | Good Value |
| --- | --- | --- |
| word_error_rate | Test set WER (% of incorrect words) | < 15% |
| character_error_rate | Test set CER (% of incorrect characters) | < 10% |
| accuracy_improvement | % improvement over base model | > 15% |
| training_hours | Hours of audio used in training | 10+ hours |
| test_word_error_rate | Same as word_error_rate | < 15% |
| training_word_error_rate | Training set WER | Usually lower than test WER |

Interpreting Results

Excellent (WER < 10%):

  • Production-ready for most use cases
  • Professional transcription quality
  • Consider deploying immediately

Very Good (WER 10-15%):

  • Suitable for most applications
  • May need light editing
  • Good for internal use

Good (WER 15-25%):

  • Significant improvement over base
  • Consider adding more training data
  • Usable with moderate editing

Poor (WER > 25%):

  • Review training data quality
  • Ensure transcripts are accurate
  • Consider adding more diverse data
  • Check audio quality
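
These bands can be encoded in a small helper so deployment decisions are scriptable. A sketch using exactly the thresholds above:

def interpret_wer(word_error_rate):
    """Map a WER percentage onto the quality bands described above."""
    if word_error_rate < 10:
        return "Excellent: production-ready for most use cases"
    if word_error_rate < 15:
        return "Very Good: suitable for most applications, may need light editing"
    if word_error_rate <= 25:
        return "Good: usable with moderate editing; consider more training data"
    return "Poor: review training data and audio quality"

# Usage
print(interpret_wer(12.5))  # "Very Good: ..."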

Troubleshooting Training Failures

Common Failure Reasons

1. Insufficient Training Data

{
  "error": "Insufficient training data (minimum 5 hours required, found 2.3 hours)"
}

Solution: Upload more training data (minimum 5 hours, recommend 10+ hours).

2. Poor Transcript Quality

{
  "error": "Training failed: transcript accuracy below threshold"
}

Solution:

  • Review transcript accuracy (need 95%+)
  • Ensure transcripts match audio exactly
  • Fix any transcription errors

3. Audio/Transcript Mismatch

{
  "error": "Audio and transcript mismatch detected"
}

Solution:

  • Verify audio and transcript pairs match
  • Check transcript timing alignment
  • Ensure correct file pairs uploaded

4. System Error

{
  "error": "Training failed due to system error"
}

Solution:

  • Check training logs via /log endpoint
  • Contact support with model ID
  • May need to retry training
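
When a retry is appropriate (for example after a system error), re-issuing the run request is all that is needed. A minimal sketch that reviews the log first and then retries, reusing the helpers defined earlier in this page (whether the API accepts a new run request immediately after a failed attempt is an assumption):

def retry_training(model_id):
    """Review the log, then attempt to start training again.

    Assumption: the API accepts a new run request after a failed attempt;
    if it does not, one of the error responses documented above is returned.
    """
    get_training_log(model_id)      # review what went wrong first
    return start_training(model_id)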

Best Practices

1. Monitor Regularly

Poll every 60 seconds during training:

poll_training_status(model_id, poll_interval=60)

2. Set Timeouts

Don't poll indefinitely:

poll_with_timeout(model_id, timeout_hours=24)

3. Handle Errors

Implement retry logic for network errors:

try:
    model = get_model(model_id)
except requests.exceptions.RequestException:
    time.sleep(60)
    # Retry

4. Save Metrics

Always save training metrics for documentation:

with open(f"model_{model_id}_metrics.json", 'w') as f:
    json.dump(model['training_metrics'], f)

5. Notify on Completion

Set up notifications when training completes:

if model['training_status'] == 4:
    send_email(f"Model {model_id} training completed!")
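
The send_email helper above is not part of the API. A minimal stand-in using Python's standard smtplib (the SMTP host and addresses are placeholders to replace):

import smtplib
from email.message import EmailMessage

def send_email(body, subject="Custom model training update"):
    """Send a plain-text notification email (host and addresses are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = "alerts@example.com"
    msg["To"] = "you@example.com"
    msg.set_content(body)

    with smtplib.SMTP("smtp.example.com") as smtp:
        smtp.send_message(msg)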

Rate Limits

  • Start training: 5 requests/hour
  • Status checks: 100 requests/minute
  • Log retrieval: 20 requests/hour
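
Staying within these limits mostly means keeping the 60-second poll interval. If you do hit a limit, backing off is a reasonable guard; a sketch (the 429 status and Retry-After header are assumptions, since the rate-limit response format is not documented here):

import time
import requests

def get_with_rate_limit_backoff(url, headers, max_retries=3):
    """GET with a simple backoff when the API signals rate limiting.

    Assumes rate limiting is signalled with HTTP 429 and, optionally,
    a Retry-After header; adjust if the API behaves differently.
    """
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        wait = int(response.headers.get("Retry-After", 60))
        print(f"Rate limited; waiting {wait}s (attempt {attempt + 1}/{max_retries})")
        time.sleep(wait)
    return response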

Next Steps

After successful training:

  1. Test Model: Use in test transcriptions
  2. Validate Accuracy: Compare against production audio
  3. Deploy: Update applications to use model_key
  4. Monitor: Track real-world performance
  5. Iterate: Periodically retrain with new data

See Custom Models Overview for complete guide.