Batch Transcription API
Upload audio and video files for asynchronous speech-to-text transcription.
What is Batch Transcription?
Batch transcription processes pre-recorded audio/video files asynchronously:
- Upload an audio or video file
- Wait for processing (minutes to hours, depending on file size)
- Retrieve the transcript with timestamps, speakers, and formatting
Best for: Podcasts, interviews, meetings, video subtitles, recorded content
Quick Start
1. Upload File
curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=en" \
-F "audio_file=@meeting.mp3"
Response:
{
"id": "stt_abc123",
"status": "processing",
"language": "en",
"created_at": "2025-01-17T10:00:00Z"
}
2. Check Status
curl https://api.scriptix.io/api/v3/stt/stt_abc123/status \
-H "Authorization: Bearer YOUR_API_KEY"
Response:
{
"id": "stt_abc123",
"status": "completed",
"progress": 100,
"document_id": 456
}
3. Get Transcript
curl https://api.scriptix.io/api/v3/stt/stt_abc123/result \
-H "Authorization: Bearer YOUR_API_KEY"
Response:
{
"id": "stt_abc123",
"document_id": 456,
"transcript": "Hello, welcome to today's meeting...",
"segments": [
{
"start": 0.0,
"end": 2.5,
"text": "Hello, welcome to today's meeting",
"speaker": "Speaker 1"
}
]
}
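Since video subtitles are one of the listed use cases, the `segments` array above can be turned into subtitle cues. The following is a minimal sketch, not part of the Scriptix API: `format_timestamp` and `segments_to_srt` are hypothetical helper names, and the code assumes every segment carries `start`, `end`, and `text` as in the sample response, with `speaker` optional.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render transcript segments as numbered SRT cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        # Prefix the speaker label only when the segment has one.
        text = f"{speaker}: {seg['text']}" if speaker else seg["text"]
        cues.append(
            f"{index}\n"
            f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)


# Sample data copied from the response shown above.
result = {
    "segments": [
        {
            "start": 0.0,
            "end": 2.5,
            "text": "Hello, welcome to today's meeting",
            "speaker": "Speaker 1",
        }
    ]
}
print(segments_to_srt(result["segments"]))
```

The same loop can be adapted to other caption formats by changing the timestamp separator and cue layout.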
Supported File Formats
Audio Formats
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- M4A (.m4a)
- AAC (.aac)
- OGG (.ogg)
Video Formats
- MP4 (.mp4)
- MOV (.mov)
- AVI (.avi)
- MKV (.mkv)
- WEBM (.webm)
Note: Audio is extracted automatically from video files.
File Size Limits
| Upload Method | Max File Size | Best For |
|---|---|---|
| Standard Upload | 500 MB | Most files |
| TUS Upload | 5 GB | Large files, unstable connections |
For files larger than 500 MB, use TUS Upload.
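The size limits in the table can be checked on the client before uploading. This is a rough sketch under one stated assumption: the documentation does not say whether "MB" and "GB" are decimal or binary units, so the constants below assume binary units (1 MB = 1024 × 1024 bytes). The function names are hypothetical.

```python
import os

# Assumption: binary units. Adjust if the service counts decimal megabytes.
STANDARD_MAX_BYTES = 500 * 1024 * 1024   # 500 MB standard upload limit
TUS_MAX_BYTES = 5 * 1024 ** 3            # 5 GB TUS upload limit


def choose_upload_method(size_bytes: int) -> str:
    """Return 'standard' or 'tus' for a file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes <= STANDARD_MAX_BYTES:
        return "standard"
    if size_bytes <= TUS_MAX_BYTES:
        return "tus"
    raise ValueError("File exceeds the 5 GB TUS upload limit")


def choose_upload_method_for_file(path: str) -> str:
    """Look up the file size on disk and pick an upload method."""
    return choose_upload_method(os.path.getsize(path))


print(choose_upload_method(120 * 1024 * 1024))
```

TUS is also listed as the better choice for unstable connections, so a caller may prefer it for smaller files as well; the sketch only encodes the size rule.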
Processing Time
Typical processing time: 10-20% of audio duration
| Audio Duration | Processing Time |
|---|---|
| 10 minutes | 1-2 minutes |
| 1 hour | 6-12 minutes |
| 3 hours | 18-36 minutes |
Factors affecting speed:
- File size and format
- Audio quality
- Features enabled (speaker diarization, etc.)
- Current system load
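The 10-20% rule of thumb can be applied directly to estimate how long a job might take. A minimal sketch, assuming the typical range holds; as the factors above indicate, real processing time may fall outside it. `estimate_processing_minutes` is a hypothetical helper name.

```python
# Typical processing time as a percentage of audio duration (from the table above).
LOW_PERCENT = 10
HIGH_PERCENT = 20


def estimate_processing_minutes(audio_minutes: float) -> tuple[float, float]:
    """Return the (low, high) typical processing time in minutes."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    low = audio_minutes * LOW_PERCENT / 100
    high = audio_minutes * HIGH_PERCENT / 100
    return (low, high)


low, high = estimate_processing_minutes(60)
print(f"A 1-hour file typically takes {low:g}-{high:g} minutes")
```

An estimate like this is mainly useful for choosing a sensible first polling delay or a timeout for a job.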
Features
Speaker Diarization
Automatically detect and label different speakers:
curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=en" \
-F "diarization=true" \
-F "audio_file=@meeting.mp3"
Timestamps
Get word-level timestamps:
curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=en" \
-F "timestamps=word" \
-F "audio_file=@audio.mp3"
Custom Models
Use custom models for domain-specific accuracy:
curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "model=custom_model_123" \
-F "language=en" \
-F "audio_file=@medical.mp3"
Glossaries
Apply custom glossaries for terminology:
curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=en" \
-F "glossary_id=789" \
-F "audio_file=@audio.mp3"
Webhooks
Receive notifications when transcription completes:
curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=en" \
-F "webhook_url=https://yourapp.com/webhook" \
-F "audio_file=@audio.mp3"
See Webhooks Guide.
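The form fields used in the curl examples above can be collected in one place when calling the API from Python. This is a sketch only: `build_form_fields` is a hypothetical helper, and it assumes the features can be combined in a single request, which the examples above do not confirm (each shows one feature at a time).

```python
from typing import Optional


def build_form_fields(
    language: str = "en",
    diarization: bool = False,
    word_timestamps: bool = False,
    model: Optional[str] = None,
    glossary_id: Optional[int] = None,
    webhook_url: Optional[str] = None,
) -> dict[str, str]:
    """Build the multipart form fields shown in the curl examples."""
    fields = {"language": language}
    if diarization:
        fields["diarization"] = "true"
    if word_timestamps:
        fields["timestamps"] = "word"
    if model is not None:
        fields["model"] = model
    if glossary_id is not None:
        # Form values are sent as strings.
        fields["glossary_id"] = str(glossary_id)
    if webhook_url is not None:
        fields["webhook_url"] = webhook_url
    return fields


print(build_form_fields(diarization=True, glossary_id=789))
```

With the `requests` library, the resulting dictionary would be passed as `data=` alongside `files={'audio_file': f}`.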
Status Lifecycle
uploaded → queued → processing → completed
                               → failed
| Status | Description |
|---|---|
| uploaded | File uploaded successfully |
| queued | Waiting in processing queue |
| processing | Transcription in progress |
| completed | Transcription finished |
| failed | Error occurred |
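Client code usually only needs to know whether a job has reached a final state. The sketch below models the five statuses from the table as an enum; `JobStatus` and `is_terminal` are hypothetical names, and treating `completed` and `failed` as the only terminal states is an assumption read off the lifecycle diagram.

```python
from enum import Enum


class JobStatus(str, Enum):
    """Status values listed in the table above."""
    UPLOADED = "uploaded"
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: a job never leaves these two states once it reaches them.
TERMINAL_STATUSES = frozenset({JobStatus.COMPLETED, JobStatus.FAILED})


def is_terminal(status: str) -> bool:
    """Return True when no further status changes are expected.

    Raises ValueError for a status string that is not in the table.
    """
    return JobStatus(status) in TERMINAL_STATUSES


print(is_terminal("processing"))
print(is_terminal("completed"))
```

Raising on unknown values makes it obvious when the service introduces a status the client does not yet handle.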
API Endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v3/stt | Upload file for transcription |
| POST | /api/v3/stt/tus | Initialize TUS upload |
| GET | /api/v3/stt/{id}/status | Check transcription status |
| GET | /api/v3/stt/{id}/result | Get transcript result |
Pricing
Batch transcription is charged per audio minute:
| Plan | Price per Minute | Included Minutes |
|---|---|---|
| Free | - | 60 min/month |
| Bronze | $0.006 | 500 min/month |
| Silver | $0.005 | 2,000 min/month |
| Gold | $0.004 | 10,000 min/month |
| Enterprise | Custom | Custom |
Additional features:
- Speaker diarization: +$0.002/min
- Custom models: Included (Gold+)
- Priority processing: Included (Gold+)
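To get a feel for the numbers, the table can be turned into a small estimator. This is an illustration under loudly stated assumptions, not a billing specification: it assumes the per-minute price applies only to minutes beyond the plan's included allowance, and that the diarization surcharge applies to every diarized minute. Neither rule is confirmed above. The Free and Enterprise plans are left out because the table gives no per-minute price for them.

```python
from decimal import Decimal

# (price per minute in USD, included minutes per month), copied from the table.
PLANS = {
    "bronze": (Decimal("0.006"), 500),
    "silver": (Decimal("0.005"), 2000),
    "gold": (Decimal("0.004"), 10000),
}
DIARIZATION_SURCHARGE = Decimal("0.002")  # USD per diarized minute


def estimate_monthly_cost(plan: str, minutes_used: int,
                          diarization_minutes: int = 0) -> Decimal:
    """Estimate usage charges in USD for one month.

    Assumption: included minutes are free and only the overage is billed.
    Assumption: the diarization surcharge is billed on all diarized minutes.
    """
    price, included = PLANS[plan]
    billable = max(0, minutes_used - included)
    return price * billable + DIARIZATION_SURCHARGE * diarization_minutes


print(estimate_monthly_cost("bronze", 600))
```

`Decimal` is used instead of `float` so that small per-minute prices add up without rounding drift.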
Best Practices
1. Use Webhooks
Don't poll for status; use webhooks instead:
# ❌ Polling (wastes API calls)
while True:
    status = check_status(job_id)
    if status == 'completed':
        break
    time.sleep(10)
# ✅ Webhooks (efficient)
upload_file(audio, webhook_url='https://yourapp.com/webhook')
2. Specify Language
Always specify the language for best accuracy:
# ✅ Specify language
-F "language=en"
# ⚠️ Auto-detection (slower)
-F "language=auto"
3. Optimize Audio Files
- Use compressed formats (MP3, AAC) for faster upload
- Minimum sample rate: 16 kHz
- Mono audio is sufficient for most use cases
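The sample-rate and channel recommendations can be checked locally before uploading. The sketch below uses Python's standard `wave` module, so it only works for uncompressed PCM WAV files; other formats would need a different tool. `check_wav` is a hypothetical helper name.

```python
import wave

MIN_SAMPLE_RATE_HZ = 16_000  # Minimum sample rate recommended above


def check_wav(path: str) -> list[str]:
    """Return a list of warnings for a PCM WAV file (empty if none)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()

    warnings = []
    if rate < MIN_SAMPLE_RATE_HZ:
        warnings.append(f"Sample rate {rate} Hz is below the 16 kHz minimum")
    if channels > 1:
        warnings.append(f"{channels} channels found; mono is usually sufficient")
    return warnings
```

For example, `check_wav("meeting.wav")` returns an empty list when the file meets both recommendations.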
4. Handle Errors
Implement retry logic for failed transcriptions:
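import time

def transcribe_with_retry(file_path, max_retries=3):
    """Retry a transcription with exponential backoff (1s, 2s, 4s, ...)."""
    for attempt in range(max_retries):
        try:
            result = transcribe_file(file_path)
            # .get() tolerates result payloads that carry no 'status' field
            if result.get('status') != 'failed':
                return result
        except Exception:
            # Re-raise the original error once the last attempt has failed
            if attempt == max_retries - 1:
                raise
        # Back off before the next attempt, but not after the final one
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)
    raise RuntimeError(f"Transcription failed after {max_retries} attempts")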
Complete Example
import time

import requests

API_KEY = 'YOUR_API_KEY'
BASE_URL = 'https://api.scriptix.io/api/v3'


def transcribe_file(file_path, language='en'):
    # 1. Upload file
    with open(file_path, 'rb') as f:
        response = requests.post(
            f'{BASE_URL}/stt',
            headers={'Authorization': f'Bearer {API_KEY}'},
            files={'audio_file': f},
            data={'language': language, 'diarization': 'true'}
        )
    response.raise_for_status()  # Stop early on HTTP errors (4xx/5xx)
    job = response.json()
    job_id = job['id']
    print(f"Uploaded: {job_id}")

    # 2. Poll status
    while True:
        status_response = requests.get(
            f'{BASE_URL}/stt/{job_id}/status',
            headers={'Authorization': f'Bearer {API_KEY}'}
        )
        status_response.raise_for_status()
        status_data = status_response.json()
        if status_data['status'] == 'completed':
            break
        elif status_data['status'] == 'failed':
            raise Exception("Transcription failed")
        print(f"Progress: {status_data.get('progress', 0)}%")
        time.sleep(10)

    # 3. Get result
    result_response = requests.get(
        f'{BASE_URL}/stt/{job_id}/result',
        headers={'Authorization': f'Bearer {API_KEY}'}
    )
    result_response.raise_for_status()
    return result_response.json()


# Usage
result = transcribe_file('meeting.mp3', language='en')
print(result['transcript'])
Next Steps
- Upload Files - Detailed upload guide
- TUS Upload - Large file upload
- Check Status - Status polling
- Retrieve Results - Get transcripts
- Webhooks - Webhook setup
Ready to transcribe? Start with Upload Files.