Supported Languages

Scriptix supports 40+ languages for speech-to-text transcription.

Language Codes

Use ISO 639-1 two-letter language codes in API requests.
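
Checking a code on the client before sending a request can catch typos early. The following is a minimal sketch, not part of the Scriptix API: the set is copied from the language tables on this page, and `validate_language` is a hypothetical helper name.

```python
# Codes copied from the language tables on this page (illustrative, not fetched from the API).
SUPPORTED_CODES = {
    # European
    "en", "nl", "fr", "de", "es", "it", "pt", "pl", "cs", "da",
    "fi", "el", "hu", "no", "ro", "sk", "sv", "ru", "uk",
    # Middle Eastern
    "ar", "he", "tr", "fa",
    # Asian
    "zh", "ja", "ko", "hi", "th", "id", "vi", "ms", "ta", "te",
}


def validate_language(code: str) -> str:
    """Return the trimmed, lowercased code, or raise ValueError if it is not listed."""
    normalized = code.strip().lower()
    # "auto" is the automatic-detection value described later on this page.
    if normalized == "auto" or normalized in SUPPORTED_CODES:
        return normalized
    raise ValueError(f"Unsupported language code: {code!r}")
```

Calling `validate_language(" NL ")` returns `"nl"`, while an unlisted value such as `"xx"` raises `ValueError` before any request is made.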

All Supported Languages

European Languages

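| Language | Code | Native Name |
|----------|------|-------------|
| English | en | English |
| Dutch | nl | Nederlands |
| French | fr | Français |
| German | de | Deutsch |
| Spanish | es | Español |
| Italian | it | Italiano |
| Portuguese | pt | Português |
| Polish | pl | Polski |
| Czech | cs | Čeština |
| Danish | da | Dansk |
| Finnish | fi | Suomi |
| Greek | el | Ελληνικά |
| Hungarian | hu | Magyar |
| Norwegian | no | Norsk |
| Romanian | ro | Română |
| Slovak | sk | Slovenčina |
| Swedish | sv | Svenska |
| Russian | ru | Русский |
| Ukrainian | uk | Українська |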

Middle Eastern Languages

| Language | Code | Native Name |
|----------|------|-------------|
| Arabic | ar | العربية |
| Hebrew | he | עברית |
| Turkish | tr | Türkçe |
| Persian/Farsi | fa | فارسی |

Asian Languages

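| Language | Code | Native Name |
|----------|------|-------------|
| Chinese | zh | 中文 |
| Japanese | ja | 日本語 |
| Korean | ko | 한국어 |
| Hindi | hi | हिन्दी |
| Thai | th | ไทย |
| Indonesian | id | Bahasa Indonesia |
| Vietnamese | vi | Tiếng Việt |
| Malay | ms | Bahasa Melayu |
| Tamil | ta | தமிழ் |
| Telugu | te | తెలుగు |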

Usage Examples

Specify Language

curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=en" \
-F "audio_file=@audio.mp3"

Python

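import requests

# Open the file in a context manager so the handle is closed after the upload.
with open('audio.mp3', 'rb') as audio:
    response = requests.post(
        'https://api.scriptix.io/api/v3/stt',
        headers={'Authorization': 'Bearer YOUR_API_KEY'},
        files={'audio_file': audio},
        data={'language': 'en'},
    )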

Automatic Language Detection

For automatic language detection, use "auto":

curl -X POST https://api.scriptix.io/api/v3/stt \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "language=auto" \
-F "audio_file=@audio.mp3"

Note: Language detection works best with 10+ seconds of clear speech.
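
The note above can be turned into a small client-side check. This is a sketch under stated assumptions: `language_param` is a hypothetical helper, and the caller is assumed to already know roughly how many seconds of speech the clip contains. The 10-second threshold is the only value taken from this page.

```python
from typing import Optional, Tuple

MIN_DETECTION_SECONDS = 10.0  # threshold taken from the note above


def language_param(known_language: Optional[str],
                   speech_seconds: float) -> Tuple[str, bool]:
    """Return (value for the `language` field, whether detection is likely reliable)."""
    if known_language:
        # An explicit code does not depend on detection at all.
        return known_language, True
    # Fall back to auto-detection and flag clips shorter than the threshold.
    return "auto", speech_seconds >= MIN_DETECTION_SECONDS
```

A caller could log a warning, or ask the user for the language, whenever the second value is `False`.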

Language-Specific Features

Custom Models

Custom models can be trained for any supported language:

curl -X POST https://api.scriptix.io/api/v3/custom_models \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Medical Dutch",
"language": "nl",
"base_model": "medical"
}'

Translation

Translate transcripts between supported languages:

curl -X POST https://api.scriptix.io/api/v3/documents/123/translate \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"target_language": "fr"}'
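
The same request can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it. It assumes the translate endpoint lives under the same `https://api.scriptix.io/api/v3` base URL as the other examples on this page; `build_translate_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed base URL, matching the other examples on this page.
API_BASE = "https://api.scriptix.io/api/v3"


def build_translate_request(document_id: int, target_language: str,
                            api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks for a translation."""
    body = json.dumps({"target_language": target_language}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/documents/{document_id}/translate",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_translate_request(123, "fr", "YOUR_API_KEY")
# Passing `req` to urllib.request.urlopen would actually send it.
```

Separating request construction from sending makes it easy to inspect the URL and body before any network call happens.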

Best Practices

1. Always Specify Language

When you know the spoken language, specify its code instead of relying on auto-detection:

# ✅ Specify language (the audio file is passed separately via files=)
data = {'language': 'en'}

# ⚠️ Auto-detection (slower, less accurate)
data = {'language': 'auto'}

2. Use Correct Dialect

For languages with regional variants, use the base language code:

  • English (US/UK/AU) → en
  • Portuguese (BR/PT) → pt
  • Spanish (ES/LATAM) → es
  • Chinese (Simplified/Traditional) → zh
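
If your application stores locale tags such as `en-US` or `pt_BR`, they need to be reduced to the base code before being sent. A minimal sketch, assuming tags are separated by `-` or `_` (`base_language` is a hypothetical helper, not part of the API):

```python
import re


def base_language(tag: str) -> str:
    """Reduce a locale tag such as 'pt-BR' or 'zh_Hant' to its base language code."""
    # Keep only the part before the first '-' or '_' separator.
    primary = re.split(r"[-_]", tag.strip(), maxsplit=1)[0]
    return primary.lower()
```

With this helper, `en-US`, `en-GB`, and `en-AU` all map to `en`, matching the list above.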

3. Custom Models for Accents

For heavy accents or regional variations, train a custom model.

Language Support by Feature

| Feature | Supported Languages |
|---------|---------------------|
| Transcription | All 40+ languages |
| Custom Models | All 40+ languages |
| Glossaries | All 40+ languages |
| Translation | 30+ languages |
| Real-time | All 40+ languages |
| Speaker Diarization | All 40+ languages |

Accuracy by Language

Accuracy varies by language based on training data:

| Tier | Languages | WER Range |
|------|-----------|-----------|
| Tier 1 (Highest) | en, nl, de, fr, es | 5-10% |
| Tier 2 | it, pt, pl, ru, zh, ja | 8-15% |
| Tier 3 | Other languages | 10-20% |

WER = Word Error Rate (lower is better)
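
To measure WER on your own audio, compare a transcript against a hand-checked reference. The sketch below uses the standard definition (word-level edit distance divided by the number of reference words) and splits on whitespace only. How Scriptix normalizes text when computing its own figures is not stated here, so results may not match the table exactly.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, one wrong word in a four-word reference gives a WER of 0.25 (25%). Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.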

Requesting New Languages

Don't see your language? Contact us and include:

  • Language name and ISO code
  • Use case description
  • Expected monthly volume