Performance Tips
This guide outlines best practices for optimizing responsiveness and transcription quality when using the Scriptix Real-time API.
Performance is influenced by how you encode and stream audio. The size, frequency, and format of your audio chunks all affect both latency and accuracy.
Recommended Chunk Size
To achieve the best balance between speed and quality:
Audio Format | Recommended Chunk Size | Reason |
---|---|---|
PCM 16kHz (default) | 8 KB – 64 KB | Maintains low latency while preserving context |
PCM 8kHz (call center) | Adjust accordingly (about half the bytes for the same duration) | Lower bandwidth, slower response; requires tuning |
With 16kHz 16-bit single-channel PCM audio, 1 second of audio ≈ 32 KB of data (16,000 samples × 2 bytes). So, sending 256 ms of audio ≈ 8 KB.
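The arithmetic above can be sketched as a small helper. This is an illustration only: `pcm_chunk_bytes` is a hypothetical name, not part of the Scriptix API, and it assumes raw, uncompressed, single-channel PCM unless you pass other values.

```python
def pcm_chunk_bytes(duration_ms: int,
                    sample_rate_hz: int = 16_000,
                    bits_per_sample: int = 16,
                    channels: int = 1) -> int:
    """Return the number of bytes of raw PCM that cover duration_ms."""
    # Bytes produced per second of audio: rate x sample width x channels.
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return bytes_per_second * duration_ms // 1000


print(pcm_chunk_bytes(1000))                        # 32000 bytes for 1 second
print(pcm_chunk_bytes(256))                         # 8192 bytes for 256 ms
print(pcm_chunk_bytes(1000, sample_rate_hz=8_000))  # 16000 bytes at 8 kHz
```

Note that the "KB" figures in this guide are approximate: 1 second is 32,000 bytes, while 256 ms is exactly 8,192 bytes (8 KiB).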
Chunk Size vs. Performance
Chunk Size | Latency | Accuracy |
---|---|---|
~4 KB | ✅ Very fast | ⚠️ May reduce contextual quality |
8–32 KB | ✅ Fast | ✅ Good balance |
64 KB+ | ⚠️ Slower | ✅ High contextual accuracy |
Tip: Test with your actual audio source. Some streams benefit more from context than others.
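One contributor to the latency column is plain buffering time: with a live source, a chunk cannot be sent until all of its audio has been captured. The sketch below converts chunk sizes into capture time, assuming 16 kHz, 16-bit, single-channel PCM. `chunk_duration_ms` is an illustrative helper, not a Scriptix function, and server-side processing time is not modeled.

```python
def chunk_duration_ms(chunk_bytes: int,
                      sample_rate_hz: int = 16_000,
                      bytes_per_sample: int = 2,
                      channels: int = 1) -> float:
    """Return how many milliseconds of audio fit in chunk_bytes of raw PCM."""
    bytes_per_second = sample_rate_hz * bytes_per_sample * channels
    return chunk_bytes * 1000 / bytes_per_second


# Sizes taken from the table above, interpreted as KiB (1024 bytes).
for size_kib in (4, 8, 32, 64):
    duration = chunk_duration_ms(size_kib * 1024)
    print(f"{size_kib} KiB holds {duration:.0f} ms of audio")
```

Under these assumptions a 4 KiB chunk holds 128 ms of audio, while a 64 KiB chunk holds about 2 seconds, so a live stream has to wait that long before each large chunk can leave the client.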
Why Size Matters
Smaller chunks result in:
- Faster responses
- Less contextual information for the model
Larger chunks result in:
- Slower responses
- More accurate transcriptions due to richer context
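To compare these trade-offs on your own audio, it helps to re-chunk the same recording at several sizes. A minimal sketch, assuming raw 16-bit single-channel PCM already loaded into memory; `iter_chunks` is a hypothetical helper, and how each chunk is sent to the API is deliberately left out.

```python
from typing import Iterator


def iter_chunks(pcm: bytes, chunk_bytes: int, frame_bytes: int = 2) -> Iterator[bytes]:
    """Yield consecutive slices of pcm, each at most chunk_bytes long.

    chunk_bytes is rounded down to a multiple of frame_bytes so that a
    16-bit sample is never split across two chunks.
    """
    if chunk_bytes < frame_bytes:
        raise ValueError("chunk_bytes must hold at least one sample")
    chunk_bytes -= chunk_bytes % frame_bytes
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence at 16 kHz, 16-bit mono (32,000 bytes).
audio = bytes(32_000)
for size in (4_096, 8_192, 65_536):
    chunks = list(iter_chunks(audio, size))
    print(f"{size} byte chunks -> {len(chunks)} messages")
```

The last chunk is usually shorter than the rest; the sketch sends it as-is rather than padding it.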
Special Case: 8kHz Audio Models
Scriptix offers 8kHz private models for specific use cases like call center transcriptions.
If you're using an 8kHz model:
- Use 16-bit little-endian PCM audio
- Adjust chunk size to match the lower sample rate (e.g., 1 second ≈ 16 KB)
- Expect slightly higher latency; these models are optimized for narrow-band audio
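As a rough illustration of the format described above, the sketch below builds one second of 8 kHz, 16-bit little-endian PCM from a synthetic tone and confirms its size. The tone generator and the name `sine_pcm16le` are assumptions for demonstration; real input would come from your telephony or capture source.

```python
import math
import struct

SAMPLE_RATE_HZ = 8_000  # narrow-band rate used by the 8kHz models


def sine_pcm16le(duration_s: float, freq_hz: float = 440.0,
                 sample_rate_hz: int = SAMPLE_RATE_HZ) -> bytes:
    """Return a sine tone encoded as 16-bit little-endian mono PCM."""
    n = int(duration_s * sample_rate_hz)
    samples = [
        int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz))
        for i in range(n)
    ]
    # "<" selects little-endian byte order, "h" is a signed 16-bit integer.
    return struct.pack(f"<{n}h", *samples)


one_second = sine_pcm16le(1.0)
print(len(one_second))  # 16000 bytes: 8,000 samples x 2 bytes
```

Because each second is half the size of 16 kHz audio, a chunk of a given byte size covers twice as much time at 8 kHz.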
Contact Scriptix support if you're interested in using 8kHz models.
Final Best Practices
- Stream regularly: avoid sending large bursts or long gaps
- Maintain audio rate: a consistent format gives consistent results
- Monitor round-trip time (RTT): latency spikes may indicate buffer or network issues
- Test different chunk sizes: depending on your use case, smaller or larger chunks may yield better trade-offs
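Two of the practices above, steady pacing and RTT monitoring, can be sketched in a transport-agnostic way. Everything here is an assumption for illustration: `stream_paced` takes a hypothetical `send` callable in place of a real WebSocket client, and `RttMonitor` leaves open how you pair a sent chunk with a server response, since that depends on the protocol. The window size and spike factor are arbitrary starting points.

```python
import time
from collections import deque
from statistics import median
from typing import Callable, Iterable


def stream_paced(chunks: Iterable[bytes],
                 send: Callable[[bytes], None],
                 bytes_per_second: int = 32_000,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> None:
    """Send chunks at roughly real-time rate, avoiding bursts and drift."""
    start = clock()
    sent_bytes = 0
    for chunk in chunks:
        send(chunk)
        sent_bytes += len(chunk)
        # Time at which the audio sent so far would end in real time.
        target = start + sent_bytes / bytes_per_second
        delay = target - clock()
        if delay > 0:
            sleep(delay)


class RttMonitor:
    """Flag round-trip times that exceed a multiple of the recent median."""

    def __init__(self, window: int = 20, spike_factor: float = 3.0,
                 min_samples: int = 5):
        self.samples = deque(maxlen=window)
        self.spike_factor = spike_factor
        self.min_samples = min_samples

    def record(self, rtt_s: float) -> bool:
        """Store rtt_s and return True if it looks like a latency spike."""
        is_spike = (len(self.samples) >= self.min_samples
                    and rtt_s > self.spike_factor * median(self.samples))
        self.samples.append(rtt_s)
        return is_spike


# Demo: three 50 ms chunks of 16 kHz, 16-bit mono audio, "sent" into a list.
sent = []
stream_paced([bytes(1_600)] * 3, sent.append)
print(len(sent))
```

Pacing against the cumulative byte count, rather than sleeping a fixed time per chunk, keeps the stream from drifting when individual sends take longer than expected.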
Related Topics
- Audio Encoding: Accepted formats and conversion tips
- WebSocket Connection: Streaming setup and lifecycle
- Protocol: How transcript results are returned