# Rate Limits
API requests are rate-limited to ensure platform stability and fair usage.
## Rate Limit Tiers
Limits vary by subscription plan:
| Plan | Requests/Minute | Requests/Hour | Requests/Day | Concurrent |
|---|---|---|---|---|
| Free | 10 | 100 | 1,000 | 2 |
| Bronze | 30 | 500 | 5,000 | 5 |
| Silver | 60 | 2,000 | 20,000 | 10 |
| Gold | 120 | 5,000 | 50,000 | 20 |
| Enterprise | Custom | Custom | Custom | Custom |
Concurrent: Maximum simultaneous requests allowed.
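As an illustrative sketch, the tier table can be encoded client-side to budget requests across workers. The `PLAN_LIMITS` dictionary and `budget_per_worker` helper below are assumptions for illustration, not objects provided by the API (Enterprise limits are custom and omitted):

```python
# Rate limit tiers copied from the table above (Enterprise is custom, so omitted).
# This client-side lookup is an illustrative assumption, not an API object.
PLAN_LIMITS = {
    "free":   {"per_minute": 10,  "per_hour": 100,   "per_day": 1_000,  "concurrent": 2},
    "bronze": {"per_minute": 30,  "per_hour": 500,   "per_day": 5_000,  "concurrent": 5},
    "silver": {"per_minute": 60,  "per_hour": 2_000, "per_day": 20_000, "concurrent": 10},
    "gold":   {"per_minute": 120, "per_hour": 5_000, "per_day": 50_000, "concurrent": 20},
}

def budget_per_worker(plan: str, workers: int) -> int:
    """Split the per-minute budget across workers, capped at the plan's concurrency."""
    limits = PLAN_LIMITS[plan]
    workers = min(workers, limits["concurrent"])
    return limits["per_minute"] // workers

print(budget_per_worker("silver", 4))  # 60 req/min across 4 workers -> 15 each
```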
## Rate Limit Headers
Every API response includes rate limit information in its headers:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642089600
Content-Type: application/json
```
### Header Definitions
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window | 100 |
| X-RateLimit-Remaining | Requests remaining in the current window | 95 |
| X-RateLimit-Reset | Unix timestamp when the limit resets | 1642089600 |
### Reading Headers
```python
import requests
from datetime import datetime

response = requests.get(
    'https://api.scriptix.io/api/v3/documents',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

limit = int(response.headers.get('X-RateLimit-Limit', 0))
remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
reset = int(response.headers.get('X-RateLimit-Reset', 0))

print(f"Limit: {limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {datetime.fromtimestamp(reset)}")
```
## Rate Limit Exceeded
When you exceed a rate limit, you receive a 429 Too Many Requests response:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/json

{
  "error": "Rate Limit Exceeded",
  "message": "Too many requests. Please try again in 60 seconds.",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "retry_after": 60
}
```
### Response Fields
| Field | Description |
|---|---|
| error | Error type |
| message | Human-readable description |
| error_code | Machine-readable code |
| retry_after | Seconds until the limit resets |
### Retry-After Header
The Retry-After header indicates how many seconds to wait before retrying:
```http
Retry-After: 60
```
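The API's examples show Retry-After as delay-seconds, but per RFC 9110 the header may also carry an HTTP-date, so a defensive client can handle both forms. A minimal sketch (the `parse_retry_after` helper is an assumption, not part of any SDK):

```python
import time
from email.utils import parsedate_to_datetime

def parse_retry_after(value: str) -> float:
    """Return seconds to wait. Retry-After may be delay-seconds or an HTTP-date (RFC 9110)."""
    try:
        return max(0.0, float(value))
    except ValueError:
        # Fall back to the HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        dt = parsedate_to_datetime(value)
        return max(0.0, dt.timestamp() - time.time())

print(parse_retry_after("60"))  # 60.0
```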
## Handling Rate Limits
### 1. Exponential Backoff
Implement exponential backoff with retry logic:
```python
import time
import requests

def api_call_with_retry(url, headers, max_retries=5):
    """Make an API call with exponential backoff retry."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            # Rate limit exceeded
            retry_after = int(response.headers.get('Retry-After', 60))
            if attempt < max_retries - 1:
                # Back off exponentially, but never wait less than the server asks
                wait_time = max(retry_after, 2 ** attempt)
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")

# Usage
result = api_call_with_retry(
    'https://api.scriptix.io/api/v3/documents',
    {'Authorization': 'Bearer YOUR_API_KEY'}
)
```
### 2. Check Remaining Requests
Monitor remaining requests before making calls:
```python
import time
import requests

def check_rate_limit(headers):
    """Check the rate limit before making further requests."""
    response = requests.get(
        'https://api.scriptix.io/api/v3/me',
        headers=headers
    )
    remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
    if remaining < 10:
        reset = int(response.headers.get('X-RateLimit-Reset', 0))
        wait_time = reset - time.time()
        print(f"Warning: Only {remaining} requests remaining")
        print(f"Limit resets in {wait_time / 60:.1f} minutes")
    return remaining

# Usage
headers = {'Authorization': 'Bearer YOUR_API_KEY'}
remaining = check_rate_limit(headers)
if remaining > 0:
    # Make API calls
    pass
```
### 3. Request Queuing
Queue requests to stay within limits:
```python
import time
import requests
from queue import Queue

class RateLimitedAPI:
    def __init__(self, api_key, requests_per_minute=60):
        self.api_key = api_key
        self.requests_per_minute = requests_per_minute
        self.min_interval = 60.0 / requests_per_minute
        self.last_request = 0
        self.queue = Queue()

    def make_request(self, url):
        """Add a request to the queue."""
        self.queue.put(url)

    def process_queue(self):
        """Process queued requests, spacing them out to respect the rate limit."""
        while not self.queue.empty():
            url = self.queue.get()
            # Wait if needed to respect the rate limit
            elapsed = time.time() - self.last_request
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
            # Make the request
            response = requests.get(
                url,
                headers={'Authorization': f'Bearer {self.api_key}'}
            )
            self.last_request = time.time()
            yield response.json()

# Usage
api = RateLimitedAPI('YOUR_API_KEY', requests_per_minute=60)

# Queue requests
for i in range(100):
    api.make_request(f'https://api.scriptix.io/api/v3/documents/{i}')

# Process the queue (the generator ends once the queue is drained)
for result in api.process_queue():
    print(result)
```
### 4. Batch Operations
Use batch endpoints to reduce request count:
```python
# ❌ Multiple individual requests (100 requests)
for doc_id in doc_ids:
    doc = get_document(doc_id)

# ✅ Single batch request (1 request)
docs = batch_get_documents(doc_ids)
```
## Endpoint-Specific Limits
Some endpoints have additional limits:
### Custom Models
| Endpoint | Limit | Window |
|---|---|---|
| Create model | 10 | 1 hour |
| Start training | 5 | 1 hour |
| Upload data | 50 | 1 hour |
| Status checks | 100 | 1 minute |
### Batch Transcription
| Endpoint | Limit | Window |
|---|---|---|
| Upload file | 50 | 1 hour |
| TUS upload | 20 | 1 hour |
| Status checks | 200 | 1 minute |
| Result retrieval | 100 | 1 minute |
### Real-time API
| Limit Type | Value |
|---|---|
| Concurrent sessions | Plan-dependent |
| Audio chunks/second | 20 |
| Session duration | 4 hours max |
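Because these windows differ per endpoint, a client can track each endpoint's budget separately. Below is a minimal client-side sliding-window sketch; the endpoint keys and the `EndpointLimiter` class are illustrative assumptions, not part of the API:

```python
import time
from collections import defaultdict, deque

# (max requests, window in seconds) taken from the tables above; keys are illustrative
ENDPOINT_LIMITS = {
    "model.create": (10, 3600),
    "batch.upload": (50, 3600),
    "batch.status": (200, 60),
}

class EndpointLimiter:
    """Per-endpoint sliding-window counter; a client-side sketch, not server behavior."""
    def __init__(self, limits):
        self.limits = limits
        self.history = defaultdict(deque)

    def allow(self, endpoint, now=None):
        now = time.time() if now is None else now
        max_requests, window = self.limits[endpoint]
        h = self.history[endpoint]
        while h and h[0] <= now - window:  # drop timestamps outside the window
            h.popleft()
        if len(h) < max_requests:
            h.append(now)
            return True
        return False
```

A call site would check `allow("model.create")` before each request and sleep or queue when it returns False.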
## Avoiding Rate Limits
### 1. Use Webhooks
Instead of polling, use webhooks for notifications:
```python
# ❌ Polling (many requests)
while True:
    status = check_status(job_id)
    if status == 'completed':
        break
    time.sleep(5)  # Wastes rate limit

# ✅ Webhooks (zero polling requests)
configure_webhook('https://yourapp.com/webhook')
# Receive a notification when the job completes
```
See Batch API - Webhooks.
### 2. Cache Responses
Cache responses that don't change frequently:
```python
import json
import redis
from datetime import timedelta

cache = redis.Redis()

def get_document_cached(doc_id):
    """Get a document, with caching."""
    cache_key = f"doc:{doc_id}"
    # Check the cache first
    cached = cache.get(cache_key)
    if cached:
        return json.loads(cached)
    # Fetch from the API (uses rate limit)
    doc = api_get_document(doc_id)
    # Cache for 1 hour
    cache.setex(cache_key, timedelta(hours=1), json.dumps(doc))
    return doc
```
### 3. Implement Request Deduplication
Avoid duplicate requests:
```python
from functools import lru_cache
from datetime import datetime

@lru_cache(maxsize=1000)
def get_document_dedupe(doc_id, cache_time):
    """Deduplicate requests made within the same minute.

    cache_time is not used in the body; it only varies the cache key
    so entries expire when the minute changes.
    """
    return api_get_document(doc_id)

# Usage - repeated calls within the same minute hit the cache
current_minute = datetime.now().strftime('%Y-%m-%d %H:%M')
doc = get_document_dedupe(123, current_minute)
```
### 4. Optimize Pagination
Use larger page sizes to reduce request count:
```python
# ❌ Small pages (many requests)
for page in range(1, 101):  # 100 requests
    docs = get_documents(page=page, per_page=10)

# ✅ Large pages (fewer requests)
for page in range(1, 11):  # 10 requests
    docs = get_documents(page=page, per_page=100)
```
## Monitoring Rate Limit Usage
### Track Usage Over Time
```python
import time
import requests
from collections import deque

class RateLimitMonitor:
    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.requests = deque()

    def record_request(self):
        """Record a request timestamp."""
        now = time.time()
        self.requests.append(now)
        # Remove old requests outside the window
        cutoff = now - self.window
        while self.requests and self.requests[0] < cutoff:
            self.requests.popleft()

    def get_current_rate(self):
        """Get the number of requests in the current window."""
        cutoff = time.time() - self.window
        return sum(1 for t in self.requests if t > cutoff)

    def can_make_request(self, max_per_minute):
        """Check whether a request can be made without exceeding the limit."""
        return self.get_current_rate() < max_per_minute

# Usage
monitor = RateLimitMonitor(window_seconds=60)

def make_api_call(url):
    while not monitor.can_make_request(max_per_minute=60):
        print("Rate limit would be exceeded, waiting...")
        time.sleep(1)
    response = requests.get(url)  # add headers/params as needed
    monitor.record_request()
    return response
```
### Log Rate Limit Info
```python
import logging

def log_rate_limit(response):
    """Log rate limit information from a response."""
    limit = response.headers.get('X-RateLimit-Limit')
    remaining = response.headers.get('X-RateLimit-Remaining')
    if limit and remaining:
        remaining_int = int(remaining)
        limit_int = int(limit)
        usage_pct = ((limit_int - remaining_int) / limit_int) * 100
        logging.info(
            f"Rate limit: {remaining}/{limit} remaining ({usage_pct:.1f}% used)"
        )
        if remaining_int < limit_int * 0.1:  # Less than 10% remaining
            logging.warning("Rate limit almost exhausted!")
```
## Increasing Rate Limits
### Upgrade Plan
Higher plans have increased rate limits:
- Silver: 60 req/min (2x Bronze)
- Gold: 120 req/min (4x Bronze)
- Enterprise: Custom limits
### Contact Sales
For Enterprise custom limits:
- Email: sales@scriptix.io
- Include: expected usage patterns and use case
## Error Handling Example
Complete error handling with rate limiting:
```python
import time
import requests
from requests.exceptions import RequestException

def robust_api_call(url, headers, max_retries=5):
    """Make an API call with comprehensive error handling."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, headers=headers, timeout=30)
            # Success
            if response.status_code == 200:
                return response.json()
            # Rate limit
            elif response.status_code == 429:
                retry_after = int(response.headers.get('Retry-After', 60))
                wait_time = min(retry_after, 300)  # Wait at most 5 minutes
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
                continue
            # Other errors
            else:
                response.raise_for_status()
        except requests.exceptions.Timeout:
            print(f"Timeout on attempt {attempt + 1}")
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            else:
                raise
        except RequestException as e:
            print(f"Request error: {e}")
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            else:
                raise
    raise Exception(f"Failed after {max_retries} attempts")
```
## Best Practices Summary
- ✅ Respect the Retry-After header
- ✅ Implement exponential backoff
- ✅ Use webhooks instead of polling
- ✅ Cache responses when possible
- ✅ Monitor rate limit headers
- ✅ Use batch operations
- ✅ Implement request queuing
- ✅ Deduplicate requests
- ✅ Log rate limit warnings
- ✅ Upgrade plan if needed
## Related Documentation
- Authentication - API key management
- Error Codes - Error handling
- Batch API - Webhooks - Webhook setup
Note: Enterprise customers can request custom rate limits. Contact sales@scriptix.io.