# Rate Limits
API requests are rate-limited to ensure platform stability and fair usage.
## Rate Limit Tiers
Limits vary by subscription plan:
| Plan | Requests/Minute | Requests/Hour | Requests/Day | Concurrent |
|---|---|---|---|---|
| Free | 10 | 100 | 1,000 | 2 |
| Bronze | 30 | 500 | 5,000 | 5 |
| Silver | 60 | 2,000 | 20,000 | 10 |
| Gold | 120 | 5,000 | 50,000 | 20 |
| Enterprise | Custom | Custom | Custom | Custom |
Concurrent: Maximum simultaneous requests allowed.
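As an illustrative sketch, the tier table can be encoded client-side to budget requests across workers. The `PLAN_LIMITS` dictionary and `budget_per_worker` helper below are assumptions for illustration, not objects provided by the API (Enterprise limits are custom and omitted):

```python
# Rate limit tiers copied from the table above (Enterprise is custom, so omitted).
# This client-side lookup is an illustrative assumption, not an API object.
PLAN_LIMITS = {
    "free":   {"per_minute": 10,  "per_hour": 100,   "per_day": 1_000,  "concurrent": 2},
    "bronze": {"per_minute": 30,  "per_hour": 500,   "per_day": 5_000,  "concurrent": 5},
    "silver": {"per_minute": 60,  "per_hour": 2_000, "per_day": 20_000, "concurrent": 10},
    "gold":   {"per_minute": 120, "per_hour": 5_000, "per_day": 50_000, "concurrent": 20},
}

def budget_per_worker(plan: str, workers: int) -> int:
    """Split the per-minute budget across workers, capped at the plan's concurrency."""
    limits = PLAN_LIMITS[plan]
    workers = min(workers, limits["concurrent"])
    return limits["per_minute"] // workers

print(budget_per_worker("silver", 4))  # 60 req/min across 4 workers -> 15 each
```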
## Rate Limit Headers
Every API response includes rate limit information in its headers:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1642089600
Content-Type: application/json
```
### Header Definitions
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window | 100 |
| X-RateLimit-Remaining | Requests remaining in the current window | 95 |
| X-RateLimit-Reset | Unix timestamp when the limit resets | 1642089600 |
### Reading Headers
```python
import requests
from datetime import datetime

response = requests.get(
    'https://api.scriptix.io/api/v3/documents',
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)

limit = int(response.headers.get('X-RateLimit-Limit', 0))
remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
reset = int(response.headers.get('X-RateLimit-Reset', 0))

print(f"Limit: {limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {datetime.fromtimestamp(reset)}")
```
## Rate Limit Exceeded
When you exceed a rate limit, you receive a 429 Too Many Requests response:
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/json

{
  "error": "Rate Limit Exceeded",
  "message": "Too many requests. Please try again in 60 seconds.",
  "error_code": "RATE_LIMIT_EXCEEDED",
  "retry_after": 60
}
```
### Response Fields
| Field | Description |
|---|---|
| error | Error type |
| message | Human-readable description |
| error_code | Machine-readable code |
| retry_after | Seconds until the limit resets |
### Retry-After Header
The Retry-After header indicates how many seconds to wait before retrying:
```http
Retry-After: 60
```
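The API's examples show Retry-After as delay-seconds, but per RFC 9110 the header may also carry an HTTP-date, so a defensive client can handle both forms. A minimal sketch (the `parse_retry_after` helper is an assumption, not part of any SDK):

```python
import time
from email.utils import parsedate_to_datetime

def parse_retry_after(value: str) -> float:
    """Return seconds to wait. Retry-After may be delay-seconds or an HTTP-date (RFC 9110)."""
    try:
        return max(0.0, float(value))
    except ValueError:
        # Fall back to the HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        dt = parsedate_to_datetime(value)
        return max(0.0, dt.timestamp() - time.time())

print(parse_retry_after("60"))  # 60.0
```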
## Handling Rate Limits
### 1. Exponential Backoff
Implement exponential backoff with retry logic:
```python
import time
import requests

def api_call_with_retry(url, headers, max_retries=5):
    """Make an API call with exponential backoff retry."""
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response.json()
        elif response.status_code == 429:
            # Rate limit exceeded
            retry_after = int(response.headers.get('Retry-After', 60))
            if attempt < max_retries - 1:
                # Back off exponentially, but never wait less than the server asks
                wait_time = max(retry_after, 2 ** attempt)
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise Exception("Max retries exceeded")
        else:
            response.raise_for_status()
    raise Exception("Max retries exceeded")

# Usage
result = api_call_with_retry(
    'https://api.scriptix.io/api/v3/documents',
    {'Authorization': 'Bearer YOUR_API_KEY'}
)
```
### 2. Check Remaining Requests
Monitor remaining requests before making calls:
```python
import time
import requests

def check_rate_limit(headers):
    """Check the rate limit before making further requests."""
    response = requests.get(
        'https://api.scriptix.io/api/v3/me',
        headers=headers
    )
    remaining = int(response.headers.get('X-RateLimit-Remaining', 0))
    if remaining < 10:
        reset = int(response.headers.get('X-RateLimit-Reset', 0))
        wait_time = reset - time.time()
        print(f"Warning: Only {remaining} requests remaining")
        print(f"Limit resets in {wait_time / 60:.1f} minutes")
    return remaining

# Usage
headers = {'Authorization': 'Bearer YOUR_API_KEY'}
remaining = check_rate_limit(headers)
if remaining > 0:
    # Make API calls
    pass
```
### 3. Request Queuing
Queue requests to stay within limits:
```python
import time
import requests
from queue import Queue

class RateLimitedAPI:
    def __init__(self, api_key, requests_per_minute=60):
        self.api_key = api_key
        self.requests_per_minute = requests_per_minute
        self.min_interval = 60.0 / requests_per_minute
        self.last_request = 0
        self.queue = Queue()

    def make_request(self, url):
        """Add a request to the queue."""
        self.queue.put(url)

    def process_queue(self):
        """Process queued requests, spacing them out to respect the rate limit."""
        while not self.queue.empty():
            url = self.queue.get()
            # Wait if needed to respect the rate limit
            elapsed = time.time() - self.last_request
            if elapsed < self.min_interval:
                time.sleep(self.min_interval - elapsed)
            # Make the request
            response = requests.get(
                url,
                headers={'Authorization': f'Bearer {self.api_key}'}
            )
            self.last_request = time.time()
            yield response.json()

# Usage
api = RateLimitedAPI('YOUR_API_KEY', requests_per_minute=60)

# Queue requests
for i in range(100):
    api.make_request(f'https://api.scriptix.io/api/v3/documents/{i}')

# Process the queue (the generator ends once the queue is drained)
for result in api.process_queue():
    print(result)
```
### 4. Batch Operations
Use batch endpoints to reduce request count:
```python
# ❌ Multiple individual requests (100 requests)
for doc_id in doc_ids:
    doc = get_document(doc_id)

# ✅ Single batch request (1 request)
docs = batch_get_documents(doc_ids)
```
## Endpoint-Specific Limits
Some endpoints have additional limits:
### Custom Models
| Endpoint | Limit | Window |
|---|---|---|
| Create model | 10 | 1 hour |
| Start training | 5 | 1 hour |
| Upload data | 50 | 1 hour |
| Status checks | 100 | 1 minute |
### Batch Transcription
| Endpoint | Limit | Window |
|---|---|---|
| Upload file | 50 | 1 hour |
| TUS upload | 20 | 1 hour |
| Status checks | 200 | 1 minute |
| Result retrieval | 100 | 1 minute |
### Real-time API
| Limit Type | Value |
|---|---|
| Concurrent sessions | Plan-dependent |
| Audio chunks/second | 20 |
| Session duration | 4 hours max |
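Because these windows differ per endpoint, a client can track each endpoint's budget separately. Below is a minimal client-side sliding-window sketch; the endpoint keys and the `EndpointLimiter` class are illustrative assumptions, not part of the API:

```python
import time
from collections import defaultdict, deque

# (max requests, window in seconds) taken from the tables above; keys are illustrative
ENDPOINT_LIMITS = {
    "model.create": (10, 3600),
    "batch.upload": (50, 3600),
    "batch.status": (200, 60),
}

class EndpointLimiter:
    """Per-endpoint sliding-window counter; a client-side sketch, not server behavior."""
    def __init__(self, limits):
        self.limits = limits
        self.history = defaultdict(deque)

    def allow(self, endpoint, now=None):
        now = time.time() if now is None else now
        max_requests, window = self.limits[endpoint]
        h = self.history[endpoint]
        while h and h[0] <= now - window:  # drop timestamps outside the window
            h.popleft()
        if len(h) < max_requests:
            h.append(now)
            return True
        return False
```

A call site would check `allow("model.create")` before each request and sleep or queue when it returns False.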
## Avoiding Rate Limits
### 1. Use Webhooks
Instead of polling, use webhooks for notifications:
```python
# ❌ Polling (many requests)
while True:
    status = check_status(job_id)
    if status == 'completed':
        break
    time.sleep(5)  # Wastes rate limit

# ✅ Webhooks (zero polling requests)
configure_webhook('https://yourapp.com/webhook')
# Receive a notification when the job completes
```
See Batch API - Webhooks.
### 2. Cache Responses
Cache responses that don't change frequently:
```python
import json
import redis
from datetime import timedelta

cache = redis.Redis()

def get_document_cached(doc_id):
    """Get a document, with caching."""
    cache_key = f"doc:{doc_id}"
    # Check the cache first
    cached = cache.get(cache_key)
    if cached:
        return json.loads(cached)
    # Fetch from the API (uses rate limit)
    doc = api_get_document(doc_id)
    # Cache for 1 hour
    cache.setex(cache_key, timedelta(hours=1), json.dumps(doc))
    return doc
```
### 3. Implement Request Deduplication
Avoid duplicate requests:
```python
from functools import lru_cache
from datetime import datetime

@lru_cache(maxsize=1000)
def get_document_dedupe(doc_id, cache_time):
    """Deduplicate requests made within the same minute.

    cache_time is not used in the body; it only varies the cache key
    so entries expire when the minute changes.
    """
    return api_get_document(doc_id)

# Usage - repeated calls within the same minute hit the cache
current_minute = datetime.now().strftime('%Y-%m-%d %H:%M')
doc = get_document_dedupe(123, current_minute)
```
### 4. Optimize Pagination
Use larger page sizes to reduce request count:
```python
# ❌ Small pages (many requests)
for page in range(1, 101):  # 100 requests
    docs = get_documents(page=page, per_page=10)

# ✅ Large pages (fewer requests)
for page in range(1, 11):  # 10 requests
    docs = get_documents(page=page, per_page=100)
```
## Monitoring Rate Limit Usage
### Track Usage Over Time
```python
import time
import requests
from collections import deque

class RateLimitMonitor:
    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.requests = deque()

    def record_request(self):
        """Record a request timestamp."""
        now = time.time()
        self.requests.append(now)
        # Remove old requests outside the window
        cutoff = now - self.window
        while self.requests and self.requests[0] < cutoff:
            self.requests.popleft()

    def get_current_rate(self):
        """Get the number of requests in the current window."""
        cutoff = time.time() - self.window
        return sum(1 for t in self.requests if t > cutoff)

    def can_make_request(self, max_per_minute):
        """Check whether a request can be made without exceeding the limit."""
        return self.get_current_rate() < max_per_minute

# Usage
monitor = RateLimitMonitor(window_seconds=60)

def make_api_call(url):
    while not monitor.can_make_request(max_per_minute=60):
        print("Rate limit would be exceeded, waiting...")
        time.sleep(1)
    response = requests.get(url)  # add headers/params as needed
    monitor.record_request()
    return response
```
### Log Rate Limit Info
```python
import logging

def log_rate_limit(response):
    """Log rate limit information from a response."""
    limit = response.headers.get('X-RateLimit-Limit')
    remaining = response.headers.get('X-RateLimit-Remaining')
    if limit and remaining:
        remaining_int = int(remaining)
        limit_int = int(limit)
        usage_pct = ((limit_int - remaining_int) / limit_int) * 100
        logging.info(
            f"Rate limit: {remaining}/{limit} remaining ({usage_pct:.1f}% used)"
        )
        if remaining_int < limit_int * 0.1:  # Less than 10% remaining
            logging.warning("Rate limit almost exhausted!")
```
## Increasing Rate Limits
### Upgrade Plan
Higher plans have increased rate limits:
- Silver: 60 req/min (2x Bronze)
- Gold: 120 req/min (4x Bronze)
- Enterprise: Custom limits
### Contact Sales
For Enterprise custom limits:
- Email: sales@scriptix.io
- Include: expected usage patterns and use case
## Error Handling Example
Complete error handling with rate limiting:
```python
import time
import requests
from requests.exceptions import RequestException

def robust_api_call(url, headers, max_retries=5):
    """Make an API call with comprehensive error handling."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, headers=headers, timeout=30)
            # Success
            if response.status_code == 200:
                return response.json()
            # Rate limit
            elif response.status_code == 429:
                retry_after = int(response.headers.get('Retry-After', 60))
                wait_time = min(retry_after, 300)  # Wait at most 5 minutes
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
                continue
            # Other errors
            else:
                response.raise_for_status()
        except requests.exceptions.Timeout:
            print(f"Timeout on attempt {attempt + 1}")
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            else:
                raise
        except RequestException as e:
            print(f"Request error: {e}")
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            else:
                raise
    raise Exception(f"Failed after {max_retries} attempts")
```
## Best Practices Summary
- ✅ Respect the Retry-After header
- ✅ Implement exponential backoff
- ✅ Use webhooks instead of polling
- ✅ Cache responses when possible
- ✅ Monitor rate limit headers
- ✅ Use batch operations
- ✅ Implement request queuing
- ✅ Deduplicate requests
- ✅ Log rate limit warnings
- ✅ Upgrade plan if needed
## Related Documentation
- Authentication - API key management
- Error Codes - Error handling
- Batch API - Webhooks - Webhook setup
Note: Enterprise customers can request custom rate limits. Contact sales@scriptix.io.