Python SDK

The Python SDK wraps requests and httpx sessions to passively capture outgoing API calls. Requires Python 3.9+.

Installation

pip install quotawatch

Quick example

main.pypython
from quotawatch import QuotaWatch
from quotawatch.types import QuotaWatchConfig, ApiConfig, ApiLimits

# Initialize once at startup
QuotaWatch.init(QuotaWatchConfig(
    api_key="qw_live_...",
    environment="production",
    apis=[
        ApiConfig(
            name="OpenAI",
            base_url="https://api.openai.com",
            limits=ApiLimits(
                requests_per_minute=60,
                requests_per_day=10_000,
            ),
        ),
    ],
))

# Your existing code — unchanged
import requests
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
    json={"model": "gpt-4o", "messages": []},
)

How it works

On init(), the SDK wraps the requests and httpx send methods. A background daemon thread flushes buffered events to the ingest API every 5 seconds.

💡
The background thread is a daemon thread — it won't prevent your application from exiting. Call QuotaWatch.get_instance().flush() before shutdown for guaranteed delivery.

Supported HTTP clients

ClientSupportedNotes
requestsAuto-wrapped on init
httpx (sync)Auto-wrapped on init
httpx (async)Auto-wrapped on init
aiohttp🔜v1.1 planned
urllibNot supported

What gets captured

{
  "api": "OpenAI",
  "endpoint": "/v1/chat/completions",
  "method": "POST",
  "status": 200,
  "latencyMs": 847,
  "timestamp": "2026-05-07T12:00:00.000Z",
  "environment": "production",
  "hit429": false,
  "rateLimitHeaders": {
    "x-ratelimit-remaining-requests": "43",
    "x-ratelimit-reset-requests": "2026-05-07T12:01:00Z"
  }
}
⚠️
No request or response bodies are ever captured. Only metadata: URL path, method, status code, latency, and rate limit response headers.

API reference

QuotaWatch.init(config)

Class method. Initializes the SDK and starts the background flush thread. Thread-safe — safe to call from multiple threads, only initializes once.

ParameterTypeDefaultDescription
api_keystrRequired. Your project API key.
apislist[ApiConfig]Required. APIs to monitor.
environmentstr'production'Tag events by environment.
ingest_urlstr'https://ingest.quotawatch.app'Where to send events. Set to your ingest service URL (e.g. http://localhost:3001 for local dev).
buffer_sizeint500Max events to buffer before dropping oldest.
flush_interval_secondsfloat5.0How often to flush buffered events.

Manual recording (advanced)

The SDK auto-intercepts requests and httpx — you don't need to call record() for those. Use it only when calling APIs through unpatched clients (e.g. aiohttp, vendor SDKs with bundled HTTP transports).

from quotawatch import QuotaWatch
import time
from datetime import datetime, timezone

# Only needed for unpatched clients (aiohttp, vendor SDKs, etc.)
# Do NOT use this if you're already using requests or httpx — they're auto-intercepted.

qw = QuotaWatch.get_instance()
if qw:
    qw.record({
        "api": "MyAPI",
        "endpoint": "/v1/resource",
        "method": "POST",
        "status": 200,           # int, not statusCode
        "latencyMs": 142,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.") +                      f"{datetime.now(timezone.utc).microsecond // 1000:03d}Z",
        "environment": "production",
        "hit429": False,
        "rateLimitHeaders": {},
    })

Graceful shutdown

import atexit
from quotawatch import QuotaWatch

# Register flush on exit
atexit.register(lambda: QuotaWatch.get_instance() and QuotaWatch.get_instance().flush())

# Or with FastAPI lifespan
from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    yield
    instance = QuotaWatch.get_instance()
    if instance:
        instance.flush()

app = FastAPI(lifespan=lifespan)

Django integration

settings.pypython
# At the bottom of settings.py
from quotawatch import QuotaWatch
from quotawatch.types import QuotaWatchConfig, ApiConfig, ApiLimits

QuotaWatch.init(QuotaWatchConfig(
    api_key=env("QUOTAWATCH_API_KEY"),
    environment=env("DJANGO_ENV", default="production"),
    apis=[
        ApiConfig(
            name="OpenAI",
            base_url="https://api.openai.com",
            limits=ApiLimits(requests_per_day=10_000),
        ),
    ],
))