Fire a small, capped, paced burst of requests at any URL or API endpoint and measure how the endpoint handles agent-style traffic. Reports 429 hygiene: Retry-After presence, X-RateLimit-* coverage, and structured error body shape. Helps you find the rate-limit retry-storm failure mode before AI agents do.
Test endpoints you own or have permission to probe. The burst is capped at 10 requests, paced 250 to 1000 ms apart, and each request times out after 5 s. This is a measurement, not a stress test.
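A minimal Python sketch of the capped, paced burst described above. It is not this tool's actual implementation: the names paced_burst and http_status are hypothetical, and only the limits (10 requests, 250 to 1000 ms pacing, 5 s timeout) come from the text. The sender is passed in as a callable so the pacing logic can be exercised without touching the network.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List

MAX_REQUESTS = 10    # hard cap stated in the text
MIN_DELAY_S = 0.25   # 250 ms
MAX_DELAY_S = 1.0    # 1000 ms
TIMEOUT_S = 5.0      # per-request timeout


def http_status(url: str, timeout: float) -> int:
    """Hypothetical sender: GET the URL and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still what we want.
        return err.code


def paced_burst(send: Callable[[float], int], count: int, delay_s: float,
                sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Call send(timeout) at most MAX_REQUESTS times, pausing between calls."""
    count = max(1, min(count, MAX_REQUESTS))               # clamp to the cap
    delay_s = max(MIN_DELAY_S, min(delay_s, MAX_DELAY_S))  # clamp the pacing
    statuses = []
    for i in range(count):
        statuses.append(send(TIMEOUT_S))
        if i < count - 1:   # no pause after the final request
            sleep(delay_s)
    return statuses
```

A call such as paced_burst(lambda t: http_status("https://api.example.com/ping", t), count=10, delay_s=0.5) would probe a placeholder URL. Clamping out-of-range arguments, rather than rejecting them, is a design choice of this sketch.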
Documented 2026 failure pattern: agents repeatedly hammer rate-limited endpoints, miss Retry-After guidance, fail to fall back to a backup model, and burn auth cooldowns in 30-60 minute loops. The modern fix combines token-based rate limiting, structured 429 bodies, and clear retry guidance.
Retry-After is defined in RFC 9110 §10.2.3. Its value is either an HTTP-date or a number of seconds from now. When a 429 fires without it, agents have to guess the wait time, usually with exponential backoff that may bear little relation to your actual reset window. An explicit 30-second Retry-After is dramatically better than agent-side guessing.
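To illustrate the two Retry-After forms, here is a minimal Python sketch of how a client might turn the header into a wait time. The function name retry_after_seconds is hypothetical, and returning None for a missing or unparseable value is an assumption of this sketch, not behavior required by the RFC.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: Optional[str],
                        now: Optional[datetime] = None) -> Optional[float]:
    """Return the wait in seconds, or None if the header is absent or unusable."""
    if value is None:
        return None
    value = value.strip()
    # Form 1: delay-seconds, a non-negative decimal integer.
    if value.isascii() and value.isdigit():
        return float(int(value))
    # Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # neither form: the caller has to fall back to guessing
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", not a negative wait.
    return max(0.0, (when - now).total_seconds())
```

A None result is the case the paragraph above warns about: the agent has no guidance and must choose its own backoff.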
Conventional (no published IETF standard yet): X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset. Some hosts use the unprefixed RateLimit fields (no X-) from the IETF Internet-Draft. Either flavor lets a well-behaved agent self-throttle before it hits the 429.
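A minimal Python sketch of how an agent might read either header flavor and decide to pause before a 429. The names read_rate_limit and should_pause are hypothetical. The sketch assumes the three-field unprefixed form (RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset) used by earlier revisions of the Internet-Draft, and plain integer values; hosts that send other shapes would need different parsing.

```python
from typing import Dict, Mapping, Optional

FIELDS = ("limit", "remaining", "reset")


def _to_int(raw: Optional[str]) -> Optional[int]:
    """Parse a plain integer header value; anything else becomes None."""
    try:
        return int(raw.strip()) if raw else None
    except ValueError:
        return None


def read_rate_limit(headers: Mapping[str, str]) -> Dict[str, Optional[int]]:
    """Collect limit/remaining/reset from X-RateLimit-* or RateLimit-* headers."""
    lowered = {k.lower(): v for k, v in headers.items()}  # names are case-insensitive
    info = {}
    for field in FIELDS:
        raw = lowered.get(f"x-ratelimit-{field}") or lowered.get(f"ratelimit-{field}")
        info[field] = _to_int(raw)
    return info


def should_pause(info: Dict[str, Optional[int]]) -> bool:
    """Self-throttle when the server says no requests remain in this window."""
    return info["remaining"] is not None and info["remaining"] <= 0
```

Counting how many of the three fields come back non-None gives a rough coverage figure for an endpoint.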
When a 4xx/5xx fires, the response body should be parseable JSON with a recognizable error structure (RFC 9457 Problem Details, which obsoletes RFC 7807, or a vendor convention like {error: {code, message, retry_after}}). HTML error pages force the agent to scrape, which is where most retry-storm meltdowns start.
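A minimal Python sketch of how an error body might be classified into the shapes described above. The function name classify_error_body and its four labels are hypothetical, and the detection rules are heuristics chosen for this sketch (for example, treating any two standard Problem Details members as a match), not part of either RFC.

```python
import json
from typing import Optional

# Standard members of a Problem Details object.
PROBLEM_MEMBERS = {"type", "title", "status", "detail", "instance"}


def classify_error_body(content_type: Optional[str], body: str) -> str:
    """Label an error body: problem-details, vendor-error, unrecognized-json, not-json."""
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        return "not-json"  # e.g. an HTML error page the agent would have to scrape
    if not isinstance(data, dict):
        return "unrecognized-json"
    # Drop parameters such as "; charset=utf-8" before comparing the media type.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type == "application/problem+json":
        return "problem-details"
    if len(PROBLEM_MEMBERS.intersection(data)) >= 2:  # heuristic, see lead-in
        return "problem-details"
    error = data.get("error")
    if isinstance(error, dict) and ("code" in error or "message" in error):
        return "vendor-error"
    return "unrecognized-json"
```

Only the first two labels give an agent a structured reason and, possibly, a retry hint without guessing.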
This tool fires up to 10 requests at the URL you specify, paced at the delay you set. It is intended for probing endpoints you own or have permission to test. Hard caps prevent it from being used as a stress tester. If you operate the target, run this to characterize your 429 behavior; if you do not, ask first.