Intelligent Rate Limiting

Go Beyond Simple IP Blocking with Intelligent Rate Limiting

Traditional IP-based rate limiting often falls short for LLM applications. Shared IPs, dynamic addresses, and authenticated user sessions mean that you may unfairly block legitimate users or fail to stop targeted abuse from a single authenticated entity. Prompt Shield provides intelligent rate limiting that takes the context of each request into account.

Key Capabilities

Identifying the Requester

Flexible Identification: Configure Prompt Shield to identify unique requesters using:
- Authenticated User IDs (e.g., from JWT sub claims)
- API Keys (passed in headers or request bodies)
- Session IDs
- Custom Headers or Request Attributes
- IP Address (as a fallback or specific rule)
Targeted Rules: Apply limits accurately to the entity making the request, not just their network address.
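
As a rough illustration of the fallback order described above, the sketch below picks the most specific identifier present on a request. It is not Prompt Shield's configuration API: the function names, the lowercase header names (`authorization`, `x-api-key`, `x-session-id`), and the priority order are all assumptions made for the example. The JWT payload is decoded without signature verification purely to keep the sketch short; a real deployment would verify the token before trusting its `sub` claim.

```python
import base64
import json
from typing import Mapping, Optional, Tuple


def _jwt_sub(token: str) -> Optional[str]:
    """Read the `sub` claim from a JWT payload WITHOUT verifying the signature.

    Illustration only: verify the token before trusting any claim in production.
    """
    try:
        payload_b64 = token.split(".")[1]
        # JWTs strip base64 padding, so restore it before decoding.
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except (IndexError, ValueError):
        return None
    sub = claims.get("sub") if isinstance(claims, dict) else None
    return str(sub) if sub is not None else None


def identify_requester(headers: Mapping[str, str], remote_ip: str) -> Tuple[str, str]:
    """Return (kind, value) for the most specific identifier available.

    Assumes header names have already been lowercased (hypothetical convention).
    """
    auth = headers.get("authorization", "")
    if auth.lower().startswith("bearer "):
        sub = _jwt_sub(auth[len("bearer "):])
        if sub:
            return ("user", sub)
    api_key = headers.get("x-api-key")
    if api_key:
        return ("api_key", api_key)
    session_id = headers.get("x-session-id")
    if session_id:
        return ("session", session_id)
    # Fall back to the network address only when nothing better is present.
    return ("ip", remote_ip)
```

Returning the identifier kind alongside its value keeps keys from different namespaces apart, so an API key and a session ID that share the same string never share a counter.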

Setting Granular Limits

User-Specific Quotas: Define precise request limits (per minute, hour, day, month) for individual users or API keys.
Tiered Application: Combine with Tiered Usage & Cost Limits to apply different rate limits based on the identified user’s subscription plan or role.
Fairness: Ensure high-volume legitimate users aren’t penalized due to noisy neighbours on the same IP.
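
To make per-identifier quotas concrete, here is a minimal fixed-window counter keyed by identifier, with the limit looked up from a tier table. This is a sketch under stated assumptions rather than Prompt Shield's implementation: the class name, the tier names, and the numbers in `TIER_LIMITS_PER_MINUTE` are hypothetical, and the source does not say which algorithm (fixed window, sliding window, token bucket) the product uses.

```python
import time
from typing import Callable, Dict, Tuple

# Hypothetical tiers: requests allowed per 60-second window.
TIER_LIMITS_PER_MINUTE: Dict[str, int] = {"free": 20, "pro": 200}


class FixedWindowLimiter:
    """Counts requests per identifier within fixed, aligned time windows."""

    def __init__(self, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        # identifier -> (window index, requests counted in that window)
        self._windows: Dict[str, Tuple[int, int]] = {}

    def allow(self, identifier: str, limit: int) -> bool:
        """Record one request and return True if it fits within `limit`."""
        window_index = int(self._clock() // self._window_seconds)
        seen_index, count = self._windows.get(identifier, (window_index, 0))
        if seen_index != window_index:
            count = 0  # a new window has started, so the counter resets
        if count >= limit:
            self._windows[identifier] = (window_index, count)
            return False
        self._windows[identifier] = (window_index, count + 1)
        return True


# Example use: two users behind the same IP are counted separately.
limiter = FixedWindowLimiter()
limiter.allow("user:alice", TIER_LIMITS_PER_MINUTE["pro"])
limiter.allow("user:bob", TIER_LIMITS_PER_MINUTE["free"])
```

Because counters are keyed by identifier rather than by address, one user exhausting a quota has no effect on anyone else behind the same IP. A fixed window permits short bursts at window boundaries; a sliding window or token bucket smooths those out at the cost of a little more state.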

Concurrency Control

Limit Simultaneous Requests: Cap simultaneous connections or function invocations so that a single user or service cannot overwhelm your application resources or exceed downstream LLM API concurrency limits.
Maintain Stability: Ensure your application remains responsive under load.
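
One way to picture a per-identifier concurrency cap is a counter of in-flight requests that is incremented on entry and decremented when the request finishes. The sketch below assumes a single-process, thread-based server; `ConcurrencyGate` and `ConcurrencyLimitExceeded` are hypothetical names, not part of Prompt Shield.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator


class ConcurrencyLimitExceeded(Exception):
    """Raised when an identifier already holds the maximum number of slots."""


class ConcurrencyGate:
    """Caps the number of simultaneous in-flight requests per identifier."""

    def __init__(self, max_in_flight: int) -> None:
        self._max_in_flight = max_in_flight
        self._in_flight: Dict[str, int] = {}
        self._lock = threading.Lock()

    @contextmanager
    def slot(self, identifier: str) -> Iterator[None]:
        # Reserve a slot, or reject immediately instead of queueing.
        with self._lock:
            current = self._in_flight.get(identifier, 0)
            if current >= self._max_in_flight:
                raise ConcurrencyLimitExceeded(identifier)
            self._in_flight[identifier] = current + 1
        try:
            yield
        finally:
            # Always release the slot, even if the wrapped call raised.
            with self._lock:
                remaining = self._in_flight[identifier] - 1
                if remaining:
                    self._in_flight[identifier] = remaining
                else:
                    del self._in_flight[identifier]
```

A handler would wrap the expensive call in `with gate.slot(identifier):` and translate `ConcurrencyLimitExceeded` into a rejection response. Rejecting immediately rather than queueing is a design choice made for this sketch; a multi-process deployment would need shared state instead of an in-memory dictionary.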

Basic Input Analysis

Pre-emptive Blocking: Configure basic checks on incoming requests before executing expensive logic or LLM calls.
Block Obvious Misuse: Reject requests with excessively long prompts or unusually large payloads designed solely to maximize token usage or cause errors.
Cost Savings: Prevent wasteful requests from incurring unnecessary processing or LLM API costs.
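
The checks described above can be as simple as two length comparisons performed before any model call. The thresholds and the function name in this sketch are hypothetical and chosen only for illustration; suitable values depend on the model's context window and your own cost tolerance.

```python
from typing import Optional

# Hypothetical thresholds; tune them to your model and budget.
MAX_BODY_BYTES = 64 * 1024
MAX_PROMPT_CHARS = 8_000


def precheck(raw_body: bytes, prompt: str) -> Optional[str]:
    """Return a rejection reason, or None if the request passes the basic checks."""
    # Check the raw payload first: it is the cheapest test and needs no parsing.
    if len(raw_body) > MAX_BODY_BYTES:
        return "payload_too_large"
    if len(prompt) > MAX_PROMPT_CHARS:
        return "prompt_too_long"
    return None
```

Character count is only a rough proxy for token count. It is used here because it costs almost nothing to compute, which is the point of a pre-emptive check.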

How It Works

Request Intercepted: Prompt Shield receives the incoming request.
Identifier Extraction: Extracts the configured identifier (User ID, API Key, IP Address, etc.).
Limit Check: Checks the request against the specific rate limits (e.g., RPM, monthly quota) defined for that identifier or its associated tier.
Concurrency Check: Verifies whether accepting the request would exceed configured concurrency limits.
Input Analysis (Optional): Performs basic checks on request size/length if configured.
Enforcement: Allows the request to proceed only if all applicable checks pass; otherwise, blocks it.
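
The sequence above amounts to running an ordered list of checks and stopping at the first failure. The sketch below shows only that control flow; `Decision`, `evaluate`, and the representation of a request as a plain dictionary are assumptions for illustration, and each check is a stand-in for the corresponding step.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A check inspects the request and returns a rejection reason, or None to pass.
Check = Callable[[dict], Optional[str]]


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: Optional[str] = None


def evaluate(request: dict, checks: Sequence[Check]) -> Decision:
    """Run checks in order and block on the first one that returns a reason."""
    for check in checks:
        reason = check(request)
        if reason is not None:
            # Short-circuit: later (possibly more expensive) checks never run.
            return Decision(allowed=False, reason=reason)
    return Decision(allowed=True)
```

Passing the limit check, concurrency check, and optional input analysis as `checks`, in that order, reproduces the flow listed above. A blocked request would typically be answered with an HTTP 429 or 413 status, although the source does not specify which responses Prompt Shield returns.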

Benefits

Targeted Protection: Accurately limit specific users or keys, not just IPs.
Improved Fairness: Avoid blocking legitimate users on shared networks.
Enhanced Stability: Prevent resource exhaustion with concurrency limits.
Early Abuse Detection: Block simple, wasteful attacks before they cost you money.
Reduced False Positives: More accurate identification leads to fewer wrongly blocked requests.

Leverage Prompt Shield’s Intelligent Rate Limiting to secure your LLM applications effectively and fairly.