Axon AI enforces rate limits to ensure fair usage and maintain system stability. These limits are applied per API key and are measured across a one-minute window.

Rate Limit Tiers

Different service tiers offer varying rate limits to accommodate different usage patterns and requirements. Below are the rate limits for each tier:
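Tier    | Requests per Minute | Input Tokens per Minute | Output Tokens per Minute | Credits to Purchase | Max Monthly Spend
--------|---------------------|-------------------------|--------------------------|---------------------|------------------
Default | 20                  | 30,000                  | 8,000                    | $5                  | $100
Tier 2  | 200                 | 300,000                 | 80,000                   | $40                 | $500
Tier 3  | 400                 | 600,000                 | 160,000                  | $200                | $1,000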

Default Tier

The Default tier provides basic rate limits suitable for most development and testing scenarios: 20 requests, 30,000 input tokens, and 8,000 output tokens per minute. This tier is unlocked by purchasing $5 in credits and has a maximum monthly spend of $100.

Tier 2

Tier 2 offers 10 times the Default tier limits (200 requests, 300,000 input tokens, and 80,000 output tokens per minute), which suits production applications with moderate usage. This tier is unlocked by purchasing $40 in credits and has a maximum monthly spend of $500.

Tier 3

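Tier 3 provides the highest limits, at 20 times the Default tier (400 requests, 600,000 input tokens, and 160,000 output tokens per minute), and is designed for high-volume production workloads. This tier is unlocked by purchasing $200 in credits and has a maximum monthly spend of $1,000.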

Understanding Rate Limits

  • Requests per Minute: The maximum number of API calls allowed per minute
  • Input Tokens per Minute: The maximum number of tokens that can be sent as input per minute
  • Output Tokens per Minute: The maximum number of tokens that can be generated as output per minute
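
Because the three limits apply independently, a client may want to track all of them before sending a request. The following is a minimal sketch, not part of the Axon AI API: `MinuteBudget` is a hypothetical helper, and it assumes a sliding 60-second window, which this page does not specify.

```python
import time
from collections import deque


class MinuteBudget:
    """Client-side tracker for the three per-minute limits (hypothetical helper)."""

    def __init__(self, max_requests, max_input_tokens, max_output_tokens, window=60.0):
        self.limits = (max_requests, max_input_tokens, max_output_tokens)
        self.window = window  # assumed sliding window length, in seconds
        self.events = deque()  # entries are (timestamp, input_tokens, output_tokens)

    def _prune(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()

    def can_send(self, input_tokens, output_tokens, now=None):
        """Return True if one more request with these token counts fits all three limits."""
        now = time.monotonic() if now is None else now
        self._prune(now)
        used_requests = len(self.events)
        used_input = sum(e[1] for e in self.events)
        used_output = sum(e[2] for e in self.events)
        return (
            used_requests + 1 <= self.limits[0]
            and used_input + input_tokens <= self.limits[1]
            and used_output + output_tokens <= self.limits[2]
        )

    def record(self, input_tokens, output_tokens, now=None):
        """Record a request that was actually sent."""
        now = time.monotonic() if now is None else now
        self.events.append((now, input_tokens, output_tokens))


# Default tier limits as listed on this page.
budget = MinuteBudget(max_requests=20, max_input_tokens=30_000, max_output_tokens=8_000)
```

A caller would check `budget.can_send(...)` before each request and call `budget.record(...)` afterwards. Output token counts are only known once the response arrives, so the value passed to `can_send` is an estimate.
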

When any of these limits is exceeded, the API returns a 429 (Too Many Requests) status code. Applications should retry with exponential backoff, waiting progressively longer between attempts, so that rate limit errors are handled gracefully.
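
One way to retry with exponential backoff is sketched below. It is an illustration under stated assumptions rather than an official client: `call_with_backoff` and the `send` callable are hypothetical names, `send` is assumed to perform one API request and return a `(status_code, body)` pair, and the delay values are arbitrary defaults.

```python
import random
import time


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or the retries run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break
        # The delay ceiling doubles each attempt: base, 2*base, 4*base, ... up to max_delay.
        delay = min(max_delay, base_delay * (2 ** attempt))
        # Random jitter keeps concurrent clients from retrying in lockstep.
        sleep(random.uniform(0, delay))
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

The `sleep` parameter exists only so the wait can be replaced in tests. In normal use the default `time.sleep` applies, and any non-429 response, including other error codes, is returned to the caller unchanged.
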