Claude API Tier Limits

Anthropic automatically upgrades your API account through usage tiers as your cumulative credit purchases increase. Each tier unlocks higher rate limits — more requests per minute, more input tokens per minute, and more output tokens per minute.

Rate limits are enforced at the organisation level and measured separately for each model. You can monitor your current tier and live usage in Anthropic Console → Limits.

Spend tiers

Advancing to the next tier requires cumulative credit purchases that meet that tier's threshold. The table summarises who each tier is intended for and how you qualify.

Tier	Target audience	How to qualify
Start	Evaluation and small pilots	Automatic upon initial credit purchase
Build	Growing teams with regular PR activity	Based on consistent usage and payments
Scale	Large organisations with high PR velocity	High sustained usage and spend
Custom	Enterprise scale	Contact Anthropic sales

The credit purchase figure is the cumulative amount deposited into your Anthropic account (excluding tax), not your total spend. You advance as soon as you cross the threshold; there is no waiting period.

Rate limits by tier

Rate limits are measured in three dimensions:

RPM — requests per minute
ITPM — uncached input tokens per minute (cached tokens do not count for most models)
OTPM — output tokens per minute

Start Tier

Entry-level limits for evaluation.

Model	RPM	ITPM	OTPM
Claude Sonnet 4.x	50	40,000	10,000
Claude Haiku 4.5	50	50,000	10,000

Build Tier

Suitable for small to medium teams with moderate PR volume.

Model	RPM	ITPM	OTPM
Claude Sonnet 4.x	2,000	1,000,000	200,000
Claude Haiku 4.5	2,000	2,000,000	400,000

Scale Tier

Suitable for large organisations with high PR velocity.

Model	RPM	ITPM	OTPM
Claude Sonnet 4.x	4,000	4,000,000	800,000
Claude Haiku 4.5	4,000	8,000,000	1,600,000

The Claude Sonnet 4.x limits apply to combined traffic across all Sonnet 4.x models. Exact limits can vary with your usage patterns; Anthropic Console → Limits shows the values currently applied to your account.
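The tier tables above can be encoded as data to check which tier covers a given peak load. The following is a minimal sketch, not part of Garth or the Anthropic SDK: `TIER_LIMITS` simply copies the Claude Sonnet 4.x rows from the tables, and `smallest_tier_for` is a hypothetical helper name.

```python
from typing import Optional

# Claude Sonnet 4.x limits copied from the tables above: (RPM, ITPM, OTPM).
TIER_LIMITS = {
    "Start": (50, 40_000, 10_000),
    "Build": (2_000, 1_000_000, 200_000),
    "Scale": (4_000, 4_000_000, 800_000),
}


def smallest_tier_for(rpm: int, itpm: int, otpm: int) -> Optional[str]:
    """Return the first tier whose limits cover the given peak load, or None."""
    for tier, (max_rpm, max_itpm, max_otpm) in TIER_LIMITS.items():
        if rpm <= max_rpm and itpm <= max_itpm and otpm <= max_otpm:
            return tier
    return None  # Exceeds Scale: the Custom tier would apply.


print(smallest_tier_for(rpm=30, itpm=250_000, otpm=40_000))  # Build
```

All three dimensions must fit at once, so a load that is light on requests but heavy on input tokens can still require a higher tier.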

How ITPM works (cache-aware)

A key advantage of the Claude API is that cached input tokens do not count towards your ITPM rate limit on current models. Only uncached input tokens and tokens being written to cache consume your ITPM quota.

Token type	Counts towards ITPM?
Uncached input tokens	Yes
Cache creation tokens	Yes
Cache read tokens	No (current models)
Output tokens	Counted under OTPM separately

This means effective throughput can be significantly higher than the raw ITPM number suggests. If 80% of your input tokens are served as cache reads, only the remaining 20% count towards ITPM, so you can process up to 5× as many total input tokens per minute as your ITPM limit.
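The arithmetic behind that figure can be sketched as follows. If a fraction h of input tokens are cache reads, only the remaining (1 − h) is counted, so the total input that fits under the limit is ITPM / (1 − h). The function name `effective_input_tpm` is illustrative, and the sketch assumes every token that is not a cache read (uncached or cache creation) counts in full.

```python
def effective_input_tpm(itpm_limit: int, cache_read_fraction: float) -> float:
    """Total input tokens per minute that fit under an ITPM limit.

    cache_read_fraction is the share of input tokens served as cache reads,
    which do not count towards ITPM on current models. The remaining share
    (uncached and cache creation tokens) is counted in full.
    """
    if not 0.0 <= cache_read_fraction < 1.0:
        raise ValueError("cache_read_fraction must be in [0, 1)")
    return itpm_limit / (1.0 - cache_read_fraction)


# Build tier, Claude Sonnet 4.x: 1,000,000 ITPM with 80% cache reads.
print(round(effective_input_tpm(1_000_000, 0.8)))  # 5000000
```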

Garth uses prompt caching for system instructions and repository context. Teams with many similar PRs (e.g. same monorepo) benefit automatically from high cache hit rates, increasing effective throughput without needing a higher tier.

Choosing the right tier for your team

Start Tier — Evaluation or very small teams

Fewer than 10 developers, infrequent PRs, or a BYOK trial before committing. Active immediately upon funding your account.

Build Tier — Small to medium teams

10–50 developers with regular PR activity. The higher RPM and ITPM limits handle dozens of concurrent reviews comfortably.

Scale Tier — Large engineering orgs

50+ developers, monorepos, or CI pipelines generating a high volume of short-lived PRs.

Token consumption per review

Each Garth review consumes input tokens (your diff and context) and output tokens (the review comments). Estimates below are for Claude Sonnet 4.x.

Review size	Typical input tokens	Typical output tokens
Small PR (1–5 files, < 200 lines)	3,000 – 8,000	500 – 1,500
Medium PR (5–20 files, 200–800 lines)	8,000 – 25,000	1,500 – 4,000
Large PR (20+ files, 800+ lines)	25,000 – 80,000	4,000 – 10,000

Keeping PRs focused (under 400 lines changed) reduces token usage per review and generally improves comment accuracy.
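To relate these per-review estimates to a tier's limits, a rough capacity estimate can be sketched as below. It assumes one API request per review and no prompt caching, neither of which is confirmed for Garth, and `max_reviews_per_minute` is a hypothetical helper. The example uses the upper end of the medium PR range.

```python
def max_reviews_per_minute(
    rpm: int,
    itpm: int,
    otpm: int,
    input_tokens: int,
    output_tokens: int,
    requests_per_review: int = 1,  # Assumption: one request per review.
) -> int:
    """Reviews per minute allowed by the tightest of the three limits."""
    return min(
        rpm // requests_per_review,
        itpm // input_tokens,
        otpm // output_tokens,
    )


# Medium PR, upper estimate: 25,000 input and 4,000 output tokens.
print(max_reviews_per_minute(50, 40_000, 10_000, 25_000, 4_000))  # 1
print(max_reviews_per_minute(2_000, 1_000_000, 200_000, 25_000, 4_000))  # 40
```

In both examples ITPM is the binding limit, which is why cache reads (which are not counted) raise practical capacity.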

Rate limit errors

If your account hits a rate limit, Garth retries automatically with exponential backoff. Sustained rate limiting (e.g. many large PRs merging simultaneously) may delay review posting. What you will see:

A delayed review comment once the retry succeeds
A dashboard notification if retries are exhausted and the review is dropped

How to resolve:

Advance to the next tier by purchasing additional credits in the Anthropic Console
Enable or increase prompt caching to reduce ITPM consumption
Contact support if you need help sizing the right tier
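For readers calling the API from their own scripts, retrying with exponential backoff can be sketched as below. This illustrates the general technique only and is not Garth's actual retry logic. `RateLimited` is a hypothetical stand-in for whatever error your HTTP client raises on a 429 response, and the retry count and delays are arbitrary defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


def call_with_backoff(
    call: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call`, retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # Retries exhausted: surface the error to the caller.
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads out retries from concurrent clients.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the function can be tested without waiting. If the error response carries a `retry-after` header, honouring that value is generally preferable to a computed delay.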

Getting an Anthropic API key

Create an Anthropic account

Go to console.anthropic.com and sign up or log in.

Purchase credits

Navigate to Billing and purchase at least $5 in credits, which activates Tier 1 immediately. Cumulative purchases of $40, $200, and $400 reach Tier 2, Tier 3, and Tier 4 respectively.

Generate an API key

Go to Settings → API Keys in the Anthropic Console and click Create Key. Give it a descriptive name such as garth-byok.

Copy and store the key

Copy the key — it is shown only once. Paste it into Garth’s Settings → Integrations → LLM.

Never commit your Anthropic API key to a repository. Garth’s secure vault is the correct place to store it. Anthropic’s secret scan — and Garth’s own code scan — will flag any key found in source code.
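If you also use the key in your own scripts outside Garth, one common way to keep it out of source code is to read it from an environment variable. The sketch below assumes the variable name `ANTHROPIC_API_KEY`, which the official Anthropic SDKs read by default; `load_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the key from the environment instead of hard-coding it in source."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    return key
```

Failing loudly when the variable is missing avoids sending requests with an empty key and makes misconfiguration obvious at startup.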

Provider references

Official Anthropic documentation for models, pricing, and rate limits.

Claude models overview

Full list of available Claude models with context windows and capabilities.

API rate limits

Official rate limit reference for all tiers and models.

Pricing

Per-token pricing for all Claude models.

Anthropic Console — API keys

Create and manage your Anthropic API keys.

Next steps

BYOK overview

Add your Claude API key to Garth’s secure vault and set it as your active provider.

Anthropic Console — Limits

View your current tier and live rate limit usage in the Anthropic Console.
