How much does the Claude API cost in 2026?

Anthropic's published rates as of May 2026: Claude Haiku 4.5 is $1 per million input tokens and $5 per million output tokens. Claude Sonnet 4.6 is $3 / $15. Claude Opus 4.6 is $5 / $25. All three models share a 200,000-token context window. On all three, output tokens cost 5× as much as input tokens because output is generated one token at a time.

What is a token in the Claude API?

A token is a subword unit that the model's tokenizer treats as a single piece. In English, one token is roughly three-quarters of a word, so 1,000 words is about 1,300 tokens. Code and JSON use more tokens per character due to special symbols and short variable names. Punctuation and whitespace also count toward your bill.

What does a typical Claude API message cost?

A standard Sonnet 4.6 message with 860 input tokens (a 500-word article, a short instruction, and a 200-token system prompt) and 150 output tokens costs about $0.0048, roughly half a cent. The same call on Haiku 4.5 costs about $0.0016. On Opus 4.6, expect about $0.008. Long-context analysis of a 5,000-word document ranges from about 2-3 cents on Haiku to 10-15 cents on Opus, depending on response length.

Why does my Claude API bill grow over a conversation?

Conversation history accumulates as input on every turn. By message 10 in a chat, the API receives all 9 prior exchanges plus the system prompt as input tokens — potentially 10,000+ tokens per call even if your latest message is short. Per-message cost can grow 5-10× across a long thread. Start fresh chats for unrelated topics to control spend.

How do I reduce Claude API costs?

Four effective levers: (1) match the model to the task — Haiku for simple work cuts 80% of cost vs Opus; (2) start new conversations for new topics so history does not pile up; (3) write specific prompts so the model produces shorter, focused output; (4) trim context before sending — extract the relevant section rather than the whole document.

Claude API Pricing Explained: Tokens, Costs & 4 Ways to Cut Spend

If you have ever looked at Anthropic's API pricing page, you have probably encountered terms like "MTok," "input tokens," and "output tokens" without a clear sense of what they mean for your actual bill. This guide breaks down exactly how Claude API pricing works, what tokens are, how costs are calculated per message, and how tools like Prophet let you access the API without managing any of this complexity yourself.

What Are Tokens?

Tokens are the fundamental unit that language models use to process text. A token is not exactly a word: it is a chunk of text that the model's tokenizer recognizes as a single unit. In English, one token is roughly three-quarters of a word. A 1,000-word article is approximately 1,300 tokens. A short email might be 100-200 tokens. A full-length novel is roughly 100,000 tokens.
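The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, assuming about 1.3 tokens per English word (the figure implied by "1,000 words is approximately 1,300 tokens"); the function name `estimate_tokens` is illustrative, and the result is a heuristic, not the output of Anthropic's tokenizer.

```python
# Rough token estimate from a word count. Heuristic only: real counts come
# from the model's tokenizer and will differ, especially for code and JSON.
TOKENS_PER_WORD = 1.3  # assumption: typical English prose


def estimate_tokens(text: str) -> int:
    """Estimate the token count of English text from its word count."""
    words = len(text.split())
    return round(words * TOKENS_PER_WORD)


article = "word " * 1000  # stand-in for a 1,000-word article
print(estimate_tokens(article))  # 1300
```

Use an estimate like this for budgeting only; the token counts reported back by the API are what you are actually billed for.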

The reason AI companies price by token rather than by word or message is that tokens correspond directly to computational cost. Processing 1,000 tokens requires roughly the same amount of GPU compute whether those tokens form coherent sentences or random characters, so pricing by token ties the cost to the resources actually consumed.

How Tokenization Works

Anthropic uses a tokenizer that breaks text into subword units. Common words like "the" or "and" are single tokens. Less common words get split into pieces: "tokenization" might become "token" + "ization" (two tokens). Numbers, punctuation, and special characters each consume tokens as well. Whitespace and formatting also count.

This means that code (which contains many special characters and short variable names) tends to use more tokens per useful character than natural language. JSON data structures are particularly token-hungry due to their braces, quotes, and colons. Keep this in mind when estimating costs for code-heavy or data-heavy workflows.
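One way to see why whitespace and formatting matter is to compare a pretty-printed JSON payload with a compact one. The sketch below uses character counts as a rough proxy for tokens (an assumption; it does not run a real tokenizer), and the `record` contents are made up for illustration.

```python
import json

# Hypothetical payload, used only to compare the two serializations.
record = {"user": {"id": 42, "name": "Ada"}, "tags": ["a", "b", "c"]}

# Pretty-printed: newlines and indentation all count toward the input.
pretty = json.dumps(record, indent=2)

# Compact: no spaces after separators, no line breaks.
compact = json.dumps(record, separators=(",", ":"))

print(f"pretty:  {len(pretty)} characters")
print(f"compact: {len(compact)} characters")
```

Both strings decode to the same data, so sending the compact form is a low-risk way to shave input size on data-heavy requests.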

Input Tokens vs Output Tokens

Every API call has two token counts that matter for pricing:

Input tokens are everything you send to the model: your message, the system prompt, any conversation history, and any context (like a web page's content). The more context you provide, the more input tokens you consume.

Output tokens are everything the model generates in response. A short one-sentence answer might be 20 output tokens. A detailed analysis could be 2,000 output tokens. You control this partly through your prompt (asking for a "brief" response versus a "comprehensive" one) and partly through the max_tokens parameter.

Output tokens are significantly more expensive than input tokens across all Claude models. This is because output is generated one token at a time, each requiring its own forward pass through the model, while input tokens can be processed in parallel. At current rates the ratio is 5x for Haiku 4.5, Sonnet 4.6, and Opus 4.6 alike.
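Because the model cannot generate more than max_tokens output tokens, that parameter also puts a ceiling on the output cost of a call. A minimal sketch of the ceiling, using the output rates published in this article; the dictionary keys are illustrative labels, not official API model identifiers.

```python
# Output rates in dollars per million tokens, copied from this article.
# Keys are illustrative labels, not official API model identifiers.
OUTPUT_RATE_PER_MTOK = {
    "haiku-4.5": 5.00,
    "sonnet-4.6": 15.00,
    "opus-4.6": 25.00,
}


def max_output_cost(model: str, max_tokens: int) -> float:
    """Upper bound on output cost in dollars for one call."""
    return max_tokens * OUTPUT_RATE_PER_MTOK[model] / 1_000_000


# A 500-token cap on Sonnet bounds output cost at three-quarters of a cent.
print(f"${max_output_cost('sonnet-4.6', 500):.4f}")  # $0.0075
```

A cap truncates the response rather than making it more concise, so it works best together with a prompt that asks for a brief answer.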

Current Claude API Pricing

Anthropic prices by MTok, which means "per million tokens." Here are the current rates:

Model	Input (per MTok)	Output (per MTok)	Context Window
Claude Haiku 4.5	$1.00	$5.00	200K tokens
Claude Sonnet 4.6	$3.00	$15.00	200K tokens
Claude Opus 4.6	$5.00	$25.00	200K tokens

To convert MTok pricing to per-token pricing, divide by 1,000,000. For example, Claude Sonnet input costs $3.00 / 1,000,000 = $0.000003 per token. That figure is not particularly intuitive, which is why thinking in terms of cost per message is more practical.

Anthropic API Free Tier Credits (2026)

Anthropic's API does not have a recurring free tier in 2026: you pay per token from the first call, with a $5 minimum deposit on console.anthropic.com. New accounts occasionally receive a small one-time evaluation credit (historically $5), but this is not guaranteed and is not advertised as a free tier. For free Claude API access via a third party, AWS Bedrock and Google Cloud Vertex AI new-account credits both apply to Claude calls, and Prophet's $0.20 free credit covers roughly 40 Sonnet 4.6 messages at about half a cent each (more if your messages are shorter), with no card required. Treat these as one-time grants rather than ongoing free quotas.

What Does a Typical Message Cost?

Let us walk through a real example. You paste a 500-word article (about 650 input tokens) into Claude Sonnet and ask it to "summarize this in three bullet points" (about 10 more input tokens). The system prompt adds another 200 tokens. Claude responds with a summary of about 150 tokens.

Input tokens: 860
Output tokens: 150
Input cost: (860 / 1,000,000) x $3.00 = $0.00258
Output cost: (150 / 1,000,000) x $15.00 = $0.00225
Total: $0.00483 (about half a cent)

A more complex interaction, where you provide a 5,000-word document (about 6,500 input tokens) and ask for a detailed analysis running to a few thousand output tokens, might cost 10-15 cents with Opus. The cost scales linearly with token count.
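The arithmetic above generalizes to any model and message size. The sketch below assumes the rates in this article's pricing table; the dictionary keys and the `message_cost` helper are illustrative names, not part of any Anthropic SDK.

```python
# (input, output) rates in dollars per million tokens, from the pricing table.
# Keys are illustrative labels, not official API model identifiers.
RATES_PER_MTOK = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (5.00, 25.00),
}


def message_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one API call with the given token counts."""
    input_rate, output_rate = RATES_PER_MTOK[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# The 860-in / 150-out summary example, priced on each model.
for model in RATES_PER_MTOK:
    print(f"{model}: ${message_cost(model, 860, 150):.5f}")
# haiku-4.5: $0.00161
# sonnet-4.6: $0.00483
# opus-4.6: $0.00805
```

For the long-document case, assuming 6,700 input tokens (document plus system prompt) and a 3,000-token response, `message_cost("opus-4.6", 6_700, 3_000)` comes to about $0.11, inside the 10-15 cent range quoted above.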

Conversation History Multiplies Costs

One aspect of API pricing that surprises new users is that conversation history accumulates. When you send the fifth message in a conversation, the API receives all previous messages as input tokens. A conversation with ten back-and-forth exchanges might have 10,000 input tokens by the final message, even if each individual message was short.

This means that long conversations get progressively more expensive per message. The first message might cost 1 cent, the fifth message 3 cents, and the twentieth message 10 cents, even if your actual text is the same length each time. Managing conversation length is one of the most effective ways to control API costs.
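The growth is easy to simulate. This sketch assumes every turn resends the system prompt plus all earlier user messages and replies, and that messages have a fixed size; the token sizes (200, 100, 400) are made-up illustrative values, and the Sonnet input rate comes from the pricing table.

```python
SONNET_INPUT_RATE_PER_MTOK = 3.00  # dollars per million input tokens


def input_tokens_per_turn(turns: int, system_tokens: int,
                          user_tokens: int, reply_tokens: int) -> list[int]:
    """Input tokens billed on each turn when the full history is resent."""
    totals = []
    history = 0  # tokens from all earlier user messages and replies
    for _ in range(turns):
        totals.append(system_tokens + history + user_tokens)
        history += user_tokens + reply_tokens
    return totals


# Illustrative sizes: 200-token system prompt, 100-token user messages,
# 400-token replies.
tokens = input_tokens_per_turn(turns=20, system_tokens=200,
                               user_tokens=100, reply_tokens=400)
for turn in (1, 5, 10, 20):
    billed = tokens[turn - 1]
    cost = billed * SONNET_INPUT_RATE_PER_MTOK / 1_000_000
    print(f"turn {turn:>2}: {billed:>5} input tokens, ${cost:.4f} input cost")
```

With these sizes the billed input grows from 300 tokens on turn 1 to 9,800 on turn 20, even though each new user message is only 100 tokens.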

How Prophet Simplifies API Access

Using the Claude API directly requires creating an Anthropic account, generating API keys, managing billing, writing code to make API calls, handling errors, and implementing streaming. Prophet eliminates all of this complexity.

When you use Prophet, here is what happens behind the scenes:

1. You type a message in the browser side panel
2. Prophet sends it to its backend API with your authentication token
3. The backend forwards the request to Anthropic's API using Prophet's own API key
4. The response streams back through Prophet's backend to your browser
5. Prophet calculates the token cost and deducts it from your credit balance

You never touch an API key. You never see a raw API response. You never deal with token counting or cost calculation. Prophet handles it all and shows you a simple credit balance that depletes as you use the service. One credit equals one cent, so if a message costs 2 credits, it cost 2 cents.
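The credit arithmetic can be sketched in a few lines. This is a hypothetical illustration based only on the "one credit equals one cent" rule stated above; it is not Prophet's actual billing code, and any rounding or minimum-charge policy is unknown and left out.

```python
CREDITS_PER_DOLLAR = 100  # from the text: one credit equals one cent


def cost_in_credits(cost_usd: float) -> float:
    """Convert a dollar cost to credits (no rounding applied; assumption)."""
    return cost_usd * CREDITS_PER_DOLLAR


def deduct(balance_credits: float, cost_usd: float) -> float:
    """Return the balance left after charging one message (hypothetical)."""
    return balance_credits - cost_in_credits(cost_usd)


balance = 20.0  # a $0.20 credit expressed in credits
balance = deduct(balance, 0.02)  # a message that cost 2 cents
print(f"{balance:.2f} credits left")  # 18.00 credits left
```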

Visit our pricing page to see how Prophet's credit tiers map to different usage levels.

Strategies to Reduce API Costs

Choose the Right Model

The single biggest lever for cost reduction is model selection. Haiku costs one-fifth as much as Opus for both input and output tokens, an 80% saving on the same workload. For simple tasks like grammar correction, format conversion, or factual lookups, Haiku's output is often close enough to Opus's that the difference does not matter. Reserve Opus for tasks that genuinely require deep reasoning.
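To put a number on the saving for your own workload, compare the same token counts across two models. A minimal sketch, assuming the rates from this article's pricing table; `savings_vs` and the dictionary keys are illustrative names.

```python
# (input, output) rates in dollars per million tokens, from the pricing table.
# Keys are illustrative labels, not official API model identifiers.
RATES_PER_MTOK = {
    "haiku-4.5": (1.00, 5.00),
    "sonnet-4.6": (3.00, 15.00),
    "opus-4.6": (5.00, 25.00),
}


def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in dollars of one call with the given token counts."""
    input_rate, output_rate = RATES_PER_MTOK[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def savings_vs(baseline: str, candidate: str,
               input_tokens: int, output_tokens: int) -> float:
    """Fraction of cost saved by using `candidate` instead of `baseline`."""
    base = cost(baseline, input_tokens, output_tokens)
    return 1 - cost(candidate, input_tokens, output_tokens) / base


print(f"{savings_vs('opus-4.6', 'haiku-4.5', 860, 150):.0%}")   # 80%
print(f"{savings_vs('opus-4.6', 'sonnet-4.6', 860, 150):.0%}")  # 40%
```

Because each model's input and output rates sit at the same ratio to Opus, these percentages hold for any mix of input and output tokens.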

Keep Conversations Short

Start new conversations for new topics instead of continuing old ones. A fresh conversation has minimal input tokens. A long-running conversation sends the entire history with every message, multiplying costs. In Prophet, creating a new chat is one click.

Be Specific in Your Prompts

Vague prompts produce long, rambling responses that consume more output tokens. Specific prompts produce focused responses. "Summarize this in three bullet points" costs less than "Tell me about this article" because the model generates fewer tokens to satisfy the request.

Trim Context Before Sending

If you are analyzing a web page, you do not always need the entire page content. Prophet's accessibility-tree approach already filters out non-essential elements like ads and navigation. But you can further reduce costs by asking about specific sections rather than the whole page.
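If you call the API yourself, the same idea can be applied before the request is built. The sketch below keeps only the paragraphs that mention a keyword; it is a deliberately simple filter under the assumption that paragraphs are separated by blank lines, and `relevant_paragraphs` is an illustrative helper, not a Prophet or Anthropic feature.

```python
def relevant_paragraphs(document: str, keyword: str) -> str:
    """Keep only the paragraphs that mention `keyword` (case-insensitive)."""
    paragraphs = [p.strip() for p in document.split("\n\n") if p.strip()]
    hits = [p for p in paragraphs if keyword.lower() in p.lower()]
    return "\n\n".join(hits)


# Made-up page text for illustration.
page = (
    "Welcome to our site. Sign up for the newsletter.\n\n"
    "Pricing starts at $10 per month for the basic plan.\n\n"
    "Our team is based in three countries.\n\n"
    "Enterprise pricing is available on request."
)

trimmed = relevant_paragraphs(page, "pricing")
print(trimmed)
print(f"{len(trimmed)} of {len(page)} characters kept")
```

A keyword filter can drop context the model needs, so check the trimmed text before relying on it for anything important.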

API Pricing vs Subscription Pricing

The main alternative to API-based pricing is a flat subscription like Claude Pro at $20/month. The subscription gives you rate-limited access without per-message costs. API-based access through Prophet gives you precise cost control and no rate limits beyond basic server protection.

For users who send fewer than 1,000 messages per month with Sonnet, Prophet's Pro plan at $9.99/month is more economical than Claude Pro. For users who send more than 2,000 Sonnet-equivalent messages per month, Claude Pro's flat rate becomes the better deal. The crossover point depends on your model mix and message length.

Key Takeaways

Tokens are the fundamental pricing unit, and one token is roughly three-quarters of a word. Output tokens cost 5x as much as input tokens on the current models. Conversation history accumulates input tokens with every message. The most effective cost reduction strategy is choosing the right model for each task. Prophet abstracts away all the complexity of API billing into a simple credit system where one credit equals one cent, and you can see exactly what each conversation costs.