LLM Pricing Explained: Understanding the True Cost of AI
Break down how LLM pricing works, what blended price means, and how to estimate your monthly API costs.
How LLM Pricing Works
Most LLM providers charge based on tokens — the fundamental units that models process. A token is roughly 3/4 of a word in English, so 1,000 tokens is about 750 words.
Pricing is quoted per million tokens (1M), with separate rates for:
- Input tokens — the text you send to the model (prompts, context, instructions)
- Output tokens — the text the model generates back
Output tokens are typically 3-5x more expensive than input tokens because generation requires more computation than reading.
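The per-token billing above can be sketched as a small helper. The rates used here ($3 per 1M input, $15 per 1M output) are hypothetical, chosen only to illustrate the typical 3-5x output multiplier:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Dollar cost of one API call, given per-million-token rates."""
    return (input_tokens / 1_000_000 * input_per_m
            + output_tokens / 1_000_000 * output_per_m)

# Hypothetical rates: $3/1M input, $15/1M output (a 5x multiplier).
cost = request_cost(input_tokens=1_000, output_tokens=250,
                    input_per_m=3.0, output_per_m=15.0)
print(f"${cost:.5f}")  # $0.00675
```

Note that even though the call reads four times as many tokens as it writes, the output side contributes more to the bill.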
Understanding Blended Price
Comparing models with different input/output ratios is confusing. That's why we use blended price — a single number that assumes a typical 3:1 input-to-output ratio:
blended = (3 × input_price + 1 × output_price) / 4
This gives you a realistic per-million-token cost for most applications.
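The blended-price formula translates directly into code. The example rates are again hypothetical:

```python
def blended_price(input_per_m: float, output_per_m: float) -> float:
    """Blended per-million-token price, assuming a 3:1 input-to-output ratio."""
    return (3 * input_per_m + output_per_m) / 4

# Hypothetical model priced at $3/1M input and $15/1M output:
print(blended_price(3.0, 15.0))  # 6.0
```

A single blended number like $6/1M makes it easy to rank models at a glance, as long as your workload roughly matches the assumed 3:1 ratio.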
Real-World Cost Examples
Let's say you're running a customer support chatbot that handles 1,000 conversations per day, each averaging 500 input tokens and 200 output tokens:
| Model | Input/1M | Output/1M | Daily Cost | Monthly Cost |
|---|---|---|---|---|
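The chatbot scenario above can be worked through directly. The per-million rates below are hypothetical placeholders, not any particular provider's prices; the volumes come from the scenario (1,000 conversations/day, 500 input and 200 output tokens each):

```python
CONVERSATIONS_PER_DAY = 1_000
INPUT_TOKENS_PER_CONV = 500
OUTPUT_TOKENS_PER_CONV = 200

def monthly_cost(input_per_m: float, output_per_m: float, days: int = 30) -> float:
    """Estimated monthly API cost for the chatbot workload above."""
    daily_input = CONVERSATIONS_PER_DAY * INPUT_TOKENS_PER_CONV     # 500k tokens/day
    daily_output = CONVERSATIONS_PER_DAY * OUTPUT_TOKENS_PER_CONV   # 200k tokens/day
    daily = (daily_input / 1_000_000 * input_per_m
             + daily_output / 1_000_000 * output_per_m)
    return daily * days

# Hypothetical rates of $3/1M input and $15/1M output:
print(f"${monthly_cost(3.0, 15.0):.2f}/month")  # $135.00/month
```

Swapping in a model ten times cheaper on both rates would bring the same workload down to about $13.50/month, which is why the per-million rates dominate any cost comparison at scale.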