LLM and AI Model Pricing Calculator - Compare GPT-4, Claude, and Gemini Costs
Costs below are for 84,000 input tokens and 420,000 output tokens; rows are sorted by total cost, ascending.

Provider | Type | Description | Context (tokens) | Input price (per 1M tokens) | Output price (per 1M tokens) | Input cost (84,000 tokens) | Output cost (420,000 tokens) | Total cost |
---|---|---|---|---|---|---|---|---|
OpenAI | General | GPT-4.1 nano is the fastest, most cost-effective GPT-4.1 model. | 1,047,576 | $0.05 | $0.20 | $0.00 | $0.08 | $0.09 |
Google | General | The smallest and most cost-effective model, built for at-scale usage. | 1,048,576 | $0.08 | $0.30 | $0.01 | $0.13 | $0.13 |
Google | General | The most balanced multimodal model with great performance across all tasks, with a 1 million token context window, and built for the era of Agents. | 1,000,000 | $0.10 | $0.40 | $0.01 | $0.17 | $0.18 |
Google | General | The first hybrid reasoning model which supports a 1M token context window and has thinking budgets. Preview model. | 1,000,000 | $0.15 | $0.60 | $0.01 | $0.25 | $0.26 |
Cloudflare | General | Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks. | 128,000 | $0.35 | $0.56 | $0.03 | $0.24 | $0.26 |
Cloudflare | General | Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. Gemma 3 models are multimodal, handling text and image input and generating text output, with a large 128K context window and multilingual support in over 140 languages, and are available in more sizes than previous versions. | 80,000 | $0.35 | $0.56 | $0.03 | $0.24 | $0.26 |
Llama (AWS) | General | Meta's compact and efficient Llama 3.1 model with 8B parameters. Offers great performance for lightweight applications. Available through AWS Bedrock. | 128,000 | $0.45 | $0.70 | $0.04 | $0.29 | $0.33 |
Cloudflare | General | Meta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. | 131,000 | $0.27 | $0.85 | $0.02 | $0.36 | $0.38 |
Cloudflare | Reasoning | QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini. | 24,000 | $0.66 | $1.00 | $0.06 | $0.42 | $0.48 |
OpenAI | General | GPT-4.1 mini provides a balance between intelligence, speed, and cost that makes it an attractive model for many use cases. | 1,047,576 | $0.40 | $1.60 | $0.03 | $0.67 | $0.71 |
Anthropic | General | Anthropic's fastest model with great performance for diverse tasks. | 200,000 | $0.80 | $4.00 | $0.07 | $1.68 | $1.75 |
OpenAI | Reasoning | o4-mini is the latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. | 200,000 | $1.10 | $4.40 | $0.09 | $1.85 | $1.94 |
OpenAI | Reasoning | Small reasoning model with high intelligence. Supports Structured Outputs, function calling, and Batch API. | 200,000 | $1.10 | $4.40 | $0.09 | $1.85 | $1.94 |
DeepSeek (AWS) | Reasoning | DeepSeek's fully managed reasoning-focused LLM with strong capabilities in problem solving, coding, and natural language understanding. Available through AWS Bedrock. | 32,768 | $1.35 | $5.40 | $0.11 | $2.27 | $2.38 |
OpenAI | General | OpenAI's multimodal model with advanced audio processing capabilities for both input and output. | 128,000 | $3.00 | $6.00 | $0.25 | $2.52 | $2.77 |
OpenAI | General | GPT-4.1 is the flagship model for complex tasks. It is well suited for problem solving across domains. | 1,047,576 | $2.00 | $8.00 | $0.17 | $3.36 | $3.53 |
Google | Reasoning | The state-of-the-art multipurpose model, which excels at coding and complex reasoning tasks. Preview model. | 1,048,576 | $1.25 | $10.00 | $0.11 | $4.20 | $4.31 |
Llama (AWS) | General | Meta's Llama 3.1 70B parameter model offering strong reasoning capabilities and efficiency. Available through AWS Bedrock with latency-optimized inference. | 128,000 | $3.50 | $10.50 | $0.29 | $4.41 | $4.70 |
OpenAI | Audio | Cost-effective model for text-to-speech. Processes text and generates audio responses. Best for shorter audio outputs (under 2 minutes). | 128,000 | $0.60 | $12.00 | $0.05 | $5.04 | $5.09 |
Anthropic | General | Anthropic's powerful, cost-effective model for complex tasks. | 200,000 | $3.00 | $15.00 | $0.25 | $6.30 | $6.55 |
Anthropic | Reasoning | Anthropic's most intelligent model and the first hybrid reasoning model on the market. | 200,000 | $3.00 | $15.00 | $0.25 | $6.30 | $6.55 |
Anthropic | General | A high-performance model with exceptional reasoning and efficiency. | 200,000 | $3.00 | $15.00 | $0.25 | $6.30 | $6.55 |
Llama (AWS) | Reasoning | Meta's largest Llama 3.1 model with 405B parameters. High-performance for complex reasoning and generation tasks. Available through AWS Bedrock with latency-optimized inference. | 128,000 | $8.00 | $24.00 | $0.67 | $10.08 | $10.75 |
OpenAI | Reasoning | Powerful model for math, science, coding and visual tasks. Excels at technical writing and multi-step problems involving text, code, and images. | 200,000 | $10.00 | $40.00 | $0.84 | $16.80 | $17.64 |
Anthropic | Reasoning | Anthropic's most powerful model, designed for the most complex tasks. | 200,000 | $15.00 | $75.00 | $1.26 | $31.50 | $32.76 |
Anthropic | Reasoning | The most capable and intelligent model yet. Claude Opus 4 sets new standards in complex reasoning and advanced coding. | 200,000 | $15.00 | $75.00 | $1.26 | $31.50 | $32.76 |
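
The page does not state the formula behind its totals, but the figures in the table are consistent with multiplying each token count by the per-1M-token price and rounding only for display. The Python sketch below reproduces that arithmetic under this assumption; the function names `estimate_cost` and `to_cents` are hypothetical and not part of the calculator itself.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumption: prices in the table are quoted in dollars per 1,000,000 tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Return (input_cost, output_cost, total) in dollars, unrounded.

    Prices are dollars per 1M tokens, passed as strings or Decimals so that
    binary floating-point error does not affect the result.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_UNIT * Decimal(input_price)
    output_cost = Decimal(output_tokens) / TOKENS_PER_UNIT * Decimal(output_price)
    return input_cost, output_cost, input_cost + output_cost


def to_cents(amount):
    """Round a dollar amount to two decimal places for display."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# The row priced at $2.00 input / $8.00 output per 1M tokens.
inp, out, total = estimate_cost(84_000, 420_000, "2.00", "8.00")
print(to_cents(inp), to_cents(out), to_cents(total))  # 0.17 3.36 3.53
```

Rounding the unrounded sum, rather than adding the rounded parts, would explain why some totals in the table differ by a cent from the sum of their displayed components: at $0.05 and $0.20 per 1M tokens the parts display as $0.00 and $0.08, while the total displays as $0.09.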