Amazon Bedrock has quietly become the default control plane for enterprise generative AI. What started in 2023 as a curated API for six foundation models has expanded into a platform that now gives developers access to more than 100 models from 18+ providers, all through a single, unified API backed by AWS-grade security and compliance.
If you evaluated Bedrock a year or two ago and moved on, it is worth another look. The 2026 version is a fundamentally different product: Amazon Nova 2, Bedrock AgentCore, OpenAI frontier models on Bedrock, and a pricing model that now spans on-demand, batch (50% off), and provisioned throughput. This guide covers everything you need to make a confident build or buy decision.
AWS Bedrock is a fully managed service that provides API access to high-performing foundation models (FMs) from Amazon and leading third-party AI companies, all through a single, consistent interface. Instead of managing GPU infrastructure, handling model versioning, or stitching together your own inference pipeline, Bedrock gives you a single control plane to:
The key distinction: Bedrock is not a model. It is a platform. The model you choose is a configuration decision; the security, scalability, and observability layer is always Bedrock.
The model catalogue has expanded dramatically. Below is a categorised overview of what is available as of mid-2026.
Amazon's native model family is now in its second generation and covers the broadest modality span on the platform.
|
Text Architecture
Nova Micro128K Context
Optimized for high-speed, low-cost classification processing and intelligent routing execution. |
Multimodal Core
Nova Lite300K Context
Engineered to handle parallel text, image, and video operations at lightning enterprise speed. |
Advanced Reasoning
Nova Pro300K Context
Tailored for highly complex enterprise reasoning schemas and multi-layered processing data workflows. |
|
Deep Analytics
Nova Premier1M Context
Built explicitly for extensive long-document structural analysis and advanced context-heavy reasoning frameworks. |
Next-Gen Multimodal
Nova 2 Lite1M Context
Current-generation multimodal architecture engineered with massive 64K sustained output capacity. |
Voice Intelligence
Nova 2 SonicVariable Context
Designed for real-time voice AI pipelines, processing immediate speech inputs into structural text outputs. |
|
Vector Compute
Nova 2 EmbeddingsN/A Context
Powers highly unified structural search indexes across sprawling text, image, and video databases natively. |
Nova Micro starts at $0.035 per million input tokens, making it one of the most cost-efficient models for routing, classification, and structured extraction tasks.
The full Claude 3.x and Claude 4 series are available on Bedrock, including the latest Claude Opus 4.7, which is the first Bedrock Claude with a 1M context window and 128K max output.
In a significant April 2026 announcement, AWS and OpenAI expanded their partnership to bring frontier OpenAI models to Bedrock. GPT-5.5, GPT-5.4, and Codex are now accessible through the same Bedrock APIs, inheriting AWS IAM, PrivateLink, Guardrails, encryption, and CloudTrail logging. This means enterprises can use OpenAI's top models without leaving their existing AWS security posture.
Llama 4 Maverick and Scout are fully managed on Bedrock with Tool Use support, extending Meta's open-weight model family into production-grade agentic workflows.
A single API endpoint lets you switch between models without rebuilding integrations. The Bedrock Converse API provides a model-agnostic interface, and Intelligent Prompt Routing automatically selects the most cost-effective model capable of handling your request.
Bedrock Agents lets you build autonomous agents that can plan, use tools, retrieve information, and take multi-step actions. AgentCore (launched in 2025 and reaching GA in 2026) extends this into production-grade agentic infrastructure with an Agent Registry, Payments integration, and managed orchestration for complex multi-agent pipelines.
Connect your enterprise data (including documents, databases, and S3 buckets) to any Bedrock model. Knowledge Bases handles chunking, embedding, vector storage, and retrieval automatically. You pay for the tokens processed during retrieval; no separate vector database management is required.
Apply content filters, topic denials, PII redaction, and hallucination detection across any model on Bedrock, regardless of the provider. Guardrails work as a consistent safety layer that sits between your application and the model.
Fine-tune Titan, Claude, and other models with your labelled data. Alternatively, import your own open-weight model via Custom Model Import and run it under Bedrock's managed infrastructure.
Cache frequently reused prompt segments and pay a 90% discount on cached tokens. For applications with long system prompts or shared context, prompt caching is one of the fastest ways to cut costs significantly.
Submit large jobs for asynchronous processing and receive a flat 50% discount vs. on-demand rates. Results are delivered within 24 hours. Ideal for report generation, content enrichment pipelines, and offline analysis tasks.
All Bedrock workloads inherit: IAM access control, AWS PrivateLink (no traffic over the public internet), KMS encryption at rest and in transit, and CloudTrail audit logging. Your data is never used to train provider models.
AWS Bedrock offers three billing modes. Choosing the right one for your workload is the single biggest lever on your monthly bill.
No commitments. You pay per 1,000 tokens (input + output) processed. This is the default for most teams and the right choice for variable or unpredictable traffic.
Representative on-demand rates as of mid-2026 (us-east-1, per 1M tokens):
|
Amazon Web Services
Nova Micro
Input
$0.035
Output
$0.14
|
Amazon Web Services
Nova Lite
Input
$0.06
Output
$0.24
|
Amazon Web Services
Nova Pro
Input
$0.80
Output
$3.20
|
|
Anthropic Core
Claude Sonnet 4.6
Input
$3.00
Output
$15.00
|
Anthropic Frontier
Claude Opus 4.7
Input
$5.00
Output
$25.00+
|
"Always verify current rates at aws.amazon.com/bedrock/pricing before building a budget. Model rates change frequently."
Rule of thumb for on-demand: Monthly costs typically range from ~$100/month for lightweight prototypes to $5,000+/month once Bedrock Agents, Knowledge Bases, and high-throughput inference are in the picture.
Submit jobs asynchronously. Results arrive within 24 hours. Batch is the immediate choice for any workload that does not require real-time responses, such as enrichment pipelines, document processing, and scheduled report generation. The 50% discount applies automatically.
Pay per hour for dedicated model capacity in exchange for a 1-month or 6-month commitment. Typical discounts of 20–40% vs. on-demand at scale. Worth considering when:
For most teams under 20M requests/month, on-demand is more cost-effective than provisioned throughput.
The biggest pricing surprises come not from model tokens but from adjacent charges:
Seaflux helps engineering teams run an AWS Bedrock cost model before they commit to architecture. Get a free 30-minute scoping call and walk away with a realistic monthly estimate for your use case.
Talk to a Seaflux engineer →| Factor | AWS Bedrock | Self-managed |
|---|---|---|
| Time to first model call | Minutes | Weeks to months |
| Infrastructure management | None | Significant ongoing effort |
| Model choice | 100+ from 18+ providers | Limited to what you deploy |
| Security & compliance | Built-in (IAM, PrivateLink, KMS) | Custom implementation required |
| Fine-tuning | Managed, no GPU setup | Requires ML infrastructure |
| Cost at low scale | Pay per token (efficient) | High fixed infrastructure cost |
| Cost at very high scale | Provisioned throughput + negotiated discounts | Potentially cheaper with full control |
Getting to your first API call takes minutes:
boto3, Node.js, Java) or the Bedrock API directlyAt Seaflux, we build production AI applications for clients across logistics, real estate, fintech, and healthcare, and AWS Bedrock is a core part of our AI infrastructure stack. We use it to deliver:
Whether you are prototyping your first AI feature or scaling a production deployment, choosing the right models, architecture, and pricing tier on Bedrock can save months of rework and significant cost. Seaflux brings hands-on AWS Bedrock experience across logistics, real estate, fintech, and healthcare, from initial scoping through to production delivery.
Let us help you get it right the first time.
Schedule a free consultation with Seaflux →AWS Bedrock is a fully managed cloud service from Amazon Web Services that provides API access to foundation models from Amazon and third-party providers including Anthropic, Meta, Mistral, DeepSeek, OpenAI, and others. It handles infrastructure, security, and scaling so developers can focus on building applications.
As of mid-2026, over 100 foundation models are available across text, image, video, speech, and embedding modalities from 18+ providers. The Bedrock Marketplace adds access to additional specialised and emerging models beyond the core catalogue.
Cost depends heavily on model choice and usage pattern. Amazon Nova Micro starts at $0.035 per million input tokens, making simple tasks very affordable. Heavy use of frontier models like Claude Opus 4.7 or agentic workflows with multi-step reasoning can reach $5,000+/month. Batch inference offers 50% off on-demand rates for non-real-time workloads.
No. AWS guarantees that your data, inputs, and outputs are not used to train or improve any foundation model. Data is encrypted at rest and in transit, and you retain full ownership.
Bedrock is the managed inference and application layer, designed for developers building AI-powered applications. SageMaker is the ML platform, designed for data scientists training, evaluating, and deploying custom models. The two services are complementary; Bedrock Marketplace integrates with SageMaker endpoints for models not in the core Bedrock catalogue.
AgentCore is a production-grade agentic infrastructure layer within Bedrock (GA in 2026). It provides an Agent Registry, managed orchestration for multi-agent pipelines, Payments integration, and the operational tooling needed to run autonomous AI agents reliably in enterprise environments.
Yes, as of April 2026. Following an expanded partnership between AWS and OpenAI, GPT-5.5, GPT-5.4, and Codex are available on Bedrock in limited preview. These models are accessible through standard Bedrock APIs and inherit all AWS security controls including IAM, PrivateLink, Guardrails, and CloudTrail.
It depends on your task. For cost-sensitive classification and routing: Nova Micro. For multimodal tasks: Nova 2 Lite or Nova Pro. For complex long-context reasoning: Claude Opus 4.7 or Nova Premier. For real-time voice applications: Nova 2 Sonic. For code generation at enterprise scale: Codex on Bedrock.
Bedrock offers three billing modes: on-demand (pay per token, no commitment), batch (50% off on-demand, 24-hour turnaround), and provisioned throughput (reserved hourly capacity, 20–40% discount at scale with a 1–6 month commitment). On-demand is recommended for most teams; switch to batch for non-real-time jobs and provisioned when a model exceeds ~$30–40/day on-demand.

Business Development Executive