MagicTools
ai-tutorials · March 18, 2026 · 543 views · 14 min read

Claude Certified Architect – Foundations: The Official Exam Guide Explained

Anthropic has officially released the Claude Certified Architect – Foundations certification, a professional credential for solution architects building production-grade applications with Claude. This article provides a comprehensive breakdown of the exam guide, including all domains, sample questions, and preparation strategies.

Source: This article is based on the official Claude Certified Architect – Foundations Certification Exam Guide (Version 0.1, Feb 10 2025) published by Anthropic, PBC.


What Is the Claude Certified Architect – Foundations Certification?

The Claude Certified Architect – Foundations certification validates that practitioners can make informed decisions about tradeoffs when implementing real-world solutions with Claude. The exam tests foundational knowledge across four core technologies:

  • Claude Code – Team workflows, CLAUDE.md configuration, plan mode
  • Claude Agent SDK – Multi-agent orchestration, tool integration, lifecycle hooks
  • Claude API – Structured output, prompt engineering, batch processing
  • Model Context Protocol (MCP) – Tool design, resource interfaces, backend integration

Target Candidate

The ideal candidate is a solution architect with 6+ months of hands-on experience who has:

  • Built agentic applications using the Claude Agent SDK, including multi-agent orchestration, subagent delegation, tool integration, and lifecycle hooks
  • Configured Claude Code for team workflows using CLAUDE.md files, Agent Skills, MCP server integrations, and plan mode
  • Designed MCP tool and resource interfaces for backend system integration
  • Engineered prompts that produce reliable structured output using JSON schemas and few-shot examples
  • Managed context windows effectively across long documents, multi-turn conversations, and multi-agent handoffs
  • Integrated Claude into CI/CD pipelines for automated code review and test generation
  • Made sound escalation and reliability decisions, including error handling and human-in-the-loop workflows

Exam Format and Scoring

| Item | Detail |
| --- | --- |
| Format | Multiple choice (1 correct answer + 3 distractors per question) |
| Scoring | Scaled score, 100–1,000 |
| Pass score | 720 |
| Unanswered questions | Scored as incorrect; no penalty for guessing |
| Result | Pass/Fail |

Content Domains and Weightings

The exam has 5 content domains:

| Domain | Topic | Weight |
| --- | --- | --- |
| Domain 1 | Agentic Architecture & Orchestration | 27% |
| Domain 2 | Tool Design & MCP Integration | 18% |
| Domain 3 | Claude Code Configuration & Workflows | 20% |
| Domain 4 | Prompt Engineering & Structured Output | 20% |
| Domain 5 | Context Management & Reliability | 15% |

Exam Scenarios

The exam uses scenario-based questions. During the exam, 4 scenarios are randomly selected from the 6 below:

Scenario 1: Customer Support Resolution Agent

You are building a customer support resolution agent using the Claude Agent SDK. The agent handles high-ambiguity requests like returns, billing disputes, and account issues. It has access to backend systems through custom MCP tools (get_customer, lookup_order, process_refund, escalate_to_human). Your target is 80%+ first-contact resolution while knowing when to escalate.

Primary domains: Agentic Architecture & Orchestration, Tool Design & MCP Integration, Context Management & Reliability

Scenario 2: Code Generation with Claude Code

You are using Claude Code to accelerate software development for code generation, refactoring, debugging, and documentation. You need to integrate it into your development workflow with custom slash commands, CLAUDE.md configurations, and understand when to use plan mode vs direct execution.

Primary domains: Claude Code Configuration & Workflows, Context Management & Reliability

Scenario 3: Multi-Agent Research System

You are building a multi-agent research system using the Claude Agent SDK. A coordinator agent delegates to specialized subagents: one searches the web, one analyzes documents, one synthesizes findings, and one generates reports.

Primary domains: Agentic Architecture & Orchestration, Tool Design & MCP Integration, Context Management & Reliability

Scenario 4: Developer Productivity with Claude

You are building developer productivity tools using the Claude Agent SDK. The agent helps engineers explore unfamiliar codebases, understand legacy systems, generate boilerplate code, and automate repetitive tasks.

Primary domains: Tool Design & MCP Integration, Claude Code Configuration & Workflows, Agentic Architecture & Orchestration

Scenario 5: Claude Code for Continuous Integration

You are integrating Claude Code into your CI/CD pipeline. The system runs automated code reviews, generates test cases, and provides feedback on pull requests.

Primary domains: Claude Code Configuration & Workflows, Prompt Engineering & Structured Output

Scenario 6: Structured Data Extraction

You are building a structured data extraction system using Claude. The system extracts information from unstructured documents, validates the output using JSON schemas, and maintains high accuracy while handling edge cases gracefully.

Primary domains: Prompt Engineering & Structured Output, Context Management & Reliability


Domain Deep Dive

Domain 1: Agentic Architecture & Orchestration (27%)

Key Task Statements:

1.1 Agentic Loop Design

  • Understanding the lifecycle: send request → inspect stop_reason → execute tools → return results
  • Continuing when stop_reason is "tool_use", terminating when "end_turn"
  • Avoiding anti-patterns: parsing natural language signals, setting arbitrary iteration caps
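The lifecycle above can be sketched as a small control loop. This is a minimal illustration using a stand-in client object, not the real SDK's signature (the actual API call also needs a model, max_tokens, etc.); the point is the control flow: branch on stop_reason, never on parsed natural language, and no arbitrary iteration cap.

```python
# Minimal agentic loop: continue on "tool_use", terminate on "end_turn".
def run_agent_loop(client, messages, tools, execute_tool):
    while True:
        response = client.create(messages=messages, tools=tools)
        if response["stop_reason"] == "end_turn":
            return response  # model is finished; no arbitrary cap needed
        if response["stop_reason"] != "tool_use":
            raise RuntimeError(f"unexpected stop_reason: {response['stop_reason']}")
        # Echo the assistant turn, execute each requested tool, return results.
        messages.append({"role": "assistant", "content": response["content"]})
        tool_results = [
            {"type": "tool_result", "tool_use_id": block["id"],
             "content": execute_tool(block["name"], block["input"])}
            for block in response["content"] if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})

# Stub client for illustration: requests one tool call, then finishes.
class StubClient:
    calls = 0
    def create(self, messages, tools):
        StubClient.calls += 1
        if StubClient.calls == 1:
            return {"stop_reason": "tool_use",
                    "content": [{"type": "tool_use", "id": "t1",
                                 "name": "get_customer",
                                 "input": {"email": "jo@example.com"}}]}
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "done"}]}

executed = []
final = run_agent_loop(StubClient(), [{"role": "user", "content": "help"}], [],
                       lambda name, args: executed.append(name) or {"ok": True})
```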

1.2 Multi-Agent Orchestration (Hub-and-Spoke)

  • Coordinator manages all inter-subagent communication
  • Subagents operate with isolated context — they do NOT inherit coordinator history
  • Coordinator: task decomposition, delegation, result aggregation, iterative refinement

1.3 Subagent Context Passing

  • The Task tool is the mechanism for spawning subagents
  • allowedTools must include "Task" for coordinator to invoke subagents
  • Subagent context must be explicitly provided in the prompt
  • Spawn parallel subagents by emitting multiple Task tool calls in a single coordinator response

1.4 Multi-Step Workflows

  • Programmatic enforcement (hooks, prerequisite gates) > prompt-based guidance for deterministic compliance
  • Structured handoff protocols for mid-process escalation

1.5 Agent SDK Hooks

  • PostToolUse hooks for data normalization before model processes results
  • Tool call interception for business rule enforcement (e.g., blocking refunds > $500)
  • Hooks provide deterministic guarantees vs. prompt instructions (probabilistic)
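The deterministic-guarantee point can be made concrete with a small guard function. The hook wiring and signature below are illustrative (the real SDK hook interface differs), but the principle is exactly the one in the task statement: a limit enforced in code holds every time, whereas a prompt instruction holds only probabilistically.

```python
# Illustrative pre-tool-use guard: intercept the call before execution.
REFUND_LIMIT = 500.00  # business rule lives in code, not in the prompt

def pre_tool_use(tool_name, tool_input):
    if tool_name == "process_refund" and tool_input.get("amount", 0) > REFUND_LIMIT:
        return {"decision": "deny",
                "reason": f"Refunds over ${REFUND_LIMIT:.0f} require human approval"}
    return {"decision": "allow"}
```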

1.6 Task Decomposition

  • Fixed sequential pipelines (prompt chaining) for predictable workflows
  • Dynamic adaptive decomposition for open-ended investigation tasks
  • Split large code reviews: per-file local analysis + separate cross-file integration pass

1.7 Session Management

  • --resume <session-name> for continuing prior sessions
  • fork_session for exploring divergent approaches from a shared baseline
  • New session with injected summary > resuming with stale tool results

Domain 2: Tool Design & MCP Integration (18%)

Key Task Statements:

2.1 Effective Tool Interface Design

  • Tool descriptions are the primary mechanism LLMs use for tool selection
  • Include: input formats, example queries, edge cases, boundary explanations
  • Rename/split tools to eliminate functional overlap

2.2 Structured Error Responses

  • Use the MCP isError flag pattern
  • Return: errorCategory (transient/validation/permission), isRetryable boolean, human-readable description
  • Distinguish transient errors from valid empty results
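A sketch of what such a response might look like. The content shape mirrors MCP tool results; errorCategory and isRetryable are the fields named in the task statement (their exact placement in a real server is up to you). Note how a transient failure and a valid empty result look different to the model:

```python
# MCP-style structured error vs. a valid-but-empty result.
def tool_error(category, message, retryable):
    assert category in ("transient", "validation", "permission")
    return {"isError": True,
            "errorCategory": category,
            "isRetryable": retryable,
            "content": [{"type": "text", "text": message}]}

db_timeout = tool_error("transient", "Order DB timed out after 5s", retryable=True)

empty_result = {"isError": False,  # not an error: the query succeeded, found nothing
                "content": [{"type": "text",
                             "text": "No orders found for this customer"}]}
```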

2.3 Tool Distribution and tool_choice

  • Too many tools (e.g., 18 instead of 4-5) degrades tool selection reliability
  • tool_choice options: "auto", "any", forced selection {"type": "tool", "name": "..."}
  • Setting tool_choice: "any" guarantees the model calls a tool (not conversational text)
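The three forms, written out as request-body fragments (shapes follow the Claude API's tool_choice parameter):

```python
# tool_choice request fragments:
auto_choice   = {"type": "auto"}                          # tool call optional
any_choice    = {"type": "any"}                           # some tool call required
forced_choice = {"type": "tool", "name": "get_customer"}  # this tool, specifically
```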

2.4 MCP Server Integration

  • Project-level (.mcp.json) for shared team tooling
  • User-level (~/.claude.json) for personal/experimental servers
  • Environment variable expansion in .mcp.json for credential management (e.g., ${GITHUB_TOKEN})
  • MCP resources expose content catalogs to reduce exploratory tool calls
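A hypothetical project-level .mcp.json illustrating the credential pattern — the server name and package here are examples, not a prescription; the key idea is that ${GITHUB_TOKEN} expands from each developer's environment, so no secret is committed:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```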

2.5 Built-in Tools Selection

  • Grep → content search (function names, error messages, import statements)
  • Glob → file path pattern matching (**/*.test.tsx)
  • Read/Write → full file operations; Edit → targeted modifications
  • Fallback: Read + Write when Edit fails due to non-unique text matches

Domain 3: Claude Code Configuration & Workflows (20%)

Key Task Statements:

3.1 CLAUDE.md Hierarchy

  • User-level (~/.claude/CLAUDE.md) → not shared via version control
  • Project-level (.claude/CLAUDE.md or root CLAUDE.md) → shared with team
  • Directory-level (subdirectory CLAUDE.md) → scoped to that directory
  • @import syntax for modular file referencing
  • .claude/rules/ for topic-specific rule files
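A hypothetical project-root CLAUDE.md tying these pieces together (file names are illustrative; the @ lines show the import syntax):

```markdown
# CLAUDE.md (project root, shared with the team via version control)

## Standards
- Run `npm test` before committing.

## Imported rule files
@docs/coding-standards.md
@.claude/rules/testing.md
```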

3.2 Custom Slash Commands and Skills

  • Project-scoped: .claude/commands/ (version-controlled, shared)
  • User-scoped: ~/.claude/commands/ (personal)
  • Skills in .claude/skills/ with SKILL.md frontmatter:
    • context: fork → runs in isolated sub-agent context
    • allowed-tools → restricts tool access during skill execution
    • argument-hint → prompts for required parameters
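A hypothetical SKILL.md showing those three frontmatter fields together (the skill name, tool list, and body text are invented for illustration):

```markdown
---
name: security-review
description: Focused security review of the files passed as arguments
context: fork                      # runs in an isolated sub-agent context
allowed-tools: Read, Grep, Glob    # read-only while the skill executes
argument-hint: <path-to-review>
---

Review the given path for injection risks, missing auth checks, and
hard-coded secrets. Report findings with file and line references.
```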

3.3 Path-Specific Rules

  • .claude/rules/ files with YAML frontmatter paths fields containing glob patterns
  • Rules load only when editing matching files
  • Better than directory-level CLAUDE.md for conventions spanning multiple directories
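A sketch of such a rule file — the globs and the rule text are illustrative; the paths frontmatter field is what scopes loading to matching files:

```markdown
---
paths:
  - "src/**/*.ts"
  - "packages/*/src/**/*.ts"
---

TypeScript modules in these paths use named exports only; avoid default
exports, and colocate unit tests as `<name>.test.ts`.
```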

3.4 Plan Mode vs Direct Execution

  • Plan mode: complex tasks with large-scale changes, multiple valid approaches, architectural decisions, multi-file modifications
  • Direct execution: simple, well-scoped changes (single-file bug fix with clear stack trace)
  • Use the Explore subagent for verbose discovery phases

3.5 Iterative Refinement

  • Concrete input/output examples > prose descriptions when inconsistently interpreted
  • Test-driven iteration: write tests first, share failures to guide improvement
  • Interview pattern: have Claude ask questions before implementing
  • Multiple interacting issues → single detailed message; independent issues → sequential iteration

3.6 CI/CD Integration

  • -p (or --print) flag for non-interactive mode in automated pipelines
  • --output-format json with --json-schema for machine-parseable output
  • Use independent review instances (not self-review) for catching subtle issues
  • Include prior review findings to avoid duplicate comments on re-runs
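A pipeline step combining these flags might look like the following (the prompt text and schema filename are illustrative):

```shell
# Non-interactive review step: -p exits after printing, and the JSON
# output can be validated against a schema before gating the merge.
claude -p "Review this pull request diff for security issues" \
  --output-format json \
  --json-schema review-schema.json > review.json
```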

Domain 4: Prompt Engineering & Structured Output (20%)

Key Task Statements:

4.1 Explicit Review Criteria

  • Explicit criteria > vague instructions ("flag only when claimed behavior contradicts actual code" vs "check comments are accurate")
  • Temporarily disable high false-positive categories to restore developer trust

4.2 Few-Shot Prompting

  • Most effective for achieving consistently formatted, actionable output
  • 2-4 targeted examples for ambiguous scenarios
  • Include examples showing reasoning for why one action was chosen over plausible alternatives
  • Demonstrate correct handling of varied document structures

4.3 Structured Output via Tool Use

  • tool_use with JSON schemas → most reliable approach for guaranteed schema compliance
  • Eliminates JSON syntax errors but NOT semantic errors (values don't sum, wrong field placement)
  • tool_choice: "any" → guarantees a tool call when document type is unknown
  • Design schema fields as optional (nullable) when source may not contain the information
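A sketch of an extraction tool definition combining these points — the tool name and fields are invented for illustration; note the nullable field for information the source may lack, and the "other" enum value as an escape hatch for unrecognized document types:

```python
# Illustrative tool definition for structured extraction via tool_use.
extract_invoice = {
    "name": "record_invoice",
    "description": "Record fields extracted from an invoice document.",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total": {"type": "number"},
            # Nullable: the source document may not state a currency.
            "currency": {"type": ["string", "null"]},
            # Enum with an "other" escape hatch for unknown types.
            "document_type": {
                "type": "string",
                "enum": ["invoice", "receipt", "credit_note", "other"],
            },
        },
        "required": ["invoice_number", "total", "document_type"],
    },
}
```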

4.4 Validation and Retry Loops

  • Retry-with-error-feedback: append specific validation errors to prompt on retry
  • Retries are ineffective when required information is simply absent from source
  • Track detected_pattern fields to analyze false positive patterns
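The retry-with-error-feedback pattern can be sketched as a small loop. The call_model and validate callables below are stand-ins (in practice, call_model would invoke the Claude API with the feedback appended to the prompt):

```python
# Retry-with-error-feedback: each retry carries the concrete validation
# errors, so the model can correct them instead of repeating itself.
def extract_with_retry(call_model, validate, max_attempts=3):
    feedback = ""
    for _ in range(max_attempts):
        result = call_model(feedback)
        errors = validate(result)
        if not errors:
            return result
        feedback = "Previous output failed validation: " + "; ".join(errors)
    raise ValueError(f"validation still failing after {max_attempts} attempts: {errors}")

# Demo with stubs: first attempt fails validation, second succeeds.
attempts = []
def fake_model(feedback):
    attempts.append(feedback)
    return {"total": -5} if len(attempts) == 1 else {"total": 42}

def check(result):
    return [] if result.get("total", 0) > 0 else ["total must be a positive number"]

out = extract_with_retry(fake_model, check)
```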

4.5 Batch Processing

  • Message Batches API: 50% cost savings, up to 24-hour processing window, no guaranteed latency SLA
  • Appropriate for: non-blocking, latency-tolerant workloads (overnight reports, weekly audits)
  • NOT appropriate for: blocking workflows (pre-merge checks)
  • custom_id fields for correlating request/response pairs
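A sketch of the custom_id correlation pattern. The params shape is simplified (a real batch request also needs a model, max_tokens, etc.); what matters is that results can return in any order, and custom_id ties each one back to its source document:

```python
# Build batch requests keyed by document ID, then correlate results.
documents = {"doc-001": "Invoice #1 ...", "doc-002": "Invoice #2 ..."}

requests = [
    {"custom_id": doc_id,
     "params": {"messages": [{"role": "user",
                              "content": f"Extract fields from:\n{text}"}]}}
    for doc_id, text in documents.items()
]

def correlate(results):
    """Map batch results back to documents by custom_id, regardless of order."""
    return {r["custom_id"]: r["result"] for r in results}
```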

4.6 Multi-Instance Review

  • Self-review limitation: model retains generation reasoning context
  • Independent review instances more effective than self-review instructions or extended thinking
  • Multi-pass: per-file local analysis + separate cross-file integration pass

Domain 5: Context Management & Reliability (15%)

Key Task Statements:

5.1 Context Preservation

  • Progressive summarization risks: condensing numerical values, dates, customer expectations
  • "Lost in the middle" effect: models process information at the beginning and end of long inputs more reliably than information buried in the middle
  • Extract transactional facts into a persistent "case facts" block
  • Trim verbose tool outputs to relevant fields before they accumulate

5.2 Escalation and Ambiguity Resolution

  • Escalation triggers: customer requests for human, policy exceptions/gaps, inability to make progress
  • Honor explicit customer requests for human agents immediately
  • Sentiment-based escalation and self-reported confidence scores are unreliable proxies for complexity
  • Multiple customer matches → request additional identifiers, don't use heuristic selection

5.3 Error Propagation in Multi-Agent Systems

  • Return structured error context: failure type, attempted query, partial results, alternative approaches
  • Distinguish access failures (needing retry) from valid empty results
  • Subagents: local recovery for transient failures; propagate only unresolvable errors with context

5.4 Large Codebase Exploration

  • Context degradation: models give inconsistent answers, reference "typical patterns" instead of specific classes
  • Scratchpad files for persisting key findings across context boundaries
  • Subagent delegation for isolating verbose exploration output
  • Use /compact to reduce context usage during extended sessions

5.5 Human Review Workflows

  • Aggregate accuracy metrics (97% overall) may mask poor performance on specific document types
  • Stratified random sampling for measuring error rates in high-confidence extractions
  • Field-level confidence scores calibrated using labeled validation sets

5.6 Information Provenance

  • Source attribution is lost during summarization without claim-source mappings
  • Handle conflicting statistics: annotate conflicts with source attribution, don't arbitrarily select one
  • Require publication/collection dates in structured outputs for temporal data

Sample Questions with Explanations

Q1 (Scenario: Customer Support Agent)

Production data shows that in 12% of cases, your agent skips get_customer entirely and calls lookup_order using only the customer's stated name, occasionally leading to misidentified accounts. What change would most effectively address this?

  • A) Add a programmatic prerequisite that blocks lookup_order and process_refund until get_customer returns a verified customer ID ✅
  • B) Enhance the system prompt to state customer verification is mandatory
  • C) Add few-shot examples showing the agent always calling get_customer first
  • D) Implement a routing classifier that enables only appropriate tools per request type

Answer: A — When a specific tool sequence is required for critical business logic, programmatic enforcement provides deterministic guarantees that prompt-based approaches cannot. Options B and C rely on probabilistic LLM compliance, insufficient when errors have financial consequences.


Q4 (Scenario: Code Generation with Claude Code)

You want to create a custom /review slash command available to every developer when they clone the repository. Where should you create this command file?

  • A) In .claude/commands/ in the project repository ✅
  • B) In ~/.claude/commands/ in each developer's home directory
  • C) In the CLAUDE.md file at the project root
  • D) In a .claude/config.json file with a commands array

Answer: A — Project-scoped custom slash commands stored in .claude/commands/ are version-controlled and automatically available to all developers when they clone the repo.


Q10 (Scenario: CI/CD Pipeline)

Your pipeline runs claude "Analyze this pull request for security issues" but the job hangs indefinitely. Claude Code is waiting for interactive input. What's the correct fix?

  • A) Add the -p flag: claude -p "Analyze this pull request for security issues"
  • B) Set CLAUDE_HEADLESS=true before running
  • C) Redirect stdin from /dev/null
  • D) Add the --batch flag

Answer: A — The -p (--print) flag runs Claude Code in non-interactive mode, processes the prompt, outputs to stdout, and exits. The other options reference non-existent features.


Q11 (Scenario: CI/CD Pipeline)

Two workflows: (1) blocking pre-merge check, (2) overnight technical debt report. Should you switch both to the Message Batches API for 50% cost savings?

  • A) Use batch processing for technical debt reports only; keep real-time for pre-merge checks ✅
  • B) Switch both with status polling for completion
  • C) Keep real-time for both
  • D) Switch both with timeout fallback to real-time

Answer: A — The Batches API has up to 24-hour processing time with no SLA, making it unsuitable for blocking workflows but ideal for overnight jobs.


Preparation Exercises

Exercise 1: Build a Multi-Tool Agent with Escalation Logic

  1. Define 3-4 MCP tools with detailed descriptions that clearly differentiate each tool's purpose
  2. Implement an agentic loop checking stop_reason to continue or terminate
  3. Add structured error responses with errorCategory, isRetryable, and human-readable descriptions
  4. Implement a programmatic hook intercepting tool calls to enforce business rules
  5. Test with multi-concern messages and verify unified response synthesis

Exercise 2: Configure Claude Code for Team Development

  1. Create project-level CLAUDE.md with universal coding standards
  2. Create .claude/rules/ files with YAML frontmatter glob patterns for different code areas
  3. Create a project-scoped skill with context: fork and allowed-tools restrictions
  4. Configure an MCP server in .mcp.json with environment variable expansion
  5. Test plan mode vs direct execution on tasks of varying complexity

Exercise 3: Build a Structured Data Extraction Pipeline

  1. Define an extraction tool with JSON schema (required/optional fields, enum + "other" pattern, nullable fields)
  2. Implement a validation-retry loop with specific error feedback on retry
  3. Add few-shot examples for documents with varied formats
  4. Submit 100 documents via the Message Batches API, handle failures by custom_id
  5. Implement human review routing with field-level confidence scores

Exercise 4: Design a Multi-Agent Research Pipeline

  1. Build a coordinator agent with allowedTools including "Task", passing findings explicitly to subagents
  2. Implement parallel subagent execution with multiple Task tool calls in a single response
  3. Design structured output separating content from metadata (claim, evidence, source URL, publication date)
  4. Simulate a subagent timeout; verify coordinator receives structured error context
  5. Test with conflicting source data; verify synthesis preserves both values with source attribution

Key Technologies Reference

| Technology | Key Concepts |
| --- | --- |
| Claude Agent SDK | Agentic loops, stop_reason, PostToolUse hooks, Task tool, allowedTools |
| MCP | isError flag, tool descriptions, .mcp.json, environment variable expansion |
| Claude Code | CLAUDE.md hierarchy, .claude/rules/, .claude/commands/, .claude/skills/, plan mode |
| Claude Code CLI | -p/--print, --output-format json, --json-schema |
| Claude API | tool_use, tool_choice (auto/any/forced), stop_reason, max_tokens |
| Message Batches API | 50% cost savings, 24-hour window, custom_id, no multi-turn tool calling |
| JSON Schema | Required/optional fields, enum, nullable, strict mode |

Out-of-Scope Topics (Will NOT Appear on Exam)

  • Fine-tuning Claude models or training custom models
  • Claude API authentication, billing, or account management
  • Deploying or hosting MCP servers (infrastructure, networking)
  • Claude's internal architecture or training process
  • Constitutional AI, RLHF, or safety training methodologies
  • Computer use, vision/image analysis capabilities
  • Streaming API implementation or server-sent events
  • Rate limiting, quotas, or API pricing calculations
  • Specific cloud provider configurations (AWS, GCP, Azure)

Final Preparation Checklist

  1. Build an agent with the Claude Agent SDK — complete agentic loop, tool calling, error handling, session management
  2. Configure Claude Code for a real project — CLAUDE.md hierarchy, path-specific rules, custom skills with frontmatter, at least one MCP server
  3. Design and test MCP tools — clear descriptions, structured error responses, test with ambiguous requests
  4. Build a structured data extraction pipeline — tool_use with JSON schemas, validation-retry loops, batch processing
  5. Practice prompt engineering — few-shot examples for ambiguous scenarios, explicit review criteria, multi-pass review architectures
  6. Study context management — structured fact extraction, scratchpad files, subagent delegation
  7. Review escalation patterns — when to escalate vs resolve autonomously, human review workflows with confidence-based routing
  8. Complete the Practice Exam — covers the same scenarios and question format, shows explanations after each answer

The Claude Certified Architect – Foundations exam represents Anthropic's commitment to establishing professional standards for Claude application development. Whether you're building customer support agents, CI/CD integrations, or complex multi-agent research systems, this certification validates the architectural judgment needed to build reliable, production-grade Claude applications.
