MagicTools
ai-tutorials · March 18, 2026 · 543 views · 14 min read

Claude Certified Architect – Foundations: The Official Exam Guide Explained

Anthropic has officially released the Claude Certified Architect – Foundations certification, a professional credential for solution architects building production-grade applications with Claude. This article provides a comprehensive breakdown of the exam guide, including all domains, sample questions, and preparation strategies.

Source: This article is based on the official Claude Certified Architect – Foundations Certification Exam Guide (Version 0.1, Feb 10 2025) published by Anthropic, PBC.


What Is the Claude Certified Architect – Foundations Certification?

The Claude Certified Architect – Foundations certification validates that practitioners can make informed decisions about tradeoffs when implementing real-world solutions with Claude. The exam tests foundational knowledge across four core technologies:

  • Claude Code – Team workflows, CLAUDE.md configuration, plan mode
  • Claude Agent SDK – Multi-agent orchestration, tool integration, lifecycle hooks
  • Claude API – Structured output, prompt engineering, batch processing
  • Model Context Protocol (MCP) – Tool design, resource interfaces, backend integration

Target Candidate

The ideal candidate is a solution architect with 6+ months of hands-on experience who has:

  • Built agentic applications using the Claude Agent SDK, including multi-agent orchestration, subagent delegation, tool integration, and lifecycle hooks
  • Configured Claude Code for team workflows using CLAUDE.md files, Agent Skills, MCP server integrations, and plan mode
  • Designed MCP tool and resource interfaces for backend system integration
  • Engineered prompts that produce reliable structured output using JSON schemas and few-shot examples
  • Managed context windows effectively across long documents, multi-turn conversations, and multi-agent handoffs
  • Integrated Claude into CI/CD pipelines for automated code review and test generation
  • Made sound escalation and reliability decisions, including error handling and human-in-the-loop workflows

Exam Format and Scoring

| Item | Detail |
| --- | --- |
| Format | Multiple choice (1 correct answer + 3 distractors per question) |
| Scoring | Scaled score, 100–1,000 |
| Pass score | 720 |
| Unanswered questions | Scored as incorrect; no penalty for guessing |
| Result | Pass/Fail |

Content Domains and Weightings

The exam has 5 content domains:

| Domain | Topic | Weight |
| --- | --- | --- |
| Domain 1 | Agentic Architecture & Orchestration | 27% |
| Domain 2 | Tool Design & MCP Integration | 18% |
| Domain 3 | Claude Code Configuration & Workflows | 20% |
| Domain 4 | Prompt Engineering & Structured Output | 20% |
| Domain 5 | Context Management & Reliability | 15% |

Exam Scenarios

The exam uses scenario-based questions. During the exam, 4 scenarios are randomly selected from the 6 below:

Scenario 1: Customer Support Resolution Agent

You are building a customer support resolution agent using the Claude Agent SDK. The agent handles high-ambiguity requests like returns, billing disputes, and account issues. It has access to backend systems through custom MCP tools (get_customer, lookup_order, process_refund, escalate_to_human). Your target is 80%+ first-contact resolution while knowing when to escalate.

Primary domains: Agentic Architecture & Orchestration, Tool Design & MCP Integration, Context Management & Reliability

Scenario 2: Code Generation with Claude Code

You are using Claude Code to accelerate software development for code generation, refactoring, debugging, and documentation. You need to integrate it into your development workflow with custom slash commands, CLAUDE.md configurations, and understand when to use plan mode vs direct execution.

Primary domains: Claude Code Configuration & Workflows, Context Management & Reliability

Scenario 3: Multi-Agent Research System

You are building a multi-agent research system using the Claude Agent SDK. A coordinator agent delegates to specialized subagents: one searches the web, one analyzes documents, one synthesizes findings, and one generates reports.

Primary domains: Agentic Architecture & Orchestration, Tool Design & MCP Integration, Context Management & Reliability

Scenario 4: Developer Productivity with Claude

You are building developer productivity tools using the Claude Agent SDK. The agent helps engineers explore unfamiliar codebases, understand legacy systems, generate boilerplate code, and automate repetitive tasks.

Primary domains: Tool Design & MCP Integration, Claude Code Configuration & Workflows, Agentic Architecture & Orchestration

Scenario 5: Claude Code for Continuous Integration

You are integrating Claude Code into your CI/CD pipeline. The system runs automated code reviews, generates test cases, and provides feedback on pull requests.

Primary domains: Claude Code Configuration & Workflows, Prompt Engineering & Structured Output

Scenario 6: Structured Data Extraction

You are building a structured data extraction system using Claude. The system extracts information from unstructured documents, validates the output using JSON schemas, and maintains high accuracy while handling edge cases gracefully.

Primary domains: Prompt Engineering & Structured Output, Context Management & Reliability


Domain Deep Dive

Domain 1: Agentic Architecture & Orchestration (27%)

Key Task Statements:

1.1 Agentic Loop Design

  • Understanding the lifecycle: send request → inspect stop_reason → execute tools → return results
  • Continuing when stop_reason is "tool_use", terminating when "end_turn"
  • Avoiding anti-patterns: parsing natural language signals, setting arbitrary iteration caps
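The lifecycle above can be sketched as a small control loop. This is a minimal illustration using a stand-in client object, not the real SDK's signature (the actual API call also needs a model, max_tokens, etc.); the point is the control flow: branch on stop_reason, never on parsed natural language, and no arbitrary iteration cap.

```python
# Minimal agentic loop: continue on "tool_use", terminate on "end_turn".
def run_agent_loop(client, messages, tools, execute_tool):
    while True:
        response = client.create(messages=messages, tools=tools)
        if response["stop_reason"] == "end_turn":
            return response  # model is finished; no arbitrary cap needed
        if response["stop_reason"] != "tool_use":
            raise RuntimeError(f"unexpected stop_reason: {response['stop_reason']}")
        # Echo the assistant turn, execute each requested tool, return results.
        messages.append({"role": "assistant", "content": response["content"]})
        tool_results = [
            {"type": "tool_result", "tool_use_id": block["id"],
             "content": execute_tool(block["name"], block["input"])}
            for block in response["content"] if block["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})

# Stub client for illustration: requests one tool call, then finishes.
class StubClient:
    calls = 0
    def create(self, messages, tools):
        StubClient.calls += 1
        if StubClient.calls == 1:
            return {"stop_reason": "tool_use",
                    "content": [{"type": "tool_use", "id": "t1",
                                 "name": "get_customer",
                                 "input": {"email": "jo@example.com"}}]}
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "done"}]}

executed = []
final = run_agent_loop(StubClient(), [{"role": "user", "content": "help"}], [],
                       lambda name, args: executed.append(name) or {"ok": True})
```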

1.2 Multi-Agent Orchestration (Hub-and-Spoke)

  • Coordinator manages all inter-subagent communication
  • Subagents operate with isolated context — they do NOT inherit coordinator history
  • Coordinator: task decomposition, delegation, result aggregation, iterative refinement

1.3 Subagent Context Passing

  • The Task tool is the mechanism for spawning subagents
  • allowedTools must include "Task" for coordinator to invoke subagents
  • Subagent context must be explicitly provided in the prompt
  • Spawn parallel subagents by emitting multiple Task tool calls in a single coordinator response

1.4 Multi-Step Workflows

  • Programmatic enforcement (hooks, prerequisite gates) > prompt-based guidance for deterministic compliance
  • Structured handoff protocols for mid-process escalation

1.5 Agent SDK Hooks

  • PostToolUse hooks for data normalization before model processes results
  • Tool call interception for business rule enforcement (e.g., blocking refunds > $500)
  • Hooks provide deterministic guarantees vs. prompt instructions (probabilistic)
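The deterministic-guarantee point can be made concrete with a small guard function. The hook wiring and signature below are illustrative (the real SDK hook interface differs), but the principle is exactly the one in the task statement: a limit enforced in code holds every time, whereas a prompt instruction holds only probabilistically.

```python
# Illustrative pre-tool-use guard: intercept the call before execution.
REFUND_LIMIT = 500.00  # business rule lives in code, not in the prompt

def pre_tool_use(tool_name, tool_input):
    if tool_name == "process_refund" and tool_input.get("amount", 0) > REFUND_LIMIT:
        return {"decision": "deny",
                "reason": f"Refunds over ${REFUND_LIMIT:.0f} require human approval"}
    return {"decision": "allow"}
```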

1.6 Task Decomposition

  • Fixed sequential pipelines (prompt chaining) for predictable workflows
  • Dynamic adaptive decomposition for open-ended investigation tasks
  • Split large code reviews: per-file local analysis + separate cross-file integration pass

1.7 Session Management

  • --resume <session-name> for continuing prior sessions
  • fork_session for exploring divergent approaches from a shared baseline
  • New session with injected summary > resuming with stale tool results

Domain 2: Tool Design & MCP Integration (18%)

Key Task Statements:

2.1 Effective Tool Interface Design

  • Tool descriptions are the primary mechanism LLMs use for tool selection
  • Include: input formats, example queries, edge cases, boundary explanations
  • Rename/split tools to eliminate functional overlap

2.2 Structured Error Responses

  • Use the MCP isError flag pattern
  • Return: errorCategory (transient/validation/permission), isRetryable boolean, human-readable description
  • Distinguish transient errors from valid empty results
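A sketch of what such a response might look like. The content shape mirrors MCP tool results; errorCategory and isRetryable are the fields named in the task statement (their exact placement in a real server is up to you). Note how a transient failure and a valid empty result look different to the model:

```python
# MCP-style structured error vs. a valid-but-empty result.
def tool_error(category, message, retryable):
    assert category in ("transient", "validation", "permission")
    return {"isError": True,
            "errorCategory": category,
            "isRetryable": retryable,
            "content": [{"type": "text", "text": message}]}

db_timeout = tool_error("transient", "Order DB timed out after 5s", retryable=True)

empty_result = {"isError": False,  # not an error: the query succeeded, found nothing
                "content": [{"type": "text",
                             "text": "No orders found for this customer"}]}
```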

2.3 Tool Distribution and tool_choice

  • Too many tools (e.g., 18 instead of 4-5) degrades tool selection reliability
  • tool_choice options: "auto", "any", forced selection {"type": "tool", "name": "..."}
  • Setting tool_choice: "any" guarantees the model calls a tool (not conversational text)
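The three forms, written out as request-body fragments (shapes follow the Claude API's tool_choice parameter):

```python
# tool_choice request fragments:
auto_choice   = {"type": "auto"}                          # tool call optional
any_choice    = {"type": "any"}                           # some tool call required
forced_choice = {"type": "tool", "name": "get_customer"}  # this tool, specifically
```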

2.4 MCP Server Integration

  • Project-level (.mcp.json) for shared team tooling
  • User-level (~/.claude.json) for personal/experimental servers
  • Environment variable expansion in .mcp.json for credential management (e.g., ${GITHUB_TOKEN})
  • MCP resources expose content catalogs to reduce exploratory tool calls
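A hypothetical project-level .mcp.json illustrating the credential pattern — the server name and package here are examples, not a prescription; the key idea is that ${GITHUB_TOKEN} expands from each developer's environment, so no secret is committed:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```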

2.5 Built-in Tools Selection

  • Grep → content search (function names, error messages, import statements)
  • Glob → file path pattern matching (**/*.test.tsx)
  • Read/Write → full file operations; Edit → targeted modifications
  • Fallback: Read + Write when Edit fails due to non-unique text matches

Domain 3: Claude Code Configuration & Workflows (20%)

Key Task Statements:

3.1 CLAUDE.md Hierarchy

  • User-level (~/.claude/CLAUDE.md) → not shared via version control
  • Project-level (.claude/CLAUDE.md or root CLAUDE.md) → shared with team
  • Directory-level (subdirectory CLAUDE.md) → scoped to that directory
  • @import syntax for modular file referencing
  • .claude/rules/ for topic-specific rule files
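A hypothetical project-root CLAUDE.md tying these pieces together (file names are illustrative; the @ lines show the import syntax):

```markdown
# CLAUDE.md (project root, shared with the team via version control)

## Standards
- Run `npm test` before committing.

## Imported rule files
@docs/coding-standards.md
@.claude/rules/testing.md
```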

3.2 Custom Slash Commands and Skills

  • Project-scoped: .claude/commands/ (version-controlled, shared)
  • User-scoped: ~/.claude/commands/ (personal)
  • Skills in .claude/skills/ with SKILL.md frontmatter:
    • context: fork → runs in isolated sub-agent context
    • allowed-tools → restricts tool access during skill execution
    • argument-hint → prompts for required parameters
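A hypothetical SKILL.md showing those three frontmatter fields together (the skill name, tool list, and body text are invented for illustration):

```markdown
---
name: security-review
description: Focused security review of the files passed as arguments
context: fork                      # runs in an isolated sub-agent context
allowed-tools: Read, Grep, Glob    # read-only while the skill executes
argument-hint: <path-to-review>
---

Review the given path for injection risks, missing auth checks, and
hard-coded secrets. Report findings with file and line references.
```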

3.3 Path-Specific Rules

  • .claude/rules/ files with YAML frontmatter paths fields containing glob patterns
  • Rules load only when editing matching files
  • Better than directory-level CLAUDE.md for conventions spanning multiple directories
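A sketch of such a rule file — the globs and the rule text are illustrative; the paths frontmatter field is what scopes loading to matching files:

```markdown
---
paths:
  - "src/**/*.ts"
  - "packages/*/src/**/*.ts"
---

TypeScript modules in these paths use named exports only; avoid default
exports, and colocate unit tests as `<name>.test.ts`.
```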

3.4 Plan Mode vs Direct Execution

  • Plan mode: complex tasks with large-scale changes, multiple valid approaches, architectural decisions, multi-file modifications
  • Direct execution: simple, well-scoped changes (single-file bug fix with clear stack trace)
  • Use the Explore subagent for verbose discovery phases

3.5 Iterative Refinement

  • Concrete input/output examples > prose descriptions when inconsistently interpreted
  • Test-driven iteration: write tests first, share failures to guide improvement
  • Interview pattern: have Claude ask questions before implementing
  • Multiple interacting issues → single detailed message; independent issues → sequential iteration

3.6 CI/CD Integration

  • -p (or --print) flag for non-interactive mode in automated pipelines
  • --output-format json with --json-schema for machine-parseable output
  • Use independent review instances (not self-review) for catching subtle issues
  • Include prior review findings to avoid duplicate comments on re-runs
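A pipeline step combining these flags might look like the following (the prompt text and schema filename are illustrative):

```shell
# Non-interactive review step: -p exits after printing, and the JSON
# output can be validated against a schema before gating the merge.
claude -p "Review this pull request diff for security issues" \
  --output-format json \
  --json-schema review-schema.json > review.json
```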

Domain 4: Prompt Engineering & Structured Output (20%)

Key Task Statements:

4.1 Explicit Review Criteria

  • Explicit criteria > vague instructions ("flag only when claimed behavior contradicts actual code" vs "check comments are accurate")
  • Temporarily disable high false-positive categories to restore developer trust

4.2 Few-Shot Prompting

  • Most effective for achieving consistently formatted, actionable output
  • 2-4 targeted examples for ambiguous scenarios
  • Include examples showing reasoning for why one action was chosen over plausible alternatives
  • Demonstrate correct handling of varied document structures

4.3 Structured Output via Tool Use

  • tool_use with JSON schemas → most reliable approach for guaranteed schema compliance
  • Eliminates JSON syntax errors but NOT semantic errors (values don't sum, wrong field placement)
  • tool_choice: "any" → guarantees a tool call when document type is unknown
  • Design schema fields as optional (nullable) when source may not contain the information
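A sketch of an extraction tool definition combining these points — the tool name and fields are invented for illustration; note the nullable field for information the source may lack, and the "other" enum value as an escape hatch for unrecognized document types:

```python
# Illustrative tool definition for structured extraction via tool_use.
extract_invoice = {
    "name": "record_invoice",
    "description": "Record fields extracted from an invoice document.",
    "input_schema": {
        "type": "object",
        "properties": {
            "invoice_number": {"type": "string"},
            "total": {"type": "number"},
            # Nullable: the source document may not state a currency.
            "currency": {"type": ["string", "null"]},
            # Enum with an "other" escape hatch for unknown types.
            "document_type": {
                "type": "string",
                "enum": ["invoice", "receipt", "credit_note", "other"],
            },
        },
        "required": ["invoice_number", "total", "document_type"],
    },
}
```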

4.4 Validation and Retry Loops

  • Retry-with-error-feedback: append specific validation errors to prompt on retry
  • Retries are ineffective when required information is simply absent from source
  • Track detected_pattern fields to analyze false positive patterns
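The retry-with-error-feedback pattern can be sketched as a small loop. The call_model and validate callables below are stand-ins (in practice, call_model would invoke the Claude API with the feedback appended to the prompt):

```python
# Retry-with-error-feedback: each retry carries the concrete validation
# errors, so the model can correct them instead of repeating itself.
def extract_with_retry(call_model, validate, max_attempts=3):
    feedback = ""
    for _ in range(max_attempts):
        result = call_model(feedback)
        errors = validate(result)
        if not errors:
            return result
        feedback = "Previous output failed validation: " + "; ".join(errors)
    raise ValueError(f"validation still failing after {max_attempts} attempts: {errors}")

# Demo with stubs: first attempt fails validation, second succeeds.
attempts = []
def fake_model(feedback):
    attempts.append(feedback)
    return {"total": -5} if len(attempts) == 1 else {"total": 42}

def check(result):
    return [] if result.get("total", 0) > 0 else ["total must be a positive number"]

out = extract_with_retry(fake_model, check)
```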

4.5 Batch Processing

  • Message Batches API: 50% cost savings, up to 24-hour processing window, no guaranteed latency SLA
  • Appropriate for: non-blocking, latency-tolerant workloads (overnight reports, weekly audits)
  • NOT appropriate for: blocking workflows (pre-merge checks)
  • custom_id fields for correlating request/response pairs
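A sketch of the custom_id correlation pattern. The params shape is simplified (a real batch request also needs a model, max_tokens, etc.); what matters is that results can return in any order, and custom_id ties each one back to its source document:

```python
# Build batch requests keyed by document ID, then correlate results.
documents = {"doc-001": "Invoice #1 ...", "doc-002": "Invoice #2 ..."}

requests = [
    {"custom_id": doc_id,
     "params": {"messages": [{"role": "user",
                              "content": f"Extract fields from:\n{text}"}]}}
    for doc_id, text in documents.items()
]

def correlate(results):
    """Map batch results back to documents by custom_id, regardless of order."""
    return {r["custom_id"]: r["result"] for r in results}
```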

4.6 Multi-Instance Review

  • Self-review limitation: model retains generation reasoning context
  • Independent review instances more effective than self-review instructions or extended thinking
  • Multi-pass: per-file local analysis + separate cross-file integration pass

Domain 5: Context Management & Reliability (15%)

Key Task Statements:

5.1 Context Preservation

  • Progressive summarization risks: condensing numerical values, dates, customer expectations
  • "Lost in the middle" effect: models process information at the beginning and end of long inputs more reliably than information buried in the middle
  • Extract transactional facts into a persistent "case facts" block
  • Trim verbose tool outputs to relevant fields before they accumulate

5.2 Escalation and Ambiguity Resolution

  • Escalation triggers: customer requests for human, policy exceptions/gaps, inability to make progress
  • Honor explicit customer requests for human agents immediately
  • Sentiment-based escalation and self-reported confidence scores are unreliable proxies for complexity
  • Multiple customer matches → request additional identifiers, don't use heuristic selection

5.3 Error Propagation in Multi-Agent Systems

  • Return structured error context: failure type, attempted query, partial results, alternative approaches
  • Distinguish access failures (needing retry) from valid empty results
  • Subagents: local recovery for transient failures; propagate only unresolvable errors with context

5.4 Large Codebase Exploration

  • Context degradation: models give inconsistent answers, reference "typical patterns" instead of specific classes
  • Scratchpad files for persisting key findings across context boundaries
  • Subagent delegation for isolating verbose exploration output
  • Use /compact to reduce context usage during extended sessions

5.5 Human Review Workflows

  • Aggregate accuracy metrics (97% overall) may mask poor performance on specific document types
  • Stratified random sampling for measuring error rates in high-confidence extractions
  • Field-level confidence scores calibrated using labeled validation sets

5.6 Information Provenance

  • Source attribution is lost during summarization without claim-source mappings
  • Handle conflicting statistics: annotate conflicts with source attribution, don't arbitrarily select one
  • Require publication/collection dates in structured outputs for temporal data

Sample Questions with Explanations

Q1 (Scenario: Customer Support Agent)

Production data shows that in 12% of cases, your agent skips get_customer entirely and calls lookup_order using only the customer's stated name, occasionally leading to misidentified accounts. What change would most effectively address this?

  • A) Add a programmatic prerequisite that blocks lookup_order and process_refund until get_customer returns a verified customer ID ✅
  • B) Enhance the system prompt to state customer verification is mandatory
  • C) Add few-shot examples showing the agent always calling get_customer first
  • D) Implement a routing classifier that enables only appropriate tools per request type

Answer: A — When a specific tool sequence is required for critical business logic, programmatic enforcement provides deterministic guarantees that prompt-based approaches cannot. Options B and C rely on probabilistic LLM compliance, insufficient when errors have financial consequences.


Q4 (Scenario: Code Generation with Claude Code)

You want to create a custom /review slash command available to every developer when they clone the repository. Where should you create this command file?

  • A) In .claude/commands/ in the project repository ✅
  • B) In ~/.claude/commands/ in each developer's home directory
  • C) In the CLAUDE.md file at the project root
  • D) In a .claude/config.json file with a commands array

Answer: A — Project-scoped custom slash commands stored in .claude/commands/ are version-controlled and automatically available to all developers when they clone the repo.


Q10 (Scenario: CI/CD Pipeline)

Your pipeline runs claude "Analyze this pull request for security issues" but the job hangs indefinitely. Claude Code is waiting for interactive input. What's the correct fix?

  • A) Add the -p flag: claude -p "Analyze this pull request for security issues"
  • B) Set CLAUDE_HEADLESS=true before running
  • C) Redirect stdin from /dev/null
  • D) Add the --batch flag

Answer: A — The -p (--print) flag runs Claude Code in non-interactive mode, processes the prompt, outputs to stdout, and exits. The other options reference non-existent features.


Q11 (Scenario: CI/CD Pipeline)

Two workflows: (1) blocking pre-merge check, (2) overnight technical debt report. Should you switch both to the Message Batches API for 50% cost savings?

  • A) Use batch processing for technical debt reports only; keep real-time for pre-merge checks ✅
  • B) Switch both with status polling for completion
  • C) Keep real-time for both
  • D) Switch both with timeout fallback to real-time

Answer: A — The Batches API has up to 24-hour processing time with no SLA, making it unsuitable for blocking workflows but ideal for overnight jobs.


Preparation Exercises

Exercise 1: Build a Multi-Tool Agent with Escalation Logic

  1. Define 3-4 MCP tools with detailed descriptions that clearly differentiate each tool's purpose
  2. Implement an agentic loop checking stop_reason to continue or terminate
  3. Add structured error responses with errorCategory, isRetryable, and human-readable descriptions
  4. Implement a programmatic hook intercepting tool calls to enforce business rules
  5. Test with multi-concern messages and verify unified response synthesis

Exercise 2: Configure Claude Code for Team Development

  1. Create project-level CLAUDE.md with universal coding standards
  2. Create .claude/rules/ files with YAML frontmatter glob patterns for different code areas
  3. Create a project-scoped skill with context: fork and allowed-tools restrictions
  4. Configure an MCP server in .mcp.json with environment variable expansion
  5. Test plan mode vs direct execution on tasks of varying complexity

Exercise 3: Build a Structured Data Extraction Pipeline

  1. Define an extraction tool with JSON schema (required/optional fields, enum + "other" pattern, nullable fields)
  2. Implement a validation-retry loop with specific error feedback on retry
  3. Add few-shot examples for documents with varied formats
  4. Submit 100 documents via the Message Batches API, handle failures by custom_id
  5. Implement human review routing with field-level confidence scores

Exercise 4: Design a Multi-Agent Research Pipeline

  1. Build a coordinator agent with allowedTools including "Task", passing findings explicitly to subagents
  2. Implement parallel subagent execution with multiple Task tool calls in a single response
  3. Design structured output separating content from metadata (claim, evidence, source URL, publication date)
  4. Simulate a subagent timeout; verify coordinator receives structured error context
  5. Test with conflicting source data; verify synthesis preserves both values with source attribution

Key Technologies Reference

| Technology | Key Concepts |
| --- | --- |
| Claude Agent SDK | Agentic loops, stop_reason, PostToolUse hooks, Task tool, allowedTools |
| MCP | isError flag, tool descriptions, .mcp.json, environment variable expansion |
| Claude Code | CLAUDE.md hierarchy, .claude/rules/, .claude/commands/, .claude/skills/, plan mode |
| Claude Code CLI | -p/--print, --output-format json, --json-schema |
| Claude API | tool_use, tool_choice (auto/any/forced), stop_reason, max_tokens |
| Message Batches API | 50% cost savings, 24-hour window, custom_id, no multi-turn tool calling |
| JSON Schema | Required/optional fields, enum, nullable, strict mode |

Out-of-Scope Topics (Will NOT Appear on Exam)

  • Fine-tuning Claude models or training custom models
  • Claude API authentication, billing, or account management
  • Deploying or hosting MCP servers (infrastructure, networking)
  • Claude's internal architecture or training process
  • Constitutional AI, RLHF, or safety training methodologies
  • Computer use, vision/image analysis capabilities
  • Streaming API implementation or server-sent events
  • Rate limiting, quotas, or API pricing calculations
  • Specific cloud provider configurations (AWS, GCP, Azure)

Final Preparation Checklist

  1. Build an agent with the Claude Agent SDK — complete agentic loop, tool calling, error handling, session management
  2. Configure Claude Code for a real project — CLAUDE.md hierarchy, path-specific rules, custom skills with frontmatter, at least one MCP server
  3. Design and test MCP tools — clear descriptions, structured error responses, test with ambiguous requests
  4. Build a structured data extraction pipeline — tool_use with JSON schemas, validation-retry loops, batch processing
  5. Practice prompt engineering — few-shot examples for ambiguous scenarios, explicit review criteria, multi-pass review architectures
  6. Study context management — structured fact extraction, scratchpad files, subagent delegation
  7. Review escalation patterns — when to escalate vs resolve autonomously, human review workflows with confidence-based routing
  8. Complete the Practice Exam — covers the same scenarios and question format, shows explanations after each answer

The Claude Certified Architect – Foundations exam represents Anthropic's commitment to establishing professional standards for Claude application development. Whether you're building customer support agents, CI/CD integrations, or complex multi-agent research systems, this certification validates the architectural judgment needed to build reliable, production-grade Claude applications.
