Claude Code 2.1 introduces two capabilities that reshape how senior developers handle complex refactoring, multi-file migrations, and production-grade code generation: the xhigh effort tier and a native auto-verification loop that operates as a first-class feature rather than an afterthought.
Important: This article describes anticipated features of Claude Code 2.1. All configuration keys, model identifiers, CLI flags, environment variables, and behavioral details described below are unverified against official Anthropic documentation at time of writing. Verify every setting against the official Anthropic docs, CLI help output, and release notes before use in development or CI/CD environments.
Table of Contents
- What Changed in Claude Code 2.1 and Why xHigh Matters
- Configuring xHigh Effort Mode
- Implementing Auto-Verification Workflows
- Cost Optimization Strategies for xHigh
- Production Patterns and Real-World Workflows
- Implementation Checklist and Quick Reference
What Changed in Claude Code 2.1 and Why xHigh Matters
The Effort Tier System Explained (low, medium, high, xhigh)
Claude Code's effort tier system maps directly to the depth of reasoning the model applies to any given task. The four tiers (low, medium, high, and xhigh) control how much chain-of-thought processing the model performs before producing output. Responses at low are fast and shallow, suitable for completions and simple lookups. Medium adds basic planning, and high introduces multi-step reasoning with moderate context use.
The xhigh tier unlocks what the lower tiers cannot: extended chain-of-thought that spans multiple reasoning passes, multi-pass planning where the model revisits and refines its approach before generating code, and deeper use of the available context window. For tasks involving inter-file dependency resolution, architectural reasoning across module boundaries, or migration logic that must account for cascading side effects, xhigh is often the only tier that produces reliable results on the first attempt.
That said, xhigh is overkill for boilerplate generation, simple CRUD operations, or any task where the correct output is structurally obvious. Applying it indiscriminately leads to unnecessary cost and slower response times with no meaningful quality improvement.
Auto-Verification as a First-Class Feature
In earlier Claude Code versions, verification required manual invocation and custom scripting to close the feedback loop. Claude Code 2.1 promotes auto-verification to a native workflow: generate code, run lint and type checks, execute tests, evaluate the output against pass criteria, then either self-correct and retry or confirm the result.
This verification loop architecture is built into the runtime rather than bolted on. Claude Code 2.1 uses heuristic detection of project tooling to select appropriate verification steps automatically, though developers can override this with explicit configuration. Verification can now be configured as a persistent project-level default rather than requiring per-invocation flags. Note that autoVerify defaults to false; you must explicitly enable it to activate the persistent verification loop.
The minimum viable configuration to activate both features:
{
  "effort": "xhigh",
  "autoVerify": true
}
This .claude/settings.json snippet enables xhigh reasoning and the auto-verification loop for every task in the project scope.
Configuring xHigh Effort Mode
Global vs. Project-Level Configuration
Note: The claude config set subcommand, all environment variables, and all CLI flags described in this section are unverified. Confirm availability via claude config --help and claude --help before use.
xHigh can be set at three levels depending on the desired scope. For global activation across all projects:
claude config set effort xhigh
For project-scoped overrides, the .claude/settings.json file in the project root takes precedence over global settings. For CI/CD pipelines where configuration files may not be present, the environment variable approach works:
export CLAUDE_CODE_EFFORT=xhigh
Project-level configuration is the recommended approach for teams, since it commits the effort tier alongside the codebase and ensures consistent behavior across developer machines.
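As a small safeguard for the environment-variable route, a CI script can reject misspelled tier names before exporting anything. The sketch below is illustrative only: the helper names is_valid_effort and set_effort are hypothetical, the four tier names are the ones listed in this article, and CLAUDE_CODE_EFFORT is the same unverified variable name used above.

```shell
# Hypothetical helpers; nothing here calls the claude CLI.

# is_valid_effort TIER: succeed only for one of the four tier names.
is_valid_effort() {
  case "$1" in
    low|medium|high|xhigh) return 0 ;;
    *) return 1 ;;
  esac
}

# set_effort TIER: export CLAUDE_CODE_EFFORT (unverified variable name)
# only when TIER is a known tier; otherwise report the problem and fail.
set_effort() {
  if is_valid_effort "$1"; then
    export CLAUDE_CODE_EFFORT="$1"
  else
    echo "ERROR: unknown effort tier '$1'" >&2
    return 1
  fi
}
```

Calling set_effort xhigh at the top of a pipeline script makes a typo such as "xhihg" fail loudly instead of silently falling back to whatever default the tool applies.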
Combining xHigh with Model Selection
xHigh produces its best results when pinned explicitly to a specific model. Relying on default model routing can result in a lower-tier model receiving the xhigh instruction, which produces different and often worse results. The model's reasoning architecture must support the extended chain-of-thought that xhigh demands; sending xhigh to a model that lacks that depth leads to longer processing times without corresponding quality gains.
Important: The model identifier used below is a placeholder. Before configuring, confirm the correct model identifier by calling the Anthropic models API (GET /v1/models) or consulting the official model list. Known Anthropic model identifiers typically follow a date-versioned format (e.g., claude-3-opus-20240229).
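One way to run that check from a terminal is sketched below. It assumes curl is installed, that ANTHROPIC_API_KEY is set, and that the response lists each model as an object with an "id" field; the function names fetch_models and model_listed are hypothetical, and the match is a plain-text grep rather than a real JSON parse.

```shell
# fetch_models: request the model list (needs network access and an API key).
fetch_models() {
  curl -sS https://api.anthropic.com/v1/models \
    -H "x-api-key: ${ANTHROPIC_API_KEY:?set ANTHROPIC_API_KEY first}" \
    -H "anthropic-version: 2023-06-01"
}

# model_listed MODEL_ID: read a models response on stdin and succeed
# if MODEL_ID appears as the value of an "id" field.
model_listed() {
  grep -Eq "\"id\"[[:space:]]*:[[:space:]]*\"$1\""
}
```

Typical use would be fetch_models | model_listed "YOUR_MODEL_ID_HERE" && echo "model available", run once before the identifier is committed to the project configuration.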
The full project configuration with model pinning:
{
  "model": "YOUR_MODEL_ID_HERE",
  "effort": "xhigh",
  "autoVerify": true,
  "verification": {
    "commands": [
      "npm run typecheck",
      "npm run test:affected",
      "eslint . --cache --cache-location .eslintcache"
    ],
    "maxRetries": 3
  }
}
For ad-hoc sessions from the command line:
claude --model YOUR_MODEL_ID_HERE --effort xhigh
For CI integration via environment variables (verify variable names against claude --help or official docs before CI use):
export CLAUDE_CODE_MODEL=YOUR_MODEL_ID_HERE
export CLAUDE_CODE_EFFORT=xhigh
export CLAUDE_CODE_AUTO_VERIFY=1
Session-Level Effort Overrides
Within an active session, developers can escalate effort for a specific complex task using the /effort xhigh slash command (confirm availability in your installed version), then drop back to high or medium for routine follow-ups. This avoids paying the xhigh token premium for every interaction while preserving access to deep reasoning when needed. Session-level overrides do not persist beyond the current session.
Implementing Auto-Verification Workflows
How the Verification Loop Works Internally
The auto-verification loop follows a fixed sequence of steps (individual outputs remain non-deterministic): code generation, lint and type-check execution, test execution against affected files, output evaluation against expected pass criteria, and finally either self-correction with retry or confirmation of the result.
You configure the maximum retry count with verification.maxRetries. When retries exhaust without all verification steps passing, Claude Code 2.1 bails out and presents the last attempt along with the failing verification output. This prevents infinite loops on fundamentally broken approaches.
Claude Code 2.1 selects which verification steps to run through heuristic detection of project tooling. It inspects package.json scripts, configuration files for common linters and type checkers, and test runner configurations (this heuristic behavior is unverified -- confirm against your installed version). When heuristics fall short, or when the project uses non-standard tooling, explicit configuration is necessary.
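The control flow described in this section can be approximated outside the tool as a short shell sketch. This is not Claude Code's implementation: run_checks and verify_with_retries are hypothetical names, the regenerate command is a stand-in for the model's self-correction step, and the sketch assumes maxRetries counts retries after one initial attempt.

```shell
# run_checks CMD...: run each command in order, stopping at the first failure.
run_checks() {
  for cmd in "$@"; do
    if ! sh -c "$cmd"; then
      echo "FAILED: $cmd" >&2
      return 1
    fi
  done
}

# verify_with_retries MAX_RETRIES REGENERATE_CMD CHECK_CMD...
# Assumed semantics: one initial attempt, then up to MAX_RETRIES retries.
verify_with_retries() {
  max_retries="$1"
  regenerate="$2"
  shift 2
  attempt=0
  while true; do
    if run_checks "$@"; then
      echo "passed after $attempt retries"
      return 0
    fi
    if [ "$attempt" -ge "$max_retries" ]; then
      echo "giving up after $attempt retries" >&2
      return 1
    fi
    attempt=$(( attempt + 1 ))
    # Stand-in for the model producing a corrected attempt.
    sh -c "$regenerate" || return 1
  done
}
```

Because run_checks stops at the first failing command, cheap checks placed early in the list prevent slower ones from running on an attempt that is already known to be broken.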
Defining Custom Verification Commands
Specify custom verification commands in .claude/settings.json under the verification.commands array. Each entry runs in sequence, and a failure in any step triggers the retry logic:
{
  "model": "YOUR_MODEL_ID_HERE",
  "effort": "xhigh",
  "autoVerify": true,
  "verification": {
    "commands": [
      "npm run typecheck",
      "npm run test:affected",
      "eslint . --cache --cache-location .eslintcache"
    ],
    "maxRetries": 3
  }
}
Warning: Do not use eslint --fix as a verification command. The --fix flag mutates source files on disk, meaning every verification retry will silently modify your code, corrupting change history and producing unreliable test results. Use eslint . --cache --cache-location .eslintcache (read-only, exits non-zero on errors) for verification. If auto-fix is desired, run it as a separate, explicit step outside the verification loop.
Note: npm run test:affected assumes your project has a corresponding script defined in package.json (e.g., via Nx, Turborepo, or Jest with --onlyChanged). Similarly, npm run typecheck must be defined. Verify these scripts exist before configuring. If test:affected is not defined, substitute a safe default such as npm test.
Ordering matters. Type checking before test execution catches compilation errors early, avoiding wasted test runner cycles. For monorepo setups with multiple verification targets, commands can reference workspace-specific scripts or use tooling like Turborepo or Nx to scope verification to affected packages. For example, in an Nx monorepo: "npx nx affected --target=test".
Auto-Verification in Headless Mode
For batch operations and CI integration, headless mode combines xhigh effort with auto-verification and structured output:
timeout 600 claude \
  --headless \
  --effort xhigh \
  --auto-verify \
  --output-format json \
  "Refactor the auth module to use dependency injection" \
  > verification-report.json
EXIT_CODE=$?
if [[ $EXIT_CODE -eq 124 ]]; then
  echo "ERROR: claude invocation timed out after 600s" >&2
  exit 1
elif [[ $EXIT_CODE -ne 0 ]]; then
  echo "ERROR: claude exited with code ${EXIT_CODE}" >&2
  exit "${EXIT_CODE}"
fi
if [[ ! -s verification-report.json ]]; then
  echo "ERROR: verification-report.json is empty or missing" >&2
  exit 1
fi
Warning: Without explicit error handling after the claude invocation, a failed run can leave an empty or partial JSON file while the surrounding script carries on and may still exit successfully. Always check the exit code and validate that the output file is non-empty.
Safety note: Before running headless refactoring on a production codebase, ensure you are operating on a clean git branch so that all changes can be reviewed and reverted if necessary. Consider running on a small subset of files first to validate behavior.
The JSON report captures each verification step's pass/fail status, retry count, and the final generated output. This structured data integrates directly into CI pipelines.
A GitHub Actions workflow running Claude Code in headless auto-verify mode on pull request events:
name: Claude Code Auto-Verify
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  claude-verify:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install project dependencies
        run: npm ci
      - name: Install Claude Code
        run: |
          set -euo pipefail
          npm install -g @anthropic-ai/claude-code@2.1.0
          # NOTE: Verify the exact npm package name via Anthropic's official
          # install documentation or `npm view @anthropic-ai/claude-code`.
          # If the package name is incorrect, this step will fail with a 404.
      - name: Run Claude Code Verification
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          CLAUDE_CODE_MODEL: ${{ secrets.CLAUDE_CODE_MODEL }}
          CLAUDE_CODE_EFFORT: xhigh
          CLAUDE_CODE_AUTO_VERIFY: "1"
        run: |
          set -euo pipefail
          # Capture the exit code with `||` so that `set -e` does not abort
          # the step before the checks below can run.
          EXIT_CODE=0
          timeout 600 claude \
            --headless \
            --auto-verify \
            --output-format json \
            "Review and verify all changes in this PR" \
            > report.json || EXIT_CODE=$?
          if [[ $EXIT_CODE -eq 124 ]]; then
            echo "ERROR: claude timed out" >&2; exit 1
          elif [[ $EXIT_CODE -ne 0 ]]; then
            echo "ERROR: claude exited with code ${EXIT_CODE}" >&2; exit "${EXIT_CODE}"
          fi
          if [[ ! -s report.json ]]; then
            echo "ERROR: report.json is empty" >&2; exit 1
          fi
      - name: Upload Verification Report (success)
        if: success()
        uses: actions/upload-artifact@v4
        with:
          name: claude-verification-report-success
          path: report.json
      - name: Upload Verification Report (failure)
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: claude-verification-report-failure
          path: report.json
Note: The ANTHROPIC_API_KEY and CLAUDE_CODE_MODEL values are injected via GitHub Actions secrets, which masks them in logs. Never hard-code API keys or model identifiers in workflow files.
Cost Optimization Strategies for xHigh
Understanding the Token Economics
The xhigh tier carries a higher token cost multiplier compared to high due to extended chain-of-thought reasoning that consumes more input and output tokens. Anthropic has not published an exact multiplier; monitor your first few xhigh sessions via --usage to establish your own baseline. Token costs vary by model and usage; consult Anthropic's pricing page for current per-token rates.
Auto-verification retries compound this cost. Each retry cycle regenerates code and re-executes verification, so a task that takes multiple retries before passing will cost proportionally more than a single-pass completion. The --usage summary output at the end of each session (verify this flag exists via claude --help) provides a breakdown of tokens consumed and estimated cost, which should be monitored regularly during initial adoption.
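A rough way to reason about this compounding is a linear estimate: one initial attempt plus each retry, all assumed to consume about the same number of tokens. The function name and the token figures below are made up for illustration; real attempts will vary in size, so treat the result as a planning number rather than a prediction.

```shell
# estimate_total_tokens TOKENS_PER_ATTEMPT RETRIES
# Linear model: (1 initial attempt + RETRIES) * TOKENS_PER_ATTEMPT.
estimate_total_tokens() {
  echo $(( $1 * (1 + $2) ))
}

# Illustrative numbers only: 40,000 tokens per attempt.
estimate_total_tokens 40000 0   # prints 40000
estimate_total_tokens 40000 3   # prints 160000
```

Under this model, a task that uses all three retries costs four times what a first-pass success costs, which is why the retry cap matters as much as the effort tier.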
Selective Escalation Patterns
The most cost-effective approach uses high as the default effort tier and reserves xhigh for specific task types where deeper reasoning demonstrably improves outcomes: complex refactors, architectural migrations, multi-file dependency resolution, and tasks where inter-module reasoning is critical.
Prompt structure also influences effective reasoning depth. Well-structured prompts that clearly define scope, constraints, and expected outputs help Claude Code pick the right reasoning depth even at the high tier. Set verification.maxRetries to a cap of 2 to 3 to prevent runaway loops from consuming budget on fundamentally misguided approaches.
Budget Guards and Spend Alerts
Cost guardrails can be set directly in Claude Code 2.1's configuration:
{
  "model": "YOUR_MODEL_ID_HERE",
  "effort": "xhigh",
  "autoVerify": true,
  "maxCostPerSession": 25.00,
  "maxCostPerTask": 8.00,
  "verification": {
    "maxRetries": 3
  }
}
Warning: The maxCostPerSession and maxCostPerTask keys are unverified. Confirm these keys are implemented in your installed version before relying on them for cost protection. To test, set a threshold below the cost of a known task and observe whether execution halts. Use Anthropic dashboard organization-level limits as your primary cost guardrail regardless.
API-level budget controls are also available through the Anthropic dashboard for organization-wide limits. Teams should set per-developer or per-project caps at the API level.
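Where the built-in keys turn out not to exist, a wrapper script can enforce the same two thresholds itself. The sketch below shows only the comparison logic: budget_ok is a hypothetical helper, amounts are integer cents to avoid floating-point arithmetic in the shell, and obtaining the actual cost figures (for example from a usage summary) is left as an assumption.

```shell
# budget_ok TASK_COST SESSION_TOTAL TASK_CAP SESSION_CAP   (all in cents)
# Succeeds only when the task cost is within the per-task cap and the
# session total, including this task, is within the per-session cap.
budget_ok() {
  if [ "$1" -gt "$3" ]; then
    echo "task cost ${1}c exceeds per-task cap ${3}c" >&2
    return 1
  fi
  if [ "$2" -gt "$4" ]; then
    echo "session total ${2}c exceeds per-session cap ${4}c" >&2
    return 1
  fi
}
```

With the example caps from the configuration above, budget_ok 650 1900 800 2500 succeeds, while a 900-cent task or a 2600-cent session total causes it to fail and lets the caller stop issuing further work.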
Production Patterns and Real-World Workflows
Pattern 1: Large-Scale Refactoring with xHigh + Auto-Verify
Consider migrating a 200-file Express.js codebase from CommonJS to ESM. This task involves rewriting require and module.exports statements, updating file extensions, resolving inter-file import paths, handling circular dependencies, and ensuring all existing tests continue to pass.
The prompt structure for this type of multi-file refactoring task must be explicit about scope and constraints:
Migrate the entire src/ directory from CommonJS to ESM modules.
Requirements:
- Convert all require() calls to import statements
- Convert all module.exports to named or default exports
- Update relative import paths to include .js extensions
- Resolve circular dependencies by restructuring where necessary
- Ensure all files in test/ pass after migration
- Do not modify any external dependency imports
- Process files in dependency order, starting from leaf modules
Verification criteria: npm run typecheck && npm run test must both pass.
xHigh's multi-pass planning handles inter-file dependency resolution that lower tiers miss. The auto-verification loop catches broken imports, circular dependency regressions, and test failures, retrying with corrected approaches before presenting the final result.
Pattern 2: Test Generation and Validation Loop
Generating integration tests for an untested API layer benefits from xhigh reasoning. The extended chain-of-thought allows the model to reason about edge cases, failure modes, authentication boundaries, and error response shapes. For instance, high often generates happy-path tests but omits auth-boundary edge cases that xhigh's multi-pass planning catches.
Generate integration tests for all endpoints in src/routes/api/v2/.
Requirements:
- Cover happy path, authentication failure, validation errors, and 404 cases
- Use the existing test setup in test/helpers/setup.ts
- Mock external service calls using the patterns in test/mocks/
- Each test file should be independently runnable
- Target >90% branch coverage for each route handler
Verification: npm run test:integration must pass with all new tests included.
Auto-verification ensures that generated tests actually compile and pass before presenting results. Without it, test generation at any effort tier frequently produces tests that reference nonexistent fixtures or use incorrect assertion patterns.
Pattern 3: Database Migration Script Generation
Generating Prisma migration scripts with rollback verification is another strong xhigh use case. Custom verification commands can include prisma validate or prisma migrate diff to validate migration logic without touching the database. xHigh's advantage here is reasoning about data integrity constraints, foreign key relationships, and the ordering of migration steps to avoid constraint violations during execution.
Anti-Patterns to Avoid
Running xhigh on trivial tasks wastes budget without improving output quality: boilerplate generation, simple CRUD endpoints, and configuration file edits typically come out the same at medium. Disabling auto-verification to "save time" on complex tasks defeats the purpose of the self-correcting loop and tends to yield code that compiles but fails in production. Overly broad prompts ("refactor this whole project to be better") invite xhigh to over-reason and expand scope, generating changes to files that should not be touched.
Implementation Checklist and Quick Reference
Prerequisites
- Ensure Node.js is installed (check minimum version requirements in Anthropic's install docs).
- All shell examples assume a POSIX-compatible shell (Linux/macOS). Windows users must adapt export to $env:VAR = "value" (PowerShell) or set VAR=value (cmd).
- Package name: Verify the correct npm package name via Anthropic's official installation documentation before running npm install.
- Confirm your Anthropic plan includes access to the target model by calling GET /v1/models with your API key and checking that the model identifier appears in the response.
- Project tooling: Ensure npm run typecheck, npm run test:affected, and any other scripts referenced in verification.commands are defined in your package.json.
Pre-Flight Checklist
- Confirm Claude Code version: claude --version should report 2.1.0 or later
- Verify model access by calling GET /v1/models with your API key and confirming the target model identifier appears in the response
- Create project-level .claude/settings.json in the repository root with verification commands matching the project's actual toolchain
- Configure cost guardrails before starting xhigh workflows (and verify they function -- see the warning under Budget Guards and Spend Alerts)
Configuration Quick Reference Table
Note: All configuration keys, defaults, and CLI flags below are unverified. Confirm against claude --help and official documentation.
| Setting | Location | Default | Recommended for xHigh |
|---|---|---|---|
| effort | settings.json / CLI / env | medium (unverified) | xhigh |
| model | settings.json / CLI / env | auto-routed (unverified) | Pin to a specific model ID |
| autoVerify | settings.json / CLI | false (unverified) | true |
| verification.commands | settings.json | auto-detected (unverified) | explicit array |
| verification.maxRetries | settings.json | unknown -- verify against installed version | 3 |
| maxCostPerSession | settings.json | none (unverified) | 25.00 (verify key is functional) |
| maxCostPerTask | settings.json | none (unverified) | 8.00 (verify key is functional) |
| headless | CLI flag | false (unverified) | true (CI only) |
Decision Matrix: When to Use Each Effort Tier
| Task Type | Recommended Effort | Auto-Verify? | Notes |
|---|---|---|---|
| Simple edits / typos | low | No | Minimal reasoning needed |
| Feature implementation | medium or high | Optional | Standard development tasks |
| Multi-file refactoring | xhigh | Yes | Benefits from extended chain-of-thought |
| Codebase migration | xhigh | Yes | Inter-file dependency resolution critical |
| Architecture planning | xhigh | No | Reasoning-heavy but output is prose/design, not executable code -- auto-verify has no meaningful verification target |
Note: Cost multipliers vary by model and usage. Consult Anthropic's pricing page for current per-token rates rather than relying on fixed multiplier estimates.
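For scripted workflows, the matrix above can be encoded as a small lookup so the tier is chosen per task rather than fixed globally. This is a sketch only: recommend_effort and its task-type labels are hypothetical shorthand for the table rows, and "feature" maps to high because the article recommends high as the everyday default.

```shell
# recommend_effort TASK_TYPE
# Prints "<tier> <auto-verify>" for a task type from the decision matrix.
recommend_effort() {
  case "$1" in
    simple-edit)  echo "low no" ;;
    feature)      echo "high optional" ;;  # the matrix allows medium or high
    refactor)     echo "xhigh yes" ;;
    migration)    echo "xhigh yes" ;;
    architecture) echo "xhigh no" ;;
    *)
      echo "unknown task type: $1" >&2
      return 1
      ;;
  esac
}
```

A caller can split the two fields with set -- $(recommend_effort migration) and then pass "$1" as the effort tier, keeping the escalation policy in one reviewable place.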
Complete Reference Configuration
The following .claude/settings.json incorporates all settings discussed throughout this article and serves as a copy-paste starting point:
{
  "model": "YOUR_MODEL_ID_HERE",
  "effort": "xhigh",
  "autoVerify": true,
  "maxCostPerSession": 25.00,
  "maxCostPerTask": 8.00,
  "verification": {
    "commands": [
      "npm run typecheck",
      "npm run test:affected",
      "eslint . --cache --cache-location .eslintcache"
    ],
    "maxRetries": 3
  }
}
Before using this configuration:
1. Replace YOUR_MODEL_ID_HERE with the correct model identifier from Anthropic's models API.
2. Verify that maxCostPerSession and maxCostPerTask are functional in your installed version.
3. Confirm all verification.commands scripts exist in your package.json.
4. Do not use eslint --fix in the verification commands array -- use eslint . --cache --cache-location .eslintcache for read-only checking.
Substitute the appropriate commands for projects using different test runners, linters, or type checkers. Calibrate cost thresholds based on observed usage during the first week of xhigh adoption.
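The check that every npm script referenced by the verification commands actually exists can be partly automated. The sketch below uses hypothetical helper names and a plain-text match against package.json rather than a JSON parse, so it is a first-pass check: it can be fooled by a matching key outside the scripts section and should not replace reading the file.

```shell
# npm_script_name COMMAND
# Prints the script name when COMMAND looks like "npm run NAME [args]";
# prints nothing for any other command.
npm_script_name() {
  case "$1" in
    "npm run "*)
      rest="${1#npm run }"    # strip the "npm run " prefix
      echo "${rest%% *}"      # drop any trailing arguments
      ;;
  esac
}

# missing_npm_scripts PACKAGE_JSON COMMAND...
# Prints each referenced script name with no "NAME": entry in PACKAGE_JSON.
missing_npm_scripts() {
  package_json="$1"
  shift
  for cmd in "$@"; do
    name="$(npm_script_name "$cmd")"
    if [ -z "$name" ]; then
      continue
    fi
    if ! grep -Eq "\"$name\"[[:space:]]*:" "$package_json"; then
      echo "$name"
    fi
  done
}
```

Running missing_npm_scripts package.json "npm run typecheck" "npm run test:affected" prints nothing when both scripts are defined, and otherwise prints the names that still need to be added or replaced.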

