Agentic AI SDLC Methodology¶
- Standard ID: AGENTIC-AI-SDLC
- Version: 1.0
- Effective: 2026-04-03
- Owner: CDO
1. Vision¶
An AI-first software development model in which Claude Code is the primary development tool. Engineers direct AI agents that write, test, review, and document code. Human judgement governs architecture, security, and business decisions.
2. Core Principles¶
- AI generates, humans govern. AI handles implementation; humans own decisions.
- Standards enable autonomy. Well-defined standards let agents work with minimal supervision.
- Everything is traceable. All AI contributions are attributed, reviewed, and auditable.
- Quality is non-negotiable. AI-generated code meets the same gates as human-written code.
3. Toolchain¶
| Tool | Purpose |
|---|---|
| Claude Code | Primary development agent — code generation, review, refactoring, testing, documentation |
| Beads (bd CLI) | Issue and task tracking — all work items, dependencies, status |
| Engram | Cross-session persistent memory — decisions, context, preferences |
| Bitbucket | Source control, pull requests, CI/CD pipelines |
4. Agent Workflow Patterns¶
Single Agent¶
For focused, bounded tasks.
- Bug fixes with clear reproduction steps.
- Documentation updates.
- Lint/format fixes.
- Simple feature additions with existing patterns.
Parallel Agents¶
For independent work streams that do not conflict.
- Independent feature branches across different services.
- Writing test suites for separate modules.
- Multi-file refactoring with no shared dependencies.
- Generating SDK code for multiple languages.
Orchestrated Agents¶
For complex work with dependencies between tasks.
- Multi-service features requiring coordinated API changes.
- Phased migrations (schema + application + data).
- Cross-cutting concerns (observability, auth changes).
Orchestration rules:
- Define a dependency graph in Beads before starting.
- Each agent claims and works one issue at a time.
- Agents check dependencies are closed before starting dependent work.
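The dependency rule — no agent starts an issue until everything it depends on is closed — can be sketched as a small readiness check run before claiming work. This is a minimal illustration only; the issue dictionary and helper names are hypothetical, not the actual Beads data model or CLI.

```python
# Hypothetical in-memory model of a Beads dependency graph; field names
# ("status", "deps") are illustrative, not the real Beads schema.

def deps_closed(issue_id, issues):
    """Return True if every dependency of issue_id is already closed."""
    return all(issues[dep]["status"] == "closed"
               for dep in issues[issue_id]["deps"])

def next_ready(issues):
    """List open issues that the dependency graph allows an agent to claim."""
    return [iid for iid, issue in issues.items()
            if issue["status"] == "open" and deps_closed(iid, issues)]

# Example phased migration: schema -> application -> data backfill.
issues = {
    "bd-1": {"status": "closed", "deps": []},        # schema migration (done)
    "bd-2": {"status": "open",   "deps": ["bd-1"]},  # application change
    "bd-3": {"status": "open",   "deps": ["bd-2"]},  # data backfill (blocked)
}

print(next_ready(issues))  # → ['bd-2']
```

Only `bd-2` is claimable: its sole dependency is closed, while `bd-3` remains blocked behind it.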
5. Human-in-the-Loop Gates¶
The following require mandatory human review and approval:
| Gate | Approver |
|---|---|
| Architecture Decision Records (ADRs) | CDO + Tech Lead |
| Security-sensitive changes (auth, encryption, PII handling) | CDO + Security Lead |
| Compliance and regulatory changes | CDO + Compliance |
| Production deployments | CDO or delegated Release Manager |
| Merchant-facing API changes (breaking or additive) | CDO + Product Lead |
| Database schema changes to shared databases | CDO |
| Third-party vendor integrations | CDO |
6. AI Autonomy Zones¶
AI agents may proceed without human approval for:
- Code generation following established patterns.
- Writing and updating unit/integration tests.
- Documentation generation and updates.
- First-pass code review (before human review).
- Refactoring within a single service boundary.
- Dependency version updates (non-major).
- Lint and formatting fixes.
- Generating OpenAPI spec updates from code.
7. Quality Gates¶
All AI-generated code must pass before merge:
- Compilation/build — no errors.
- Linting — zero warnings (golangci-lint for Go).
- Unit tests — 100% pass, coverage not decreased.
- Integration tests — all pass.
- Security scan — no high/critical findings.
- OpenAPI spec validation — if API changes.
- AI review — Claude Code review completed.
- Human review — at least 1 human reviewer approved.
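The automated gates above map naturally onto the Bitbucket CI pipeline. A minimal sketch of a `bitbucket-pipelines.yml` for a Go service follows; the step names, images, and the choice of `govulncheck` as the security scanner are illustrative assumptions, not mandated by this standard, and tool installation steps are omitted for brevity.

```yaml
# bitbucket-pipelines.yml — illustrative sketch, not a prescribed configuration
pipelines:
  pull-requests:
    '**':
      - step:
          name: Build and lint
          image: golang:1.22
          script:
            - go build ./...        # compilation gate — no errors
            - golangci-lint run     # lint gate — zero warnings
      - step:
          name: Tests
          image: golang:1.22
          script:
            - go test -cover ./...  # unit tests — 100% pass, coverage tracked
      - step:
          name: Security scan
          image: golang:1.22
          script:
            - govulncheck ./...     # example scanner — fail on high/critical
```

The AI and human review gates remain pull-request checks rather than pipeline steps.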
8. Code Review Protocol¶
- AI reviews first: Claude Code analyses the diff for correctness, style, security, and test coverage.
- Human reviews second: Focus on business logic, architectural fit, security implications, and edge cases the AI may miss.
- AI addresses feedback: Claude Code implements review comments, human verifies.
9. Session Management¶
- Every work session begins by checking `bd ready` for available work.
- Agents claim issues with `bd update <id> --claim` before starting.
- Context persists across sessions via Engram — decisions, blockers, and progress.
- Every session ends with: issues updated, code committed, pushed to remote.
10. Metrics¶
Track these to measure AI-augmented development effectiveness:
| Metric | Target |
|---|---|
| Time to merge (PR open to merged) | < 4 hours |
| Defect escape rate | < 2% of merged PRs |
| AI contribution ratio | > 70% of code by volume |
| First-pass review issues | < 3 per PR |
| Session completion rate | 100% (all sessions end with push) |
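As a sketch of how the first two metrics could be computed from merged-PR records, assuming a hypothetical record shape (the field names `opened`, `merged`, `ai_lines`, and `total_lines` are illustrative, not a real Bitbucket API response):

```python
from datetime import datetime

# Hypothetical merged-PR records; field names are illustrative only.
prs = [
    {"opened": "2026-04-03T09:00", "merged": "2026-04-03T11:30",
     "ai_lines": 400, "total_lines": 500},
    {"opened": "2026-04-03T10:00", "merged": "2026-04-03T15:00",
     "ai_lines": 90, "total_lines": 100},
]

FMT = "%Y-%m-%dT%H:%M"

def hours_to_merge(pr):
    """Elapsed hours from PR open to merge."""
    delta = datetime.strptime(pr["merged"], FMT) - datetime.strptime(pr["opened"], FMT)
    return delta.total_seconds() / 3600

# Time to merge: mean hours across merged PRs (target < 4 h).
avg_time_to_merge = sum(hours_to_merge(pr) for pr in prs) / len(prs)

# AI contribution ratio: AI-authored lines over total lines (target > 70%).
ai_ratio = sum(pr["ai_lines"] for pr in prs) / sum(pr["total_lines"] for pr in prs)

print(round(avg_time_to_merge, 2), round(ai_ratio, 2))  # → 3.75 0.82
```

In this toy sample both targets are met: 3.75 h average time to merge and an 82% AI contribution ratio.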
11. Risks and Mitigations¶
| Risk | Mitigation |
|---|---|
| AI hallucination (incorrect code) | Comprehensive test suites verify all generated code |
| Security vulnerabilities | Mandatory human review gate for security-sensitive changes |
| Inconsistent patterns | Standards documents + linting enforce consistency |
| Context loss between sessions | Engram persists context; Beads tracks issue state |
| Over-reliance on AI | Engineers must understand all code they approve |
| Prompt injection via dependencies | Pin dependencies, audit updates, scan for malicious code |