Concepts
The three loops and the artifact graph that drive every contractor change.
contractor's model has two moving parts: an artifact graph that structures each change, and three nested loops that actually move work through it.
The artifact graph
Every change — a blueprint — progresses through four artifacts in dependency order:
proposal ──┬── requirements ──┬── tasks
           └── design ────────┘
- Proposal — Why the change exists and what it does, in prose.
- Requirements — Behavioral specs, organized by capability. Uses RFC 2119 keywords (SHALL / MUST / MAY) and Given-When-Then scenarios so the acceptance criteria are explicit.
- Design — Technical decisions and architecture: what you'll build, what you won't, and the trade-offs.
- Tasks — Implementation checklist, grouped into numbered sections with trackable - [ ] checkboxes.
Artifacts are blocked until their dependencies complete, ready when they can be written, and done once their files exist on disk. requirements and design are siblings — both depend on the proposal, neither depends on the other — so they can be authored in parallel. tasks waits for both.
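The three states can be derived from the dependency edges plus a file-existence check. The sketch below is illustrative only and is not contractor's actual code: the function name, the path mapping, and the choice to treat an existing requirements/ directory as "done" are all assumptions.

```python
from pathlib import Path

# Dependency edges as described above: requirements and design both
# depend on proposal; tasks depends on both of them.
DEPENDS_ON = {
    "proposal": [],
    "requirements": ["proposal"],
    "design": ["proposal"],
    "tasks": ["requirements", "design"],
}

# Hypothetical mapping from artifact to the path whose existence marks it
# done, relative to the blueprint directory. Treating the mere presence of
# the requirements/ directory as "done" is a simplification.
ARTIFACT_PATHS = {
    "proposal": "proposal.md",
    "requirements": "requirements",
    "design": "design.md",
    "tasks": "tasks.md",
}


def artifact_status(blueprint_dir: Path) -> dict[str, str]:
    """Return 'done', 'ready', or 'blocked' for each artifact."""
    done = {
        name
        for name, rel in ARTIFACT_PATHS.items()
        if (blueprint_dir / rel).exists()
    }
    status = {}
    for name, deps in DEPENDS_ON.items():
        if name in done:
            status[name] = "done"
        elif all(dep in done for dep in deps):
            status[name] = "ready"
        else:
            status[name] = "blocked"
    return status
```

With only proposal.md on disk, this reports requirements and design as ready at the same time, which is what allows them to be authored in parallel.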
This graph lives in a schema. The shipped default is contractor-base, set in contractor/config.yaml:
schema: contractor-base
The three loops
Three loops run at different time scales, each with a distinct purpose.
Inner loop — inside a task-group session
This is where development actually happens. Inside a single task-group, the implement agent writes code, picks an appropriate test command from the repo's verify guide, runs it, observes output, fixes failures, and iterates until the tests are green — before it marks checkboxes or commits.
write → run verify → observe → fix → repeat → mark [x]
Two things matter about this loop:
- Failures come from the test runner or the compiler, not from LLM self-review. The agent iterates against real execution signal.
- Repo-specific conventions live in contractor/VERIFY.md, authored by humans and inlined directly into the implement agent's prompt on every run. That's where you document the default test command, monorepo subset variants, flaky suites, and test layout.
When the group's tests pass and every checkbox is marked, the agent commits and exits. The pipeline enforces one commit per group: exiting without a new commit is treated as failure.
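The shape of the inner loop and the commit rule can be sketched as follows. This is a minimal illustration, not the pipeline's implementation: run_verify, fix, max_attempts, and group_succeeded are hypothetical names, and the attempt cap is an assumption added so the sketch always terminates.

```python
from typing import Callable


def inner_loop(
    run_verify: Callable[[], tuple[bool, str]],
    fix: Callable[[str], None],
    max_attempts: int = 5,  # assumption: the source does not state a cap
) -> bool:
    """Iterate against real verify output until green or attempts run out."""
    for _ in range(max_attempts):
        passed, output = run_verify()  # signal comes from the test runner
        if passed:
            return True
        fix(output)  # the agent reacts to the observed failure output
    return False


def group_succeeded(head_before: str, head_after: str) -> bool:
    """Commit-per-group rule: exiting without a new commit counts as failure."""
    return head_before != head_after
```

In a real repository the two arguments to group_succeeded would be the output of git rev-parse HEAD captured before and after the session.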
Outer loop — blueprint lifecycle across sessions
This is the human-governed arc of a whole change: spec authoring, parallel review by fresh agents, and archival. Each phase is a slash command or a pipeline step.
propose → implement → review → close
   │          │         │        │
   │          │         │        └─ merge branch, archive blueprint,
   │          │         │           merge requirements into source of truth
   │          │         │
   │          │         └─ four parallel agents (reuse, quality,
   │          │            efficiency, blueprint compliance), fix issues
   │          │
   │          └─ task-group agents run the inner loop per group,
   │             commit per group when verify is green
   │
   └─ create artifacts: proposal → requirements + design → tasks
Gates between phases pause for user confirmation. You approve them from the CLI or the dashboard.
Middle loop — cross-session memory
Agents notice things they can't (or shouldn't) act on mid-task: pre-existing code issues outside scope, undocumented conventions, task descriptions that didn't match reality, workflow friction. These are observations, written to contractor/.observations/<change>/<phase>.json.
agent phases ─────────────────► retro
  │                               │
  │ RETRO_TAIL appended to        │ reads .observations/
  │ strategy templates;           │ presents items to user,
  │ agents write observations     │ applies code fixes and
  │ to .observations/             │ rule updates, cleans up
  │                               │
  └───────── next session ────────┘
Between sessions, /contractor:retro presents the accumulated observations and the user decides which to apply, convert into rules, or dismiss. Retro is deliberately user-driven — the agent does not self-modify the repo based on its own observations.
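To make the observation files concrete, here is a rough sketch of how one might be appended. Only the path pattern comes from the description above. The JSON layout (a list of objects with kind and note fields) and the record_observation helper are assumptions for illustration, so the real schema may differ.

```python
import json
from pathlib import Path


def record_observation(
    repo_root: Path, change: str, phase: str, kind: str, note: str
) -> Path:
    """Append one observation to contractor/.observations/<change>/<phase>.json.

    Assumption: the file holds a JSON list of {"kind", "note"} objects.
    """
    path = repo_root / "contractor" / ".observations" / change / f"{phase}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    items = json.loads(path.read_text()) if path.exists() else []
    items.append({"kind": kind, "note": note})
    path.write_text(json.dumps(items, indent=2))
    return path
```

Note that the sketch only records. Deciding what to do with each item stays with the user during retro.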
How pipelines advance work
A pipeline chains phases into an automated sequence. The default pipeline is implement → review → close. You run one with contractor run or from the TUI dashboard.
Inside the implement phase, the runner inspects tasks.md and expands it into one sub-step per incomplete task-group. Each sub-step is a fresh agent invocation scoped to that single group. The pipeline enforces commit-per-group so each group is a reviewable unit in git history.
Review runs four review agents in parallel — reuse, quality, efficiency, and blueprint compliance — against the full branch diff. Their findings surface as a single consolidated report for you to act on.
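The fan-out and consolidation pattern looks roughly like this. The sketch uses threads and a caller-supplied review function as stand-ins. How contractor actually launches its review agents is not described here, and run_review is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# The four review perspectives named above.
REVIEWERS = ["reuse", "quality", "efficiency", "blueprint compliance"]


def run_review(
    diff: str, review: Callable[[str, str], list[str]]
) -> dict[str, list[str]]:
    """Run every reviewer against the same diff in parallel and merge findings."""
    with ThreadPoolExecutor(max_workers=len(REVIEWERS)) as pool:
        futures = {name: pool.submit(review, name, diff) for name in REVIEWERS}
        # Collect results in a stable order so the report is deterministic.
        return {name: future.result() for name, future in futures.items()}
```

Every reviewer sees the same full branch diff, so the findings can be merged into one report without reconciling different inputs.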
Close merges the branch back to main, archives the blueprint into contractor/archive/, and folds its requirements/<capability>/spec.md deltas into the repo-wide source of truth at contractor/requirements/.
Directory layout
After contractor repo install:
contractor/
├── config.yaml         # Project configuration (schema, context, rules)
├── requirements/       # Accumulated requirements (source of truth)
│   └── <capability>/spec.md
├── blueprints/         # Active blueprints
│   └── <name>/
│       ├── proposal.md
│       ├── requirements/<capability>/spec.md
│       ├── design.md
│       └── tasks.md
├── .worktrees/         # Git worktrees (gitignored)
├── .observations/      # Agent observations (gitignored)
└── archive/            # Completed blueprints
Run state (runs, events, gates, known repos) lives in a single global SQLite database at ~/.contractor/contractor.db, shared across every repo on the machine. Runtime logs and pipeline step state live under ~/.contractor/projects/<encoded-repo>/{logs,runs}/, keyed by the main repo root so every worktree collapses onto the same directory.