Resuming failed runs
Re-spawn a failed pipeline run, preserving terminal step state, with --resume <id> or --resume last-failed.
When a pipeline run fails — a flaky test, a syntax error in the implement output, a transient API hiccup — you don't have to re-run the whole pipeline. contractor run --resume re-spawns the run and skips every step that already finished, so the agent picks up where the failure happened.
This guide covers the two --resume forms, what state carries forward, what doesn't, and when to start fresh instead.
The two forms
# Resume a specific pipeline run by id.
contractor run --resume run_abc123
# Resume the most recent failed run for the current repo.
contractor run --resume last-failedBoth forms spawn a new pipeline-shim process whose initial step state is hydrated from the parent (the failed run). The new run has its own pipeline_run_id — resume creates a fresh row, it does not mutate the failed one.
last-failed resolves to the most recent failed pipeline run for the current repo (getLatestFailedPipelineRun). If there is no failed run, the command exits 1 with No failed pipeline runs found for this repo.. If you have multiple failed runs and want a specific one, pass the explicit id.
What gets carried forward
The hydrator (hydratePipelineStateFromParent) walks the parent's child runs and normalizes each step's status:
| Parent step status | Carried forward as | Re-runs? |
|---|---|---|
completed | completed | No |
skipped | skipped | No |
cancelled | cancelled | No |
failed | pending | Yes |
running (orphaned) | pending | Yes |
pending | pending | Yes |
Once hydration is done, findResumeIndex picks the first non-terminal step and the new shim picks up there. In practice that means the failed step re-runs against the worktree's current state, with every step before it preserved.
Two other things carry forward:
- Preapproved gates. If you preapproved a gate in the failed run (via
--gate-afteror the dashboard's gate-preapproval UI), the preapproval list is copied to the new run. You don't have to re-approve. - Step ids, not positions. Hydration matches by step id, not by index. Even if the pipeline shape changed between runs (a re-expanded
task-group-implementphase drops completed groups, for example), the step status map stays correct.
What does NOT carry forward
- The agent session. Each agent step in the resumed run gets a fresh agent session. Contractor does not chain
claude --resume <session-id>across runs. If you want to continue a specific agent session manually, use theclaude --resume <session-id>hint contractor prints on agent failure. - The worktree state at failure time. Resume does not snapshot the worktree. If you edited files between failure and resume, the new run sees your current edits. That is usually what you want — fixing the failure is the point — but it means the resumed step starts against your edits, not the state that existed at failure.
- Per-run flags. Flags from the parent run are not inherited. If the parent ran with
--model opus[1m], the resumed run uses the default unless you re-pass--model opus[1m]. The same applies to--effort,--backend,--pipeline, and--gate-after.
Constraints
--resume only accepts a failed parent. The CLI rejects every other terminal status:
completed— the run already finished. There is nothing to resume.cancelled— the run was killed (SIGINT / dashboard cancel). Cancellation is treated as deliberate; if you want to retry, start a fresh run.running— the parent is still in flight. Wait for it, or cancel it from the dashboard.
If --resume run_xyz rejects with Pipeline run "run_xyz" has status "<status>". Only failed runs can be resumed., the parent is not in a state that can be resumed.
The resume hydrator also refuses to proceed if the parent's repo or worktree directory no longer exists — for example, the worktree was removed manually or the close phase ran far enough to delete it. In that case, start fresh.
Example: resume an implement-then-review pipeline
A common scenario: the implement phase succeeds and produces a clean commit, but the review phase fails on a flake or a transient model error.
$ contractor run
... [implement step succeeds, commits work] ...
... [review step fails on transient error] ...
[error] Pipeline failed at step "review" (run_abc123)
$ contractor run --resume last-failed
Resuming from last failed run: run_abc123
Started pipeline-shim pid=12345 runId=run_def456
... [implement step skipped: completed] ...
... [review step re-runs] ...
... [gate step pauses for approval] ...
The implement commit is not re-attempted — the agent's commit is on the worktree branch, hydrated completed status keeps it from re-running, and the review picks up against the post-implement state. The new run id (run_def456) is independent of the parent (run_abc123); the parent stays around for inspection.
When to start fresh instead
Resume is the right tool when the failure is environmental or transient, or when the failed step is doing the right thing and just needs another chance after a code edit. Start fresh (a normal contractor run) when:
- You suspect the worktree state is wrong. A bad commit landed, a partial implement got committed, the working tree is in a state you don't trust. Reset the branch and start a new run rather than letting the agent pick up against a broken base.
- You changed the pipeline shape and want the new shape from step zero. Resume preserves the old run's step ids; new shell or agent steps that didn't exist in the parent will run, but if you want to re-run something the parent marked
completed(because you deleted its commit, say), the cleanest path is a fresh run. - You changed
--model,--effort, or--pipeline. Resume does not inherit parent flags. A fresh run with new flags is clearer than--resumewith re-applied flags. - The parent was
cancelled. Cancellation is deliberate; the--resumepath is failure-only.
For a single agent session that crashed mid-step, the claude --resume <session-id> hint contractor prints on agent failure remains available as a manual alternative — that's a session-level resume, not a pipeline-level one.
See also
- Pipelines §
--resume— the same mechanics from the concepts side. - CLI reference §
contractor run— every flag the run command accepts. - Customizing pipelines — define the pipelines you'll be resuming.