9  Durable Execution Patterns

Tactus workflows are meant to run for a long time: with retries, pauses, and human interactions. These patterns keep them reliable.

9.1 The Bounded Agent Loop

The “Nutshell default” for any agent is: loop until a completion tool fires, but cap iterations.

local done = require("tactus.tools.done")

repeat
  worker()
until done.called() or Iterations.exceeded(20)

9.2 Checkpointing Arbitrary Work

Agent turns and tool calls are checkpointed, but you can also checkpoint your own deterministic work:

local value = Step.checkpoint(function()
  return expensive_operation()
end)

Use this for:

  • file enumeration + reads
  • expensive preprocessing
  • external calls you wrap as deterministic tools

9.3 Retry and Backoff

local result = Retry.with_backoff(function()
  return flaky_call()
end, {retries = 3, base_delay = 0.5})

9.4 Stop Requests

if Stop.requested() then
  Log.warn("Stopping", {reason = Stop.reason()})
  return {result = "stopped", success = false}
end

9.5 The “Summarize Tool Results” Turn

When an agent uses tools heavily, add a tool-free summarization turn to keep context small:

worker()
if search.called() then
  worker({message = "Summarize tool results.", tools = {}})
end

9.6 Idempotency Rule

If a line of code might run again after a resume, it must be safe to run twice.

  • Put nondeterminism behind tools.
  • Cache results with Step.checkpoint(...).
  • Write outputs in a “commit” step after validation/review.