Module 06: Building a ReAct Agent

Combine reasoning, tools, state, and loops into a full Reason-Act-Observe agent.

Learning Objectives

Understand the ReAct (Reason-Act-Observe — a pattern where the AI reasons, takes an action, observes the result, and repeats) loop
Build an agent with explicit tool dispatch (routing the AI’s chosen action to the right tool machine) using match/case_of (pattern matching — routing execution based on a value)
Track agent state (session-scoped data) across iterations
Control loop termination

Complexity Ladder: Level 5 (Agentic) — the full ReAct pattern with tools, state, and iteration.

The Concept: The Reason-Act-Observe Loop

In Module 03, you learned that an ask step with tools IS an agent loop. That’s the simple version: Mashin handles everything inside one step. But for production agents, you often want more control:

Explicit state tracking between iterations
Custom tool dispatch logic
Observation processing before the next reasoning round
Iteration limits and loop detection

Think of a ReAct agent like a detective solving a case. The detective reasons about what they know, decides to investigate a lead (act), reviews what they found (observe), then reasons again about the next step. This continues until they have enough evidence to present their conclusion.

The ReAct pattern (Reason-Act-Observe) makes this explicit:

Reason — The LLM analyzes the situation and decides what to do
Act — Execute the chosen tool
Observe — Process the result and update state
Loop — Go back to Reason, or respond if done
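
The four steps above can be sketched in a few lines of plain Python. This is Python rather than Mashin, and it only illustrates the control flow: `reason` is a hypothetical stand-in for the LLM, and `TOOLS` is a hypothetical tool table with a single made-up tool.

```python
from typing import Callable

# Hypothetical tool table: the name and behavior are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "lookup": lambda arg: f"result for {arg}",
}

def reason(question: str, history: list[dict]) -> dict:
    """Stand-in for the LLM: use the tool once, then respond."""
    if not history:
        return {"action": "lookup", "input": question}
    return {"action": "respond", "input": history[-1]["observation"]}

def react(question: str, max_iterations: int = 8) -> str:
    history: list[dict] = []
    for _ in range(max_iterations):
        decision = reason(question, history)                    # Reason
        if decision["action"] == "respond":
            return decision["input"]                            # Done: answer
        result = TOOLS[decision["action"]](decision["input"])   # Act
        history.append(                                         # Observe
            {"action": decision["action"], "observation": result}
        )
    return "iteration limit reached"                            # Safety exit

answer = react("where is main?")
```

The rest of this module builds the same shape in Mashin, with each commented phase becoming its own step.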

This is what MashinClaw uses internally — Mashin’s built-in AI coding assistant, itself a machine written in Mashin (see examples/mashinclaw/agent.mashin).

Here is how all the phases connect inside the iterate flow:

  flow :iterate
┌─────────────────────────────────────────────────┐
│                                                 │
│  Phase 1         Phase 2         Phase 3        │
│ ┌──────────┐   ┌──────────┐   ┌──────────┐      │
│ │ CHECK    │──►│ REASON   │──►│ CAPTURE  │      │
│ │ LIMIT    │   │          │   │          │      │
│ │          │   │ LLM sees │   │ Record   │      │
│ │ iteration│   │ history  │   │ action & │      │
│ │ exceeded?│   │ + tools  │   │ thought  │      │
│ └──────────┘   │ decides  │   │ in state │      │
│                │ next act │   └─────┬────┘      │
│                └──────────┘         │           │
│                                     ▼           │
│  Phase 5         Phase 4    ┌──────────────┐    │
│ ┌──────────┐   ┌──────────┐ │   DISPATCH   │    │
│ │ OBSERVE  │◄──│ TOOL     │◄┤   match/     │    │
│ │          │   │ EXECUTE  │ │   case_of    │    │
│ │ Process  │   │          │ │              │    │
│ │ result   │   │ ask from │ │ "glob" →     │    │
│ │ update   │   │ stdlib   │ │ "grep" →     │    │
│ │ history  │   │ machine  │ │ "read_file" →│    │
│ └────┬─────┘   └──────────┘ │ "respond" →  │    │
│      │                      └──────────────┘    │
│      │                                          │
│      └──► goto flow(:iterate)                   │
│           (loop back for another round)         │
│                                                 │
│  OR: if action == "respond" → return to :main   │
└─────────────────────────────────────────────────┘

The loop continues until the LLM chooses “respond” (meaning it has enough information) or the iteration limit is hit.

Start With Koda

Ask Koda:

“Build an agent that can read files and answer questions about a codebase. It should be able to search for files with glob, search contents with grep, and read files. Give it a maximum iteration limit.”

Compare Koda’s output with the file explorer below. Check that it has: state for iteration tracking, an ask step with a structured output schema (a returns block), match/case_of for tool dispatch, and goto (a tail-call: a jump that replaces the current flow) for looping.

The Agent Architecture

Here’s how a ReAct agent maps to Mashin constructs:

flow :main
  ├── Load context (memory, conversation)
  ├── run flow(:iterate)        ← Enter the ReAct loop (call-and-return)
  └── Post-loop cleanup (store memory, format output)

flow :iterate
  ├── Check iteration limit      ← Safety: prevent infinite loops
  ├── Reason (LLM decides)       ← ask step with a returns schema
  ├── Capture reasoning          ← Update state with action + thought
  ├── Tool dispatch              ← match/case_of routes to tool
  ├── Observe result             ← Process tool output, update history
  └── goto flow(:iterate)       ← Loop back (tail-call — jump, don't nest)
      OR return (if action == "respond")

Building It: File Explorer Agent

Let’s build an agent that can read files and answer questions about a codebase.

machine file_explorer "File Explorer Agent"

accepts
  question as string, is required    // The user's question
  max_iterations as integer, default: 8    // Safety limit on loop iterations

responds with
  answer as string                   // Pull final answer from state
  files_read as list                 // List of files the agent read
  iterations as integer              // How many loops it took

implements
  state
    iteration: integer, default: 0               // Current loop count
    action_history: list, default: []            // Record of what the agent has done so far
    files_read: list, default: []                // Files accessed during exploration
    final_response: string, default: ""          // The agent's eventual answer

  flow main
    run flow(iterate)                            // Enter the ReAct loop (call-and-return)

  flow iterate
    // Phase 1: Check iteration limit
    compute check_limit                          // Pure computation, no I/O
      """
      iteration = state(:iteration) + 1
      exceeded = iteration > input(:max_iterations)

      if exceeded do
        %{
          state: %{
            iteration: iteration,
            final_response: "I reached the iteration limit. Based on what I found: " <>
              inspect(state(:action_history))
          },
          exceeded: true
        }
      else
        %{state: %{iteration: iteration}, exceeded: false}
      end
      """

    // Phase 2: Reason about what to do next
    ask reason, using: "anthropic:claude-sonnet-4"    // LLM inference step
      condition not step(check_limit, :exceeded)
      with role """
        You are a code exploration agent. You can read files, search for files,
        and search file contents to answer questions about a codebase.

        Available actions:
        - glob: Find files matching a pattern (args: pattern)
        - grep: Search file contents for a pattern (args: pattern, path)
        - read_file: Read a file's contents (args: path)
        - respond: Give your final answer (args: response)

        Be systematic: first find relevant files, then read them, then answer.
        """
      with task """
        Question: ${input.question}

        Files read so far: ${state.files_read}
        Previous actions: ${state.action_history}
        Iteration: ${state.iteration} of ${input.max_iterations}

        What should you do next?
        """
      returns
        action as string, is required, choices: ["glob", "grep", "read_file", "respond"]
        action_input as map, default: %{}
        thought as string, is required
        response as string

    // Phase 3: Capture reasoning into state
    compute capture
      condition not step(check_limit, :exceeded)
      """
      action = step(:reason, :action)
      thought = step(:reason, :thought)
      action_input = step(:reason, :action_input) || %{}

      entry = %{
        "iteration" => state(:iteration),
        "action" => action,
        "thought" => thought,
        "input" => action_input
      }

      history = state(:action_history) ++ [entry]
      state_update = %{action_history: history}

      state_update = if action == "respond" do
        Map.put(state_update, :final_response, step(:reason, :response) || "")
      else
        state_update
      end

      %{state: state_update, action: action, action_input: action_input}
      """

    // Phase 4: Tool dispatch
    match step(capture, :action)
      case_of "glob"
        ask tool_exec, from: "@mashin/actions/tools/glob"
          pattern: step(capture, :action_input)[:pattern]

      case_of "grep"
        ask tool_exec, from: "@mashin/actions/tools/grep"
          pattern: step(capture, :action_input)[:pattern]
          path: step(capture, :action_input)[:path]

      case_of "read_file"
        ask tool_exec, from: "@mashin/actions/tools/read_file"
          path: step(capture, :action_input)[:path]

      otherwise
        compute noop                             // No tool to run (e.g. "respond")
          """
          %{skipped: true}
          """

    // Phase 5: Observe and loop back
    compute observe
      condition step(capture, :action) != "respond" and not step(check_limit, :exceeded)
      """
      tool_result = step(:tool_exec)
      action = step(:capture, :action)

      files_read = state(:files_read)
      files_read = if action == "read_file" do
        path = step(:capture, :action_input)[:path] || ""
        files_read ++ [path]
      else
        files_read
      end

      observation = %{
        "iteration" => state(:iteration),
        "tool" => action,
        "observation" => inspect(tool_result)
      }

      history = state(:action_history) ++ [observation]
      %{state: %{action_history: history, files_read: files_read}}
      """

    // Loop back if not done
    compute loop_back
      condition step(capture, :action) != "respond" and not step(check_limit, :exceeded)
      """
      goto flow(:iterate)
      """

Phase-by-Phase Breakdown

Phase 1: Check Limits

Every iteration starts by checking whether we’ve exceeded the maximum. This prevents infinite loops. If exceeded, we set a fallback response and skip the rest.
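
To make the boundary concrete, here is the same check as a Python sketch (Python, not Mashin; the function name and dict shapes are assumptions that mirror the check_limit step in the listing). With a limit of 8, rounds 1 through 8 run normally and the ninth entry trips the limit.

```python
def check_limit(state: dict, max_iterations: int) -> dict:
    """Return this round's state update plus an `exceeded` flag."""
    iteration = state["iteration"] + 1
    exceeded = iteration > max_iterations
    update = {"iteration": iteration}
    if exceeded:
        # Fallback answer built from whatever the agent gathered so far.
        update["final_response"] = (
            "I reached the iteration limit. Based on what I found: "
            + repr(state["action_history"])
        )
    return {"state": update, "exceeded": exceeded}

first = check_limit({"iteration": 0, "action_history": []}, max_iterations=8)
eighth = check_limit({"iteration": 7, "action_history": []}, max_iterations=8)
ninth = check_limit({"iteration": 8, "action_history": []}, max_iterations=8)
```

Here `first` and `eighth` leave `exceeded` false, while `ninth` sets it and carries the fallback response.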

Phase 2: Reason

The ask step is the brain. It sees the question, what the agent has already done (action_history, a record of its previous actions and observations), and which tools are available. Its returns block forces a structured response, so the next action always arrives as one of the four known choices.
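
To see what a constrained, structured decision buys you, consider the kind of check such a schema implies. The sketch below is Python, not Mashin, and `parse_decision` and `ALLOWED_ACTIONS` are hypothetical names; how Mashin enforces the schema internally is not covered in this lesson.

```python
import json

# Mirrors the `choices` list in the listing; the names here are hypothetical.
ALLOWED_ACTIONS = {"glob", "grep", "read_file", "respond"}

def parse_decision(raw: str) -> dict:
    """Parse a JSON decision and reject anything outside the known actions."""
    data = json.loads(raw)
    for field in ("action", "thought"):            # the two required fields
        if not isinstance(data.get(field), str):
            raise ValueError(f"missing or non-string field: {field}")
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data['action']}")
    data.setdefault("action_input", {})            # default: empty map
    return data

decision = parse_decision(
    '{"action": "glob", "thought": "find source files", '
    '"action_input": {"pattern": "**/*.py"}}'
)
```

Because unknown actions are rejected up front, the dispatch phase only has to handle the actions it knows about.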

Phase 3: Capture

A compute step extracts the LLM’s decision into state. It always appends the action and thought to the history; if the action is “respond”, it also captures the final response.

Phase 4: Dispatch

match/case_of routes to the right tool machine based on the chosen action. Each tool runs as a separate ask ... from step that invokes a stdlib machine. The otherwise block handles the “respond” case (no tool needed).

Phase 5: Observe and Loop

If the agent used a tool (not “respond”), we process the result, append it to action_history, and goto flow(:iterate) to loop back for another round.

If the agent chose “respond”, the condition (a guard that decides whether a step runs) on observe and loop_back prevents them from running. The same guards skip both steps when the iteration limit is exceeded. The flow then returns naturally, and control passes back to :main (which called run flow(:iterate)).
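
The observe step's bookkeeping can be restated as a small Python function (Python, not Mashin; the function name and dict shapes are assumptions modeled on the listing). Note that it builds new lists instead of mutating the old state, and that files_read only grows when the action was read_file.

```python
def observe(state: dict, action: str, action_input: dict, tool_result) -> dict:
    """Return the state update produced by one tool observation."""
    files_read = list(state["files_read"])          # copy, do not mutate
    if action == "read_file":
        files_read.append(action_input.get("path", ""))
    observation = {
        "iteration": state["iteration"],
        "tool": action,
        "observation": repr(tool_result),
    }
    return {
        "action_history": state["action_history"] + [observation],
        "files_read": files_read,
    }

state = {"iteration": 3, "action_history": [], "files_read": ["a.py"]}
update = observe(state, "read_file", {"path": "b.py"}, {"content": "x = 1"})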

Key Design Patterns

State as working memory: The action_history state slot accumulates the agent’s entire reasoning trace. Each iteration appends both the action and the observation, giving the LLM full context in the next round.
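
One way to picture this working memory is as the text the LLM sees on its next round. The Python sketch below is illustrative only: the listing interpolates state.action_history directly into the task prompt, whereas `render_task` is a hypothetical formatter that makes the two entry shapes (action and observation) easy to tell apart.

```python
def render_task(question: str, state: dict, max_iterations: int) -> str:
    """Build a task prompt from the question plus the accumulated history."""
    lines = [f"Question: {question}", "", "Previous actions:"]
    for entry in state["action_history"]:
        if "observation" in entry:      # entry written by the observe step
            lines.append(
                f"  [{entry['iteration']}] observed {entry['tool']}: "
                f"{entry['observation']}"
            )
        else:                           # entry written by the capture step
            lines.append(
                f"  [{entry['iteration']}] chose {entry['action']}: "
                f"{entry['thought']}"
            )
    lines.append(f"Iteration: {state['iteration']} of {max_iterations}")
    return "\n".join(lines)

state = {
    "iteration": 2,
    "action_history": [
        {"iteration": 1, "action": "glob", "thought": "find files",
         "input": {"pattern": "*.py"}},
        {"iteration": 1, "tool": "glob", "observation": "['src/app.py']"},
    ],
}
prompt = render_task("Where is main defined?", state, max_iterations=8)
```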

run for the loop entry, goto for iteration: The :main flow uses run flow(:iterate) — call-and-return. Steps after the run execute when the loop finishes. Inside :iterate, goto flow(:iterate) is a tail-call (a jump that replaces the current flow rather than nesting inside it) — it loops without stack accumulation.
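
The difference between nesting and jumping can be measured in Python, which has no tail-call elimination and therefore shows the cost of nesting directly. This is an analogy under that assumption, not a description of Mashin's runtime: `nested` plays the role of calling run inside the loop, and `jumping` plays the role of goto.

```python
import inspect

def nested(rounds_left: int) -> int:
    """Like `run` inside the loop: each round calls the next one."""
    if rounds_left == 0:
        return len(inspect.stack())     # stack depth at the final round
    return nested(rounds_left - 1)

def jumping(rounds_left: int) -> int:
    """Like `goto`: the next round replaces the current one."""
    while rounds_left > 0:
        rounds_left -= 1
    return len(inspect.stack())         # depth never grew

# Extra frames the nested version needed for 8 rounds.
growth = nested(8) - jumping(8)
```

Eight rounds cost the nested version eight extra frames, and the gap grows with every additional round, while the jumping version stays flat.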

Conditional execution gates: Steps use condition to skip themselves when they shouldn’t run (e.g., don’t observe if responding, don’t reason if the limit is exceeded). This keeps the flow linear while supporting branching behavior.
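
The gating idea can be modeled as a flat list of steps, each paired with a guard. This is a Python sketch with hypothetical names (`run_flow`, `Step`), not how Mashin is implemented; it only shows how a linear list plus guards produces branching behavior.

```python
from typing import Callable

# (name, guard, body): the body runs only when the guard returns True.
Step = tuple[str, Callable[[dict], bool], Callable[[dict], None]]

def run_flow(steps: list[Step], ctx: dict) -> list[str]:
    """Run steps top to bottom, skipping any step whose guard is false."""
    ran = []
    for name, guard, body in steps:
        if guard(ctx):
            body(ctx)
            ran.append(name)
    return ran

steps: list[Step] = [
    ("check_limit", lambda c: True,
     lambda c: c.update(exceeded=c["iteration"] > c["max"])),
    ("reason", lambda c: not c["exceeded"],
     lambda c: c.update(action="respond")),
    ("observe", lambda c: not c["exceeded"] and c.get("action") != "respond",
     lambda c: None),
]

normal = run_flow(steps, {"iteration": 1, "max": 8})
over_limit = run_flow(steps, {"iteration": 9, "max": 8})
```

In the normal run the stand-in reasoner responds immediately, so observe is skipped; in the over-limit run only the limit check executes.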

Key Syntax

// ReAct (Reason-Act-Observe) agent structure
implements
  flow main
    run flow(iterate)              // Enter loop (call-and-return)
    compute cleanup
      ...                          // Runs after loop completes

  flow iterate
    compute check
      ...                          // Phase 1: check iteration limit

    ask reason, using: "anthropic:claude-sonnet-4"    // Phase 2: LLM decides next action
      condition ...                // Skip if limit exceeded
      returns
        action as string, choices: [...]    // Constrain to known actions

    compute capture
      ...                          // Phase 3: record action + thought in state

    match step(capture, :action)   // Phase 4: tool dispatch
      case_of "tool_name"
        ...                        // Each case routes to an ask...from step
      otherwise
        ...                        // Fallback for "respond" / unknown

    compute loop                   // Phase 5: loop back
      condition ...                // Only loop if not done
      """
      goto flow(:iterate)          // Tail-call loop (jump, no stack growth)
      """

Common Mistakes

Not tracking action history. Without action_history, the LLM doesn’t know what it has already tried and tends to repeat the same actions. Always pass previous actions in the prompt.
Missing iteration limits. Without a maximum, a confused LLM can loop indefinitely. Always check state(:iteration) against a limit at the start of each cycle.
Using run instead of goto for the loop. run flow(:iterate) inside :iterate would accumulate stack frames (memory used by each nested call). Use goto flow(:iterate), a tail-call that jumps without nesting, for the loop.
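
An iteration limit stops a runaway agent, but it does not notice an agent that keeps repeating itself. The file explorer listing does not implement loop detection; the Python sketch below shows one possible check, with `is_repeating` as a hypothetical helper that compares a proposed action against the entries the capture step recorded.

```python
def is_repeating(action_history: list[dict], action: str,
                 action_input: dict) -> bool:
    """True if this exact action and input already appear in the history."""
    return any(
        entry.get("action") == action and entry.get("input") == action_input
        for entry in action_history
    )

history = [
    {"iteration": 1, "action": "glob", "thought": "find files",
     "input": {"pattern": "*.py"}},
    {"iteration": 1, "tool": "glob", "observation": "['src/app.py']"},
]

repeat = is_repeating(history, "glob", {"pattern": "*.py"})
fresh = is_repeating(history, "read_file", {"path": "src/app.py"})
```

A check like this could run before dispatch, for example to end the loop early or to tell the LLM in the next prompt that it already tried that action.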

What’s Next

In Module 07, you’ll learn how to compose multiple machines together — building specialist agents that a coordinator orchestrates.