The Cognitive Engine: Choosing the Right Reasoning Loop

Part 1 of the Engineering the Agentic Stack series

Building production AI agents is no longer about prompt engineering—it's about system engineering. The difference between a demo that impresses and a product that delivers comes down to one critical decision: how your agent thinks.

This post introduces three reasoning loop architectures and shows you how to choose between them. I'll use a production-grade Market Analyst Agent as the running example, with code you can use today.

The Shift from Prompts to Systems

A year ago, making an LLM useful meant crafting the perfect prompt. Today, that's table stakes. The agents that work in production are systems—carefully orchestrated graphs of reasoning, tool use, and memory.

🧠 At the heart of every agent system sits the cognitive engine: the reasoning loop that determines how your agent thinks, acts, and adapts. Choose wrong, and you'll burn tokens on unnecessary LLM calls, frustrate users with latency, or watch your agent crumble at the first unexpected tool result.

Three Reasoning Patterns

Reasoning Patterns Comparison

ReAct: Think, Act, Observe, Repeat

Considered the foundational design pattern for interactive AI agents (Yao et al., 2022), ReAct mimics human problem-solving through a continuous cycle:

  1. Thought: The agent generates a "thought" to break down the goal and plan the next step.
  2. Action: Based on the thought, it calls a tool.
  3. Observation: The agent sees the result, which updates its understanding.
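
Stripped of any framework, the loop is just a bounded while-loop around the model. Here is a minimal sketch; call_llm and run_tool are hypothetical helpers standing in for your model client and tool dispatcher, not part of the agent built later in this post:

def react_loop(goal: str, tools: dict, max_iterations: int = 10) -> str:
    """Minimal ReAct loop: Thought -> Action -> Observation, repeated."""
    scratchpad = []  # transparent audit trail of thoughts, actions, observations
    for _ in range(max_iterations):  # hard cap prevents infinite looping
        # Thought (+ proposed Action): the LLM sees the goal and the full scratchpad
        step = call_llm(goal=goal, scratchpad=scratchpad)  # hypothetical helper
        if step.is_final_answer:
            return step.answer
        # Action: execute the chosen tool with the proposed arguments
        observation = run_tool(tools, step.tool_name, step.tool_args)  # hypothetical helper
        # Observation: feed the result back so the next thought is grounded in it
        scratchpad.append((step.thought, step.tool_name, observation))
    return "Stopped: iteration limit reached without a final answer"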

ReAct Pattern

Pros:

  • Reduced Hallucinations: Grounding in observations reduces the likelihood of making up facts.
  • Adaptability: The agent can adjust its strategy on the fly based on previous steps.
  • Explainability: The "scratchpad" offers a transparent audit trail of the agent's reasoning.

Cons:

  • Latency and Cost: History accumulates and must be re-processed at every step.
  • Inefficient for predictable tasks: Patterns like ReWOO are better when tool calls can be planned upfront.
  • Looping structure: Can loop indefinitely without proper constraints (e.g., a maximum-iteration cap).

🔍 Best for: Exploratory tasks, debugging, situations requiring real-time adaptation.

ReWOO: Plan Everything Upfront

ReWOO (Reasoning WithOut Observation) is an optimization of the ReAct paradigm designed to improve efficiency by decoupling reasoning from tool execution. While ReAct stops to observe the result of every action, ReWOO plans the entire sequence of tool calls in a single pass.

  1. Plan (Upfront Reasoning): The agent generates a full plan of tool calls using variable placeholders (e.g., #E1, #E2) to represent future outputs.
  2. Worker (Execution): A separate module executes the planned tool calls in sequence (or parallel), filling the placeholders with real data.
  3. Solver (Synthesis): A final LLM call takes all gathered observations and synthesizes the final answer.
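
To make the placeholder idea concrete, a ReWOO plan for "How did NVDA's latest earnings affect its price?" might look roughly like this (illustrative notation in the spirit of the original paper, not its exact format):

Plan: Look up NVDA's current price.            #E1 = get_stock_price[NVDA]
Plan: Find NVDA's latest earnings headlines.   #E2 = search_news[NVDA earnings]
Plan: Relate the price move to the news.       #E3 = LLM[Explain price #E1 given news #E2]

The Worker fills #E1 and #E2 with real tool outputs, and the Solver reads the completed plan to produce the answer in a single call.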

ReWOO Pattern

Pros:

  • Token Efficiency: The original paper reports up to 5x token efficiency, achieved by avoiding repetitive "Thought-Action-Observation" loops.
  • Low Latency: Eliminates the need to re-submit history at every step.
  • Modular Training: The planner can be fine-tuned independently without a live environment.

Cons:

  • Rigidity: Assumes tools will work as expected; less forgiving of unexpected results.
  • Predictability Requirement: Best for routine, templated workflows.
  • Error Handling: A naive agent may fail if a plan is flawed unless fallback logic is added.

⚡ Best for: Quick snapshots, status checks, dashboards, and other tasks with predictable tool results.

Plan-and-Execute: The Best of Both

Plan-and-Execute sits at the sweet spot between the two. The original Plan-and-Solve paper (Wang et al., 2023) defines the core logic that modern agent architectures follow:

  1. Planning Phase: The agent first generates a plan to divide a task into smaller sub-tasks.
  2. Execution Phase: The agent then carries out those sub-tasks according to the generated plan.

While the original paper focused on zero-shot prompting, modern implementations (like LangGraph) have expanded this into a full orchestration pattern with sequential execution and model specialization.

Plan-and-Execute Pattern

Pros:

  • Hierarchical reasoning: Mimics how experts work
  • Dynamic re-planning: Can pause and reassess if a step returns unexpected results
  • Model efficiency: Use a high-reasoning model for planning and a cheaper one for execution
  • Resilience: Bounded complexity with clear checkpoints

Cons:

  • Higher latency: Sequential execution steps are slower than ReWOO
  • Implementation complexity: Requires nuanced state management
  • Overkill: Too heavy for simple queries

🎯 Best for: Complex multi-step analysis, research tasks, anything requiring synthesis.

The Decision Framework

| Feature | ReAct (2022) | Plan-and-Execute (2023) | ReWOO (2023) |
| --- | --- | --- | --- |
| Core Philosophy | Improviser: act, then decide what to do next based on the result. | Architect: build a full blueprint, execute it, then review. | Optimizer: create a "script" with variables and run it all at once. |
| Workflow | Iterative loop: Thought → Action → Observation. | Two-stage: Phase 1 (Planning), Phase 2 (Execution). | Decoupled: Planner creates a graph of tool calls; Worker runs them. |
| Adaptability | Highest: can change direction after every single tool call. | Medium: typically re-plans only after a set of steps is completed. | Lowest: usually follows the initial script unless the Solver fails. |
| Efficiency | Low: high token usage; must re-read the entire history at every step. | Medium: saves tokens by not "re-thinking" during execution. | High: minimal LLM calls; can parallelize tool execution for speed. |
| Best For | Open-ended exploration or tasks where results are unpredictable. | Long-horizon tasks that require a steady goal (e.g., writing a paper). | Structured, repeatable workflows (e.g., checking weather in 5 cities). |

1. ReAct: The "Think-as-you-go" Pattern

  • How it feels: Like a human debugging a problem. "I'll try this... okay, that didn't work, let me try that instead."
  • Advantage: It handles "unknown unknowns" perfectly. If a search result reveals a new topic, ReAct can immediately pivot.
  • Weakness: It is prone to "looping" (repeating the same failed action) and is the most expensive in terms of token costs.

2. Plan-and-Execute: The "Mission-Oriented" Pattern

  • How it feels: Like a project manager. "Here is the 5-step plan. Let's do steps 1 through 5, then see if we're done."
  • Advantage: Prevents the agent from getting "distracted" by minor details. It maintains a high-level view of the goal, which leads to better success rates on very complex tasks.
  • Weakness: If Step 1 fails in a way that makes Steps 2-5 impossible, the agent may waste time finishing the "broken" plan before re-evaluating.

3. ReWOO (Reasoning Without Observation): The "Compiler" Pattern

  • How it feels: Like writing a computer program. "I need data from Tool A and Tool B, then I'll combine them in Tool C."
  • Advantage: Massively faster and cheaper. Because it plans everything upfront using placeholders (e.g., #E1 for the first tool's output), it doesn't need to call the LLM again until all data is gathered.
  • Weakness: It is "blind" during execution. If the result of the first tool is "I can't find that person," a ReWOO agent will still blindly try to execute the next steps that depend on that person existing.

Summary: Which to choose?

  • Choose ReAct if your agent is chatting with a user and needs to be highly flexible.
  • Choose Plan-and-Execute if you are automating a large, complex job (like a multi-step research report).
  • Choose ReWOO if you have a predictable pipeline and want to minimize your API bill.

Real-World Context: The Market Analyst Agent

To see these patterns in action, I built a Market Analyst Agent (available in my demo repository). This production-grade agent demonstrates all three reasoning patterns in a realistic market research context.

The agent uses LangGraph to orchestrate these patterns, with a shared state foundation that captures everything needed across different execution modes:

Plan-and-Execute Architecture

State Definition

The foundation is a well-structured state that captures everything the agent needs:

from typing import Annotated

from langgraph.graph.message import add_messages
from pydantic import BaseModel, Field

class PlanStep(BaseModel):
    """A single step in the research plan."""
    step_number: int
    description: str
    tool_hint: str | None = None
    completed: bool = False
    result: str | None = None

class AgentState(BaseModel):
    """Main state for the Market Analyst Agent graph.

    (ExecutionMode and ReWOOPlanStep are defined later in this post;
    ResearchData and DraftReport are defined in the full repo.)
    """

    # Message history with LangGraph's add_messages reducer
    messages: Annotated[list, add_messages] = Field(default_factory=list)

    # Execution mode (set by router)
    execution_mode: ExecutionMode | None = None

    # Plan-and-Execute state
    plan: list[PlanStep] = Field(default_factory=list)
    current_step_index: int = 0

    # ReWOO state
    rewoo_plan: list[ReWOOPlanStep] = Field(default_factory=list)

    # Research results
    research_data: ResearchData | None = None

    # HITL output
    draft_report: DraftReport | None = None
    report_approved: bool = False

Pattern 1: Plan-and-Execute Implementation

The Plan-and-Execute pattern is perfect for complex research tasks that require multi-step synthesis. The key is separating the "planning" phase from the "execution" phase—using a powerful model to think strategically upfront, then a ReAct loop to execute each step adaptively.

How this reflects the Plan-and-Execute pattern:

  1. Single upfront planning phase: One LLM call creates the entire plan as a list of text descriptions
  2. Structured output: Uses Schema-Guided Reasoning to guarantee valid JSON
  3. No tool execution yet: The planner only decides what to do, not how
  4. Human-readable steps: Each step is a description that an executor will interpret

from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

# System prompt guides the LLM to think like a research analyst
# creating a strategic plan, not immediate tool calls
PLANNER_SYSTEM_PROMPT = """You are a senior investment research analyst.
Break down stock analysis requests into 4-6 research steps covering:
1. Current price and basic metrics
2. Recent news and announcements
3. Competitor analysis (if relevant)
4. Financial health assessment
5. Risk factors
6. Investment thesis synthesis

Output as JSON with step_number, description, and tool_hint."""

# Schema-Guided Reasoning: Enforce structure with Pydantic
class PlanOutput(BaseModel):
    """Structured output for the planner."""

    steps: list[PlanStep] = Field(description="Research steps to execute")
    ticker: str = Field(description="The stock ticker being analyzed")

def planner_node(state: AgentState) -> dict:
    """Generate a research plan from the user's request.

    This is Phase 1 of Plan-and-Execute: creating the high-level strategy.
    """

    # Use a powerful model for strategic planning
    llm = ChatAnthropic(model="claude-sonnet-4-5-20250929", temperature=0)

    # Apply Schema-Guided Reasoning to guarantee valid plan structure
    # This prevents common formatting errors that would break execution
    structured_llm = llm.with_structured_output(PlanOutput)

    # Context from long-term memory personalizes the plan
    # (user_profile is loaded onto the state from long-term memory; see Part 2)
    profile_context = f"""
User Profile:
- Risk Tolerance: {state.user_profile.risk_tolerance}
- Investment Horizon: {state.user_profile.investment_horizon}
"""

    # The latest user message is the research request
    last_user_message = state.messages[-1].content

    # Single LLM call creates the complete plan
    result: PlanOutput = structured_llm.invoke([
        SystemMessage(content=PLANNER_SYSTEM_PROMPT + profile_context),
        HumanMessage(content=f"Create a research plan for: {last_user_message}"),
    ])

    # State update: Store the plan and initialize tracking
    return {
        "plan": result.steps,           # The sequential steps to execute
        "current_step_index": 0,        # Start at step 0
        "research_data": ResearchData(ticker=result.ticker),  # Initialize data container
    }

✨ Notice the llm.with_structured_output(PlanOutput) line? This applies Schema-Guided Reasoning (SGR)—a pattern I've described in my previous articles. By strictly enforcing the PlanOutput schema, we guarantee the planner always returns a valid list of steps, preventing common formatting errors. LangGraph supports this natively, using these reliable structured outputs to drive deterministic control flow through conditional edges.

Pattern 2: ReAct Execution

Once we have a plan, the executor uses the ReAct pattern to handle one step at a time with full adaptability. This is Phase 2 of Plan-and-Execute: executing each step with the flexibility to react to tool results.

How this reflects the ReAct pattern:

  1. Iterative execution: One step at a time, with observation feedback
  2. Thought-Action-Observation loop: The create_react_agent handles the cycle internally
  3. Context accumulation: Previous step results inform current reasoning
  4. Tool selection: The agent chooses which tools to call based on the step description
  5. Adaptability: Can adjust approach mid-step based on tool results

from langgraph.prebuilt import create_react_agent

# Tools available for the ReAct agent to choose from
# (the tool functions themselves are defined in the demo repository)
TOOLS = [
    get_stock_price,
    get_company_metrics,
    get_price_history,
    search_news,
    search_competitors,
]

def executor_node(state: AgentState) -> dict:
    """Execute the current step using a ReAct agent.

    This is Phase 2 of Plan-and-Execute: adaptive execution of each planned step.
    Each step runs as a mini ReAct loop until completion.
    """

    # Get the current step from the plan
    current_step = state.plan[state.current_step_index]

    # Build context from what we've learned so far
    # This is crucial: each step builds on previous observations
    previous_context = ""
    for step in state.plan[:state.current_step_index]:
        if step.result:
            previous_context += f"\nStep {step.step_number}: {step.result}\n"

    # Create a ReAct agent for this step
    # LangGraph's create_react_agent implements the full Thought-Action-Observation loop:
    # 1. Agent generates a "thought" about what tool to call
    # 2. Agent calls the tool ("action")
    # 3. Tool returns result ("observation")
    # 4. Agent decides: call another tool or finish
    react_agent = create_react_agent(
        model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
        tools=TOOLS,
    )

    # Invoke the ReAct loop for this single step
    # The agent will loop internally until it completes the step
    result = react_agent.invoke({
        "messages": [
            SystemMessage(content=EXECUTOR_SYSTEM_PROMPT),
            HumanMessage(content=f"""Execute Step {current_step.step_number}:
{current_step.description}

Ticker: {state.research_data.ticker}
Previous findings: {previous_context}"""),
        ]
    })

    # Extract the final answer from the ReAct agent's message history
    # The last message contains the synthesis after all tool calls
    updated_plan = list(state.plan)
    updated_plan[state.current_step_index] = PlanStep(
        step_number=current_step.step_number,
        description=current_step.description,
        completed=True,
        result=result["messages"][-1].content,  # Final observation
    )

    # State update: Mark step complete and advance to next
    return {
        "plan": updated_plan,
        "current_step_index": state.current_step_index + 1,
    }

This demonstrates the core Plan-and-Execute flow: use a powerful model to create the plan, then execute each step sequentially using ReAct for maximum adaptability.

Pattern 3: ReWOO for Fast Snapshots

For quick briefings, ReWOO skips the interleaved reasoning and executes everything in parallel. Unlike ReAct, ReWOO creates a "compiled script" of all tool calls upfront and runs them without LLM involvement.

How this reflects the ReWOO pattern:

  1. Three distinct phases: Planner → Worker → Solver (no loops)
  2. Variable placeholders: Tool calls reference #E1, #E2 for future results
  3. No LLM during execution: Worker just runs tools, no reasoning
  4. Parallel execution: Independent tools run concurrently
  5. Single synthesis: Solver makes ONE final LLM call with all data

Phase 1: ReWOO Planner (creates the complete execution graph upfront)

class ReWOOPlanStep(BaseModel):
    """A step in the ReWOO plan with variable placeholders.

    Key difference from Plan-and-Execute's PlanStep:
    - Contains actual tool_name and tool_args (not just description)
    - Uses variable references (#E1) for dependencies
    """
    step_id: str  # e.g., "#E1" - becomes a variable
    description: str
    tool_name: str     # Exact tool to call
    tool_args: dict    # May contain variable refs like {"price": "#E1"}
    depends_on: list[str] = []  # For dependency ordering
    result: str | None = None

class ReWOOPlanOutput(BaseModel):
    """Structured output for ReWOO planner."""
    steps: list[ReWOOPlanStep] = Field(description="Planned tool calls with variables")

def rewoo_planner_node(state: AgentState) -> dict:
    """Generate a complete plan of tool calls upfront.

    This is the key difference from Plan-and-Execute: instead of creating
    human-readable step descriptions, we create EXACT tool calls that
    the worker will execute blindly.
    """

    llm = ChatAnthropic(model="claude-sonnet-4-5-20250929", temperature=0)

    # Schema-Guided Reasoning ensures valid tool call specifications
    structured_llm = llm.with_structured_output(ReWOOPlanOutput)

    ticker = state.research_data.ticker if state.research_data else "UNKNOWN"
    query = state.messages[-1].content  # The user's request drives the plan

    # Single LLM call to plan ALL tool executions
    result: ReWOOPlanOutput = structured_llm.invoke([
        SystemMessage(content=REWOO_PLANNER_PROMPT),
        HumanMessage(content=f"""Create a ReWOO plan for: {query}

Ticker: {ticker}

Output tool calls with:
- step_id: Variable name (#E1, #E2, etc.)
- description: What this accomplishes
- tool_name: Exact tool from the list
- tool_args: Dictionary of arguments
- depends_on: List of step_ids this depends on"""),
    ])

    # State update: Store the complete execution plan
    # Worker will execute this without any LLM involvement
    return {"rewoo_plan": result.steps}

Phase 2: ReWOO Worker (executes tools without LLM reasoning)

from concurrent.futures import ThreadPoolExecutor, as_completed

def rewoo_worker_node(state: AgentState) -> dict:
    """Execute all planned tools in parallel (no LLM calls).

    This is the key efficiency: the Worker is "dumb" - it just runs tools
    according to the plan. No LLM calls = massive token savings.
    """

    results = {}  # Store results keyed by step_id (e.g., "#E1": "$150.23")

    # Execute ALL independent steps in parallel using ThreadPoolExecutor
    # This is where ReWOO gets its speed advantage
    # (execute_tool is a helper in the repo that dispatches the named tool with its args)
    with ThreadPoolExecutor(max_workers=5) as executor:
        futures = {
            executor.submit(execute_tool, step): step
            for step in state.rewoo_plan
            if not step.depends_on  # Only independent tools in the parallel batch
        }

        # Collect results as they complete
        for future in as_completed(futures):
            step = futures[future]
            results[step.step_id] = future.result()
            # No LLM reasoning here - just store the raw tool output

    # (Steps with depends_on would run in a follow-up batch once their
    #  placeholder inputs are resolved; omitted here for brevity.)

    # Copy each result back onto its plan step for the Solver phase
    updated_steps = [
        step.model_copy(update={"result": results.get(step.step_id, step.result)})
        for step in state.rewoo_plan
    ]

    # State update: Store results for the Solver phase
    return {"rewoo_plan": updated_steps}

Phase 3: ReWOO Solver (synthesizes all results in ONE LLM call)

def rewoo_solver_node(state: AgentState) -> dict:
    """Synthesize all tool results into a flash briefing.

    This is the second efficiency gain: Instead of interleaving
    LLM calls with tool execution (like ReAct), we make ONE
    final synthesis call with all gathered data.
    """

    # Build context from ALL tool results at once
    tool_results = []
    for step in state.rewoo_plan:
        if step.result:
            tool_results.append(f"### {step.description}\n{step.result}")

    context = "\n\n".join(tool_results)

    # Single LLM call to synthesize everything
    # (FlashBriefingOutput is the structured report schema from the repo)
    llm = ChatAnthropic(model="claude-sonnet-4-5-20250929", temperature=0)
    structured_llm = llm.with_structured_output(FlashBriefingOutput)
    result = structured_llm.invoke([
        SystemMessage(content=REWOO_SOLVER_PROMPT),
        HumanMessage(content=f"Create a flash briefing from this data:\n\n{context}"),
    ])

    return {"draft_report": result}

Key Difference: ReWOO plans all tool calls upfront using placeholders (#E1, #E2), executes them in parallel without LLM intervention, then synthesizes the results in a single final call. This makes it incredibly token-efficient for predictable workflows.
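
One detail glossed over above is how a dependent step's arguments actually receive those placeholder values. A minimal sketch of the substitution, assuming results maps step IDs to raw tool outputs (resolve_args is a hypothetical helper; the repository may implement this differently):

import re

def resolve_args(tool_args: dict, results: dict[str, str]) -> dict:
    """Replace #E1-style placeholders in a step's arguments with real tool outputs."""
    resolved = {}
    for key, value in tool_args.items():
        if isinstance(value, str):
            # Swap every #En reference for the output of the step that produced it
            value = re.sub(r"#E\d+", lambda m: results.get(m.group(0), m.group(0)), value)
        resolved[key] = value
    return resolved

# Example: {"query": "recent news about #E1"} becomes
# {"query": "recent news about NVIDIA Corporation"} once results == {"#E1": "NVIDIA Corporation"}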

Understanding the Code: How Each Pattern Works Differently

The three patterns differ fundamentally in when and how they invoke the LLM:

| Pattern | LLM Calls During Execution | State Updates | Key Code Pattern |
| --- | --- | --- | --- |
| Plan-and-Execute | 1 for planning + 1 per step | Sequential step completion | planner_node() → loop: executor_node() → reporter_node() |
| ReAct (within each step) | Multiple per step (thought-action cycles) | Accumulated message history | create_react_agent() loops internally until the step completes |
| ReWOO | 1 for planning + 0 during execution + 1 for synthesis | Parallel tool completion | rewoo_planner_node() → rewoo_worker_node() → rewoo_solver_node() |

What makes each pattern unique in code:

The key difference is what the planner produces - this determines everything that follows:

  1. Plan-and-Execute creates human-readable step descriptions:

    # Planner output (list of PlanStep objects)
    plan = [
        PlanStep(
            step_number=1,
            description="Get current price and key financial metrics",
            tool_hint="get_stock_price"
        ),
        PlanStep(
            step_number=2,
            description="Search for recent news and earnings",
            tool_hint="search_news"
        ),
        # ... more steps
    ]
    

    The executor interprets each description and decides which tools to call. This gives flexibility but requires LLM reasoning per step.

  2. ReAct doesn't have an upfront plan - it uses iterative reasoning:

    # No planning phase - ReAct works step-by-step with accumulated messages
    messages = [
        HumanMessage(content="Execute Step 1: Get current price"),
        AIMessage(content="I'll call get_stock_price"),
        ToolMessage(tool_call_id="1", content="$132.45"),
        AIMessage(content="Now I need metrics..."),
        # ... agent continues until step complete
    ]
    

    The agent makes multiple LLM calls within each step, adapting based on observations. Maximum flexibility, highest token cost.

  3. ReWOO creates explicit, executable tool call specifications:

    # Planner output (list of ReWOOPlanStep objects)
    rewoo_plan = [
        ReWOOPlanStep(
            step_id="#E1",
            tool_name="get_stock_price",
            tool_args={"ticker": "NVDA"}
        ),
        ReWOOPlanStep(
            step_id="#E2",
            tool_name="search_news",
            tool_args={"query": "NVDA earnings", "limit": 5}
        ),
        # ... all tool calls planned upfront
    ]
    

    The worker executes blindly - no LLM involvement. All intelligence is in the planner and solver. Minimum token cost.

Memory and State Flow:

  • Plan-and-Execute: State flows through plan → current_step_index → research_data
  • ReAct: State accumulates in messages array (full conversation history)
  • ReWOO: State flows through rewoo_plan with result fields populated by worker

Putting It All Together: Wiring the Graph

Now that we've seen the three patterns individually, here's how they coexist in a single LangGraph system. The beauty of this architecture is that all three reasoning loops share the same AgentState and live in one graph—the router dynamically chooses which path to follow based on the user's request.

Key insight: You're not building three separate agents. You're building ONE agent with three execution modes.

LangGraph makes this orchestration declarative:

from langgraph.graph import StateGraph, START, END

def create_graph(checkpointer=None):
    builder = StateGraph(AgentState)

    # Add nodes
    builder.add_node("router", router_node)
    builder.add_node("planner", planner_node)
    builder.add_node("executor", executor_node)
    builder.add_node("reporter", reporter_node)
    builder.add_node("rewoo_planner", rewoo_planner_node)
    builder.add_node("rewoo_worker", rewoo_worker_node)
    builder.add_node("rewoo_solver", rewoo_solver_node)

    # Define edges
    builder.add_edge(START, "router")
    builder.add_conditional_edges("router", route_after_router, {
        "planner": "planner",
        "rewoo_planner": "rewoo_planner",
    })

    # Deep Research path
    builder.add_edge("planner", "executor")
    builder.add_conditional_edges("executor", route_after_executor, {
        "executor": "executor",  # Loop back for more steps
        "reporter": "reporter",  # Done with plan
    })
    builder.add_edge("reporter", END)

    # Flash Briefing path (ReWOO)
    builder.add_edge("rewoo_planner", "rewoo_worker")
    builder.add_edge("rewoo_worker", "rewoo_solver")
    builder.add_edge("rewoo_solver", END)

    return builder.compile(
        checkpointer=checkpointer,
        interrupt_before=["reporter"],  # HITL pause for approval
    )
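
Two pieces of glue are referenced above but not shown: the conditional-edge functions. Here is a minimal sketch of how they can be written against the state fields defined earlier, using the ExecutionMode enum introduced in the next section (the repository's versions may differ):

def route_after_router(state: AgentState) -> str:
    """Pick the execution path based on the router's classification."""
    if state.execution_mode == ExecutionMode.FLASH_BRIEFING:
        return "rewoo_planner"  # fast ReWOO path
    return "planner"            # default: Deep Research (Plan-and-Execute)

def route_after_executor(state: AgentState) -> str:
    """Loop the executor until every planned step has run, then hand off to the reporter."""
    if state.current_step_index < len(state.plan):
        return "executor"  # more steps remain in the plan
    return "reporter"      # plan complete: synthesize the final report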

Automatic Pattern Selection with a Router

To automatically select the best reasoning loop for each user request, I've added a router classifier. The router uses Schema-Guided Reasoning to reliably classify user intent:

from enum import Enum

class ExecutionMode(str, Enum):
    """Execution mode for the agent."""

    DEEP_RESEARCH = "deep_research"  # Plan-and-Execute + ReAct (thorough)
    FLASH_BRIEFING = "flash_briefing"  # ReWOO (fast, token-efficient)

class RouterOutput(BaseModel):
    """Structured output for the router."""

    mode: ExecutionMode  # DEEP_RESEARCH or FLASH_BRIEFING
    ticker: str
    reasoning: str

ROUTER_SYSTEM_PROMPT = """Classify the user's request:

1. **deep_research**: Complex analysis requiring synthesis
   - Examples: "Analyze strategic risks", "investment thesis"

2. **flash_briefing**: Quick snapshots, simple data retrieval
   - Examples: "quick snapshot", "current price"

Default to deep_research if unclear."""

llm = ChatAnthropic(model="claude-sonnet-4-5-20250929", temperature=0)
structured_llm = llm.with_structured_output(RouterOutput)

This lightweight classification step automatically chooses between Plan-and-Execute (for deep research) and ReWOO (for quick snapshots), removing the burden of manual mode selection from users.
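
For completeness, the router_node wired into the graph earlier can be as small as the following sketch (the repository's version may add error handling and profile lookups):

def router_node(state: AgentState) -> dict:
    """Classify the request and record the chosen execution mode on the state."""
    llm = ChatAnthropic(model="claude-sonnet-4-5-20250929", temperature=0)
    structured_llm = llm.with_structured_output(RouterOutput)

    result: RouterOutput = structured_llm.invoke([
        SystemMessage(content=ROUTER_SYSTEM_PROMPT),
        HumanMessage(content=state.messages[-1].content),
    ])

    # Downstream nodes read both the mode (for routing) and the ticker
    return {
        "execution_mode": result.mode,
        "research_data": ResearchData(ticker=result.ticker),
    }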

Key Takeaways

  1. ReAct is your default for flexibility but costs tokens and time
  2. ReWOO wins on speed when your tools are reliable and results predictable
  3. Plan-and-Execute delivers the best results for complex analysis
  4. Use a router to choose dynamically—don't force users to pick
  5. State management is critical: LangGraph's checkpointing enables interrupts and recovery (see the usage sketch below)
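
On that last point: because the graph compiles with a checkpointer and interrupt_before=["reporter"], a run pauses for human review and can later resume from the saved checkpoint. A minimal usage sketch with LangGraph's in-memory checkpointer (the thread_id value is arbitrary):

from langgraph.checkpoint.memory import MemorySaver
from langchain_core.messages import HumanMessage

graph = create_graph(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "nvda-analysis-001"}}

# Runs Deep Research and pauses at the interrupt before "reporter" (the HITL gate)
graph.invoke({"messages": [HumanMessage(content="Analyze strategic risks for NVDA")]}, config)

# After a human reviews and approves the draft, resume from the checkpoint
graph.invoke(None, config)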

The complete implementation of all three patterns, including the router classifier and state management, is available in the Market Analyst Agent repository.

What's Next

In Part 2, I'll dive into memory architecture—how to give your agent both short-term context (PostgreSQL checkpointing) and long-term knowledge (Qdrant vector memory). I'll show how this enables pause/resume workflows and cross-session learning.

References

  • Yao et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629.
  • Xu et al. (2023). ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models. arXiv:2305.18323.
  • Wang et al. (2023). Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models. arXiv:2305.04091.

The complete Market Analyst Agent code is available on GitHub. Star the repo and follow along as I build the full production stack.

Series: Engineering the Agentic Stack

  • Part 1: The Cognitive Engine (this post)
  • Part 2: Memory Architecture (coming soon)
  • Part 3: Tool Ergonomics and the ACI
  • Part 4: Safety Layers – The Guardian Pattern
  • Part 5: Production Deployment