
How to Set Budget Limits on LangChain Agents (Step-by-Step)

By Pat · March 14, 2026 · 5 min read

LangChain is one of the most popular frameworks for building AI agents. But it has a blind spot: there's no built-in way to set a dollar-based budget limit on an agent run. You can set max_iterations, but that doesn't tell you how much a run will cost — and it won't stop an agent that's burning $0.20 per iteration.

This tutorial shows you how to add hard budget limits, loop detection, and cost tracking to any LangChain agent using AgentGuard. Each step includes complete code you can copy and adapt.

The problem — LangChain agents have no built-in budget controls

LangChain's AgentExecutor provides two knobs for controlling agent behavior: max_iterations and max_execution_time. Both are useful, but neither addresses cost directly.

max_iterations=15 caps the number of tool-call loops, but says nothing about how much each iteration costs. With GPT-4o, a single iteration with a large context window can cost $0.15 or more. Fifteen iterations at that rate is $2.25 — and that's for one user request.

max_execution_time is even less useful for cost control. A fast agent can burn through $5 in 10 seconds. A slow agent might take 60 seconds but cost $0.03. Time and cost are not correlated.
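The arithmetic above is easy to check. A quick back-of-the-envelope sketch (the $0.15-per-iteration figure is illustrative, not a quoted price) shows why an iteration cap alone doesn't bound spend:

```python
# Illustrative cost math: max_iterations bounds loops, not dollars.
PER_ITERATION_COST = 0.15  # USD -- e.g. one GPT-4o call with a large context
MAX_ITERATIONS = 15        # a typical max_iterations setting

worst_case = PER_ITERATION_COST * MAX_ITERATIONS
print(f"Worst-case cost under max_iterations={MAX_ITERATIONS}: ${worst_case:.2f}")
# prints: Worst-case cost under max_iterations=15: $2.25
```

And since a fast agent can run all fifteen iterations in seconds, a time cap doesn't bound spend either.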

What you actually need is a way to say: "Stop this agent if it has spent more than $1.00 in LLM API costs." That's what AgentGuard provides.

Step 1 — Install agentguard47

AgentGuard's Python SDK is a single package with zero dependencies. Install it alongside LangChain:

# Install the AgentGuard SDK
pip install agentguard47

# You'll also need LangChain and an LLM provider
pip install langchain langchain-openai

That's it. No extra services to run, no Docker containers, no config files.

Step 2 — Set up the AgentGuard callback handler

AgentGuard integrates with LangChain via a callback handler. This handler intercepts every LLM call, tracks token usage, calculates cost, and sends telemetry to your dashboard:

from agentguard47 import AgentGuardHandler, Tracer, HttpSink

# Create a tracer that sends data to AgentGuard
tracer = Tracer(
    sink=HttpSink(api_key="ag47_your_key_here")
)

# Create the LangChain callback handler
handler = AgentGuardHandler(tracer=tracer)

The AgentGuardHandler implements LangChain's BaseCallbackHandler interface. It hooks into on_llm_start, on_llm_end, and on_tool_start events to capture costs and behavior in real time. Every LLM call is logged with its token count, model, latency, and calculated cost.
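Conceptually, the handler accumulates usage as each call finishes, the way any LangChain callback handler would. The sketch below is a simplified stand-in in plain Python, not AgentGuard's actual implementation; the hook name mirrors LangChain's callback interface, and the per-1K-token rates are made up for illustration:

```python
class CostTrackingHandler:
    """Simplified illustration of callback-style cost tracking."""

    # Hypothetical per-1K-token rates; a real tracker uses provider price tables.
    PRICES = {"gpt-4o": {"prompt": 0.0025, "completion": 0.01}}

    def __init__(self):
        self.total_cost_usd = 0.0
        self.calls = []

    def on_llm_end(self, model, prompt_tokens, completion_tokens):
        # Runs once per completed LLM call, using provider-reported token counts.
        rates = self.PRICES[model]
        cost = (prompt_tokens / 1000) * rates["prompt"] \
             + (completion_tokens / 1000) * rates["completion"]
        self.total_cost_usd += cost
        self.calls.append({"model": model, "cost": cost})
        return cost

handler = CostTrackingHandler()
handler.on_llm_end("gpt-4o", prompt_tokens=4000, completion_tokens=500)
print(f"${handler.total_cost_usd:.4f}")  # prints: $0.0150
```

The key point is that costs come from actual usage reported per call, so the running total is exact rather than estimated.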

Step 3 — Add budget enforcement

Now add a BudgetGuard that sets a hard dollar limit on the agent run:

from agentguard47 import BudgetGuard

# Hard stop at $1.00 — agent will gracefully halt if budget is exceeded
budget_guard = BudgetGuard(max_cost_usd=1.00)

# Add it to the handler
handler = AgentGuardHandler(
    tracer=tracer,
    guards=[budget_guard]
)

When the cumulative cost of LLM calls in a single run crosses $1.00, AgentGuard raises a BudgetExceeded exception that LangChain's executor catches gracefully. The agent stops, partial results are preserved, and the event is logged to your dashboard with the reason budget_exceeded.

This is a hard ceiling, not an estimate. AgentGuard tracks actual token usage reported by the LLM provider, not approximations.
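The enforcement logic itself is conceptually simple: keep a running total of reported costs and raise once it crosses the ceiling. This is a hand-rolled sketch of that idea, not AgentGuard's code:

```python
class BudgetExceeded(Exception):
    """Raised when a run's cumulative LLM cost crosses the limit."""

class SimpleBudgetGuard:
    # Illustrative stand-in for BudgetGuard(max_cost_usd=...)
    def __init__(self, max_cost_usd):
        self.max_cost_usd = max_cost_usd
        self.spent_usd = 0.0

    def record(self, call_cost_usd):
        # Add the actual cost of one LLM call, then check the ceiling.
        self.spent_usd += call_cost_usd
        if self.spent_usd > self.max_cost_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} > limit ${self.max_cost_usd:.2f}"
            )

guard = SimpleBudgetGuard(max_cost_usd=1.00)
for cost in [0.40, 0.40, 0.40]:  # three calls at $0.40 each
    try:
        guard.record(cost)
    except BudgetExceeded as exc:
        print(f"Stopping run: {exc}")  # fires on the third call
        break
```

In the real integration, raising from inside the callback is what interrupts the executor mid-run rather than after the fact.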

Step 4 — Add loop detection

Budget limits catch cost overruns, but loop detection catches them earlier. If your agent calls the same tool three or four times in a row with similar arguments, something is wrong — and you want to stop it before it wastes budget:

from agentguard47 import LoopGuard

# Detect repeated tool calls — stop after 3 identical calls
loop_guard = LoopGuard(max_repeats=3)

handler = AgentGuardHandler(
    tracer=tracer,
    guards=[budget_guard, loop_guard]
)

The LoopGuard watches the sequence of tool calls. If the same tool is called with the same (or nearly identical) arguments more than max_repeats times consecutively, it triggers a graceful stop. This catches the most common failure mode in production agents: an error-retry loop where the agent keeps calling a broken tool hoping for a different result.
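The detection idea is again easy to sketch in plain Python. This is an illustrative stand-in, not LoopGuard's actual matching logic (which, per the description above, also tolerates nearly identical arguments, while this sketch requires exact matches):

```python
class SimpleLoopGuard:
    """Illustrative consecutive-repeat detector for (tool, args) calls."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.last_call = None
        self.streak = 0

    def check(self, tool_name, args):
        # Count how many times the same (tool, args) pair repeats back-to-back.
        call = (tool_name, tuple(sorted(args.items())))
        self.streak = self.streak + 1 if call == self.last_call else 1
        self.last_call = call
        return self.streak > self.max_repeats  # True -> stop the agent

guard = SimpleLoopGuard(max_repeats=3)
calls = [("search", {"q": "nvidia earnings"})] * 4
flags = [guard.check(name, args) for name, args in calls]
print(flags)  # the 4th identical call trips the guard: [False, False, False, True]
```

A call to a different tool, or the same tool with different arguments, resets the streak, so normal varied tool use never trips the guard.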

Step 5 — Run your agent and view costs

Now wire everything into your LangChain agent and run it:

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

# Set up LLM and tools as usual
llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [your_search_tool, your_calculator_tool]  # replace with your own tools

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful research assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

# Run with the AgentGuard handler as a callback
result = executor.invoke(
    {"input": "What were NVIDIA's Q4 2025 earnings?"},
    config={"callbacks": [handler]}
)

print(result["output"])

Every LLM call during this run is tracked. Open the AgentGuard dashboard to see a timeline of calls, per-call costs, total run cost, and whether any guards fired. If the budget was exceeded or a loop was detected, you'll see exactly which call triggered the stop.

Complete example

Here's the full code combining all steps into a single copy-pasteable script:

from agentguard47 import (
    AgentGuardHandler, Tracer, HttpSink,
    BudgetGuard, LoopGuard
)
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

# 1. Configure AgentGuard telemetry
tracer = Tracer(
    sink=HttpSink(api_key="ag47_your_key_here")
)

# 2. Set up guards
budget_guard = BudgetGuard(max_cost_usd=1.00)
loop_guard = LoopGuard(max_repeats=3)

# 3. Create the LangChain callback handler
handler = AgentGuardHandler(
    tracer=tracer,
    guards=[budget_guard, loop_guard]
)

# 4. Set up your LangChain agent as usual
llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [your_search_tool, your_calculator_tool]  # replace with your own tools

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful research assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

# 5. Run with AgentGuard callbacks
result = executor.invoke(
    {"input": "What were NVIDIA's Q4 2025 earnings?"},
    config={"callbacks": [handler]}
)

print(result["output"])

Next steps

You now have a LangChain agent with hard budget limits, loop detection, and full cost visibility.

Budget enforcement is the single most important safety measure for production agents. It takes two minutes to set up and can save you hundreds of dollars on the first bad run it catches.

Add budget limits to your LangChain agents today

AgentGuard gives you dollar-based budget enforcement, loop detection, and cost tracking — all through a standard LangChain callback handler.
