
Open Source AI Agent Monitoring: Tools and Approaches Compared

By Pat · March 14, 2026 · 8 min read

What to monitor in AI agents

AI agents are not traditional web services, and standard APM tools were not designed for their failure modes.

Any monitoring tool should cover cost, latency, and traces. Loop detection and budget enforcement separate agent-aware tools from generic observability.

The landscape — open source options

The AI agent monitoring space is young. Most tools launched in 2024 or 2025, and the feature sets are evolving quickly. Here are the four main approaches teams use today:

AgentGuard SDK is an MIT-licensed Python library with zero dependencies. It provides tracing, cost tracking, budget enforcement, loop detection, and a remote kill switch. Traces can go to a local JSONL file or the hosted dashboard. It is designed for production safety, not just observability.

Langfuse is an open source LLM observability platform with tracing, prompt management, and a self-hostable web UI. It has the most mature tracing interface of any open source tool, with polished nested span visualization. It does not provide budget enforcement, loop detection, or kill switches.

Arize Phoenix is an open source observability tool focused on tracing, evaluation, and retrieval analysis. It is strong on ML-specific metrics like embedding drift and integrates well with LlamaIndex. Its focus is analysis, not runtime enforcement.

Custom logging with stdlib means Python's logging module plus a log aggregator like Elasticsearch or Loki. It offers maximum flexibility and zero lock-in, but you build everything from scratch.
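To make the custom stdlib route concrete, here is a minimal sketch of structured agent logging using only Python's logging and json modules. The names JsonLineFormatter, build_agent_logger, and log_step, as well as the field names, are hypothetical choices for illustration and not part of any tool discussed here. Each step becomes one JSON object per line, which most log aggregators can ingest.

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": record.created,           # Unix timestamp set by logging
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        payload.update(getattr(record, "agent_fields", {}))
        return json.dumps(payload)


def build_agent_logger(path: str, name: str = "agent.monitor") -> logging.Logger:
    """Create a logger that appends JSON lines to `path`.

    Call this once per process; calling it again adds a second handler.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep agent events out of the root logger
    handler = logging.FileHandler(path)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


def log_step(logger: logging.Logger, run_id: str, step: str,
             tokens: int, cost_usd: float, latency_ms: float) -> None:
    """Record one agent step with the fields you would later aggregate."""
    logger.info(
        "agent_step",
        extra={"agent_fields": {
            "run_id": run_id,
            "step": step,
            "tokens": tokens,
            "cost_usd": cost_usd,
            "latency_ms": latency_ms,
        }},
    )
```

This covers the logging half only. Cost calculation, loop detection, and dashboards are still yours to build, which is the trade-off described above.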

Feature comparison

This table compares the four approaches across the dimensions that matter most for production agent monitoring. We have tried to be honest about where each tool excels and where it falls short.

| Feature | AgentGuard SDK | Langfuse | Arize Phoenix | Custom stdlib |
|---|---|---|---|---|
| Tracing | Spans + JSONL | Nested spans (best UI) | OpenTelemetry spans | DIY |
| Cost tracking | Per-run, per-step | Per-trace | Basic | DIY |
| Budget enforcement | BudgetGuard | No | No | DIY |
| Loop detection | LoopGuard | No | No | DIY |
| Kill switch | Remote, real-time | No | No | DIY |
| Setup complexity | 2 lines of code | Self-host or cloud | pip install + local | High |
| Dependencies | Zero | Postgres, Redis, etc. | Several Python deps | Your choice |
| License | MIT | MIT (core) | Apache 2.0 | N/A |

The honest takeaway: Langfuse has the best tracing UI. AgentGuard is the only tool in this comparison with runtime safety features (budget enforcement, loop detection, and a kill switch). Phoenix excels at ML-specific analysis. Custom stdlib gives total control, but at a much higher build effort.
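For readers weighing the DIY column, here is a rough sketch of what runtime budget enforcement and loop detection look like at their simplest. This is not how AgentGuard implements BudgetGuard or LoopGuard. The classes SimpleBudget and SimpleLoopDetector, and both exception types, are hypothetical names invented for this illustration.

```python
from collections import Counter


class BudgetExceeded(RuntimeError):
    """Raised when cumulative spend passes the configured ceiling."""


class LoopDetected(RuntimeError):
    """Raised when the same call repeats more often than allowed."""


class SimpleBudget:
    """Track cumulative spend and stop the run once it exceeds a limit."""

    def __init__(self, max_dollars: float) -> None:
        self.max_dollars = max_dollars
        self.spent = 0.0

    def charge(self, dollars: float) -> None:
        self.spent += dollars
        if self.spent > self.max_dollars:
            raise BudgetExceeded(
                f"spent ${self.spent:.4f}, limit ${self.max_dollars:.2f}"
            )


class SimpleLoopDetector:
    """Count identical (tool, arguments) calls within one run."""

    def __init__(self, max_repeats: int) -> None:
        self.max_repeats = max_repeats
        self.seen: Counter = Counter()

    def record(self, tool: str, args: str) -> None:
        key = (tool, args)
        self.seen[key] += 1
        # Allow up to max_repeats identical calls; fail on the next one.
        if self.seen[key] > self.max_repeats:
            raise LoopDetected(f"{tool}({args}) repeated {self.seen[key]} times")
```

The point of enforcement, as opposed to observability, is that these checks raise inside the agent loop and stop the run, instead of reporting the overspend afterwards.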

AgentGuard SDK — free, zero dependencies

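Install with pip install agentguard47. No database, no containers, no config files. The SDK supports two sinks: JsonlSink, which writes traces to a local JSONL file, and HttpSink, which sends them to the hosted dashboard.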

Local-only mode works indefinitely. No time limit, no feature gate, no telemetry. The SDK works identically with or without a network connection.

Getting started with local-only monitoring

Here is a complete example of using AgentGuard for local monitoring. No server, no API key, no sign-up. Just pip install and start tracing.

from agentguard import Tracer, JsonlSink, BudgetGuard, LoopGuard

# All traces written to a local file — no network needed
tracer = Tracer(sink=JsonlSink("traces.jsonl"))

# Optional: add safety guards even in local mode
tracer.add_guard(BudgetGuard(max_dollars=2.0))
tracer.add_guard(LoopGuard(max_repeats=3))

with tracer.trace("local-dev-run") as run:
    # Your agent code here (`agent` is a placeholder for your own agent object)
    result = agent.invoke("Summarize the quarterly report")
    print(f"Cost: ${run.total_cost:.4f}")
    print(f"Steps: {run.step_count}")

# Read traces back with standard tools (--json-lines requires Python 3.8+)
# python -m json.tool --json-lines traces.jsonl
# or load into pandas:
import pandas as pd
df = pd.read_json("traces.jsonl", lines=True)
print(df[["run_id", "total_cost", "duration_ms", "status"]])

Each line in traces.jsonl is a self-contained JSON object with the run ID, timestamps, step details, token counts, cost breakdown, and guard events. You get full observability without any external service.
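If you would rather not pull in pandas, the same file can be summarized with the standard library alone. The sketch below assumes each line carries the status and total_cost fields used in the pandas example above. The function name summarize_runs is hypothetical, and the exact trace schema should be checked against your own file.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable


def summarize_runs(lines: Iterable[str]) -> Dict[str, dict]:
    """Group JSONL trace records by status and total up their cost."""
    totals = defaultdict(lambda: {"runs": 0, "total_cost": 0.0})
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        record = json.loads(line)
        # Field names are assumed; missing ones fall back to defaults.
        bucket = totals[record.get("status", "unknown")]
        bucket["runs"] += 1
        bucket["total_cost"] += float(record.get("total_cost", 0.0))
    return dict(totals)


# Typical use against a local trace file:
# with open("traces.jsonl") as f:
#     print(summarize_runs(f))
```

Because the function accepts any iterable of strings, it works equally well on an open file handle or on lines already loaded in a test.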

This is valuable in three scenarios: local development (catch loops before production), CI pipelines (budget guards on integration tests), and air-gapped environments (full tracing with zero data exfiltration risk).

When to upgrade to hosted observability

Local-only monitoring works well for individual developers and small teams. But as the team or the stakes grow, the hosted dashboard can become worth the upgrade.

The transition is seamless. You change one line of code, replacing JsonlSink("traces.jsonl") with HttpSink("ag1_your_key"). Everything else, including guards, trace structure, and your agent code, stays identical. There is no migration, no schema change, and no data loss. Your local JSONL files remain on disk as a backup.
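One way to avoid editing code at all when switching is to pick the sink from the environment. The helper below is a sketch of that idea, not something the SDK provides. The function name choose_sink and the variable name AGENTGUARD_API_KEY are assumptions made for this example. The sink constructors are passed in as arguments so the helper makes no claims about the SDK beyond the two calls shown above.

```python
import os
from typing import Callable, Mapping, Optional


def choose_sink(
    make_jsonl: Callable[[str], object],
    make_http: Callable[[str], object],
    env: Optional[Mapping[str, str]] = None,
    key_var: str = "AGENTGUARD_API_KEY",  # hypothetical variable name
    local_path: str = "traces.jsonl",
) -> object:
    """Return a hosted sink if an API key is set, else a local file sink."""
    env = os.environ if env is None else env
    api_key = env.get(key_var)
    if api_key:
        return make_http(api_key)
    return make_jsonl(local_path)


# Intended use, assuming the constructors shown in this post:
# tracer = Tracer(sink=choose_sink(JsonlSink, HttpSink))
```

With this in place, a laptop without the variable set keeps writing to traces.jsonl, while a deployment that sets the key sends traces to the dashboard.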

Start local. Move to hosted when the team or the stakes grow. That is the design principle behind the two-sink architecture.

Start monitoring in 2 minutes

AgentGuard SDK is MIT-licensed with zero dependencies. Use it locally forever, or connect the dashboard when your team is ready.
