Medical Agent

This walkthrough covers the Python medical agent example — a LangGraph ReAct agent that searches PubMed for medical literature, fully instrumented with Coalex.

Source: examples/agent-python/medical_agent_example.py

What This Example Demonstrates

@retrieval_span capturing a custom PubMed search as a retriever span
Document dataclass for structured retrieval results
coalex_context() scoping each query as a separate trace
auto_instrument() capturing all LangChain/LangGraph spans automatically
Multi-provider support: Vertex AI (Gemini), OpenAI, Anthropic

Prerequisites

Requirement	Version
Python	3.11+
uv	latest
Coalex API key	`ck_live_...`
LLM provider credentials	See provider setup

Setup

1. Start the local stack

docker compose up -d

2. Set environment variables

export COALEX_API_KEY="ck_live_..."
export COALEX_ENDPOINT="http://localhost:8080"

3. Choose your LLM provider

Vertex AI (default):

gcloud auth application-default login
export GOOGLE_CLOUD_PROJECT="your-project"

OpenAI:

export COALEX_LLM_PROVIDER=openai
export OPENAI_API_KEY="sk-..."

Anthropic:

export COALEX_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
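
Before running the example, it can help to confirm that the variables for the chosen provider are actually set. The following is a minimal sketch and not part of the example itself: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the key `vertexai` for the default provider is an assumption (the setup steps only show that Vertex AI is used when `COALEX_LLM_PROVIDER` is unset).

```python
import os

# Variables each provider needs, taken from the setup steps above.
# The "vertexai" key is an assumed name for the default provider.
REQUIRED_VARS = {
    "vertexai": ["GOOGLE_CLOUD_PROJECT"],
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    provider = env.get("COALEX_LLM_PROVIDER") or "vertexai"
    if provider not in REQUIRED_VARS:
        raise ValueError(f"Unknown provider: {provider!r}")
    # The Coalex API key is needed regardless of the LLM provider.
    required = ["COALEX_API_KEY"] + REQUIRED_VARS[provider]
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means every variable for the selected provider is present.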

4. Install and run

cd examples/agent-python
uv pip install -e ".[dev]"
uv run python medical_agent_example.py

Code Walkthrough

Coalex Setup

import os

import coalex
from coalex.ext.retrieval import retrieval_span, Document

coalex.register(
    endpoint=os.environ.get("COALEX_ENDPOINT", "http://localhost:8080"),
    api_key=os.environ["COALEX_API_KEY"],
    service_name="medical-agent",
)
coalex.auto_instrument()
coalex.declare_agent(agent_id="medical-bot", display_name="Medical Bot")

Custom Retrieval Span

The PubMed search is wrapped with @retrieval_span to capture it as a retriever span:

@retrieval_span(name="pubmed_search", query_arg="query")
def search_pubmed(query: str) -> list[Document]:
    # Call PubMed E-utilities API
    results = fetch_pubmed_articles(query, max_results=5)
    return [
        Document(content=r.abstract, id=r.pmid, score=r.relevance)
        for r in results
    ]

This produces a RETRIEVER span with:

- The query as input
- Retrieved documents with IDs and scores
- Document count and timing
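
To make the recorded fields concrete, here is a rough sketch of what a decorator of this kind could do. It is plain Python and not Coalex's implementation: `toy_retrieval_span`, `RECORDED_SPANS`, `search_stub`, and the stand-in `Document` class are all hypothetical names used only for illustration.

```python
import functools
import inspect
import time
from dataclasses import dataclass


@dataclass
class Document:
    """Stand-in for the retrieval Document; fields mirror the example above."""
    content: str
    id: str
    score: float


RECORDED_SPANS = []  # hypothetical in-memory sink instead of a real exporter


def toy_retrieval_span(name, query_arg):
    """Record query, documents, count and timing per call (illustration only)."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Resolve the query whether it was passed positionally or by keyword.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            start = time.perf_counter()
            docs = func(*args, **kwargs)
            RECORDED_SPANS.append({
                "name": name,
                "kind": "RETRIEVER",
                "query": bound.arguments[query_arg],
                "documents": [{"id": d.id, "score": d.score} for d in docs],
                "document_count": len(docs),
                "duration_s": time.perf_counter() - start,
            })
            return docs
        return wrapper
    return decorator


@toy_retrieval_span(name="pubmed_search", query_arg="query")
def search_stub(query):
    # Fixed, made-up result so the sketch runs without network access.
    return [Document(content="placeholder abstract", id="doc-1", score=0.9)]


search_stub("side effects of metformin")
print(RECORDED_SPANS[0]["query"], RECORDED_SPANS[0]["document_count"])
```

The `query_arg` parameter here only tells the wrapper which argument to read the query from, which is presumably the role it plays in `@retrieval_span` as well.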

Agent Invocation

Each query runs inside a coalex_context:

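import uuid

# Create the request ID up front so the same value can be passed to coalex.evaluate() later
request_id = str(uuid.uuid4())

with coalex.coalex_context(agent_id="medical-bot", request_id=request_id):
    docs = search_pubmed("side effects of metformin")
    response = llm.invoke(
        f"Based on these articles: {docs}\n\nQuestion: What are the side effects?"
    )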
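
Interpolating `docs` directly into the f-string inserts the Python representation of the list into the prompt. One possible alternative is to render the documents as a numbered context block first. The sketch below assumes a stand-in `Document` with the three fields used in `search_pubmed`; `format_documents` and `build_prompt` are hypothetical helpers, not part of Coalex or the example.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Stand-in with the fields used in search_pubmed above."""
    content: str
    id: str
    score: float


def format_documents(docs):
    """Render each document as a numbered entry that keeps its ID visible."""
    return "\n\n".join(
        f"[{i}] (id: {doc.id})\n{doc.content}"
        for i, doc in enumerate(docs, start=1)
    )


def build_prompt(question, docs):
    """Combine the formatted articles and the question into a single prompt."""
    return (
        f"Based on these articles:\n\n{format_documents(docs)}"
        f"\n\nQuestion: {question}"
    )


# Made-up documents for illustration only.
sample_docs = [
    Document(content="First placeholder abstract.", id="doc-1", score=0.9),
    Document(content="Second placeholder abstract.", id="doc-2", score=0.7),
]
print(build_prompt("What are the side effects?", sample_docs))
```

Keeping the document ID next to each abstract makes it easier for the model to cite a specific article, and for a reviewer to match an answer back to the retrieved documents in the trace.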

Captured Trace

Trace: medical-bot / req-<uuid>
  ├── coalex_context (ROOT)
  │   ├── pubmed_search (RETRIEVER)
  │   │     query: "side effects of metformin"
  │   │     documents: 5
  │   └── ChatGoogleGenerativeAI (LLM)
  │         model: gemini-2.0-flash
  │         tokens_in: 1,245, tokens_out: 312
  └── evaluate (internal)

Evaluation Flow

The example also demonstrates the evaluate-resolve loop:

decision = coalex.evaluate(
    request_id=request_id,
    input={"question": "What are the side effects of metformin?"},
    output={"answer": response.content},
    metrics={"answer": ["semantic_similarity", "f1"]},
)

if decision.status == "escalated":
    print(f"Escalated for review: {decision.escalation_id}")

Next Steps

TypeScript Agent — Same patterns in TypeScript
LangChain Agent — Simpler LangChain examples
Extension Decorators — All custom span types