Medical Agent

This walkthrough covers the Python medical agent example: a LangGraph ReAct agent that searches PubMed for medical literature and is fully instrumented with Coalex.

Source: examples/agent-python/medical_agent_example.py


What This Example Demonstrates

  • @retrieval_span capturing a custom PubMed search as a retriever span
  • Document dataclass for structured retrieval results
  • coalex_context() scoping each query as a separate trace
  • auto_instrument() capturing all LangChain/LangGraph spans automatically
  • Multi-provider support: Vertex AI (Gemini), OpenAI, Anthropic

Prerequisites

Requirement               Version
Python                    3.11+
uv                        latest
Coalex API key            ck_live_...
LLM provider credentials  See provider setup

Setup

1. Start the local stack

docker compose up -d

2. Set environment variables

export COALEX_API_KEY="ck_live_..."
export COALEX_ENDPOINT="http://localhost:8080"

3. Choose your LLM provider

# Pick one of the following.

# Vertex AI (Gemini)
gcloud auth application-default login
export GOOGLE_CLOUD_PROJECT="your-project"

# OpenAI
export COALEX_LLM_PROVIDER=openai
export OPENAI_API_KEY="sk-..."

# Anthropic
export COALEX_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
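
The provider choice above comes down to one environment variable plus the credentials that provider needs. As a rough sketch of how a script could validate that before starting the agent, the snippet below checks the selected provider against the variables exported in this step. The names `resolve_provider` and `REQUIRED_ENV` are hypothetical, and both the `vertex` value and its use as the fallback when `COALEX_LLM_PROVIDER` is unset are assumptions; the example script may resolve providers differently.

```python
import os

# Credentials each provider needs, taken from the exports shown in this step.
# ASSUMPTION: the "vertex" key and its role as the default are not confirmed
# by the example; Vertex AI also relies on gcloud application-default login,
# which is not visible as an environment variable.
REQUIRED_ENV = {
    "vertex": ["GOOGLE_CLOUD_PROJECT"],
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}


def resolve_provider(env=None):
    """Return the selected provider name, or raise if it is unknown or unconfigured."""
    env = os.environ if env is None else env
    provider = env.get("COALEX_LLM_PROVIDER", "vertex").strip().lower()
    if provider not in REQUIRED_ENV:
        raise ValueError(f"Unknown COALEX_LLM_PROVIDER: {provider!r}")
    # Collect every required variable that is unset or empty.
    missing = [name for name in REQUIRED_ENV[provider] if not env.get(name)]
    if missing:
        raise RuntimeError(f"{provider} requires: {', '.join(missing)}")
    return provider
```

Calling `resolve_provider()` at startup would turn a missing key into an immediate, readable error instead of a failure on the first LLM call.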

4. Install and run

cd examples/agent-python
uv pip install -e ".[dev]"
uv run python medical_agent_example.py

Code Walkthrough

Coalex Setup

import os

import coalex
from coalex.ext.retrieval import retrieval_span, Document

coalex.register(
    endpoint=os.environ.get("COALEX_ENDPOINT", "http://localhost:8080"),
    api_key=os.environ["COALEX_API_KEY"],
    service_name="medical-agent",
)
coalex.auto_instrument()
coalex.declare_agent(agent_id="medical-bot", display_name="Medical Bot")

Custom Retrieval Span

The PubMed search is wrapped with @retrieval_span to capture it as a retriever span:

@retrieval_span(name="pubmed_search", query_arg="query")
def search_pubmed(query: str) -> list[Document]:
    # Call PubMed E-utilities API
    results = fetch_pubmed_articles(query, max_results=5)
    return [
        Document(content=r.abstract, id=r.pmid, score=r.relevance)
        for r in results
    ]

This produces a RETRIEVER span with:

  • The query as input
  • Retrieved documents with IDs and scores
  • Document count and timing
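
The `fetch_pubmed_articles` helper called inside `search_pubmed` is not shown in this walkthrough. The following is a minimal sketch of what such a helper might look like using the public NCBI E-utilities endpoints (`esearch.fcgi` to get PMIDs, `efetch.fcgi` to get abstracts). The `PubMedArticle` dataclass and the helper function names are hypothetical, and the `relevance` value is a rank-based placeholder, since PubMed does not return a numeric score; the example's real helper may work differently.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from dataclasses import dataclass

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


@dataclass
class PubMedArticle:  # hypothetical result type with the fields search_pubmed reads
    pmid: str
    abstract: str
    relevance: float  # ASSUMPTION: derived from result rank, not from PubMed


def esearch_url(query, max_results=5):
    """Build the ESearch URL that returns matching PMIDs as JSON."""
    params = {"db": "pubmed", "term": query, "retmax": max_results,
              "retmode": "json", "sort": "relevance"}
    return f"{EUTILS}/esearch.fcgi?{urllib.parse.urlencode(params)}"


def efetch_url(pmids):
    """Build the EFetch URL that returns full records as XML."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return f"{EUTILS}/efetch.fcgi?{urllib.parse.urlencode(params)}"


def parse_pmids(esearch_json):
    return json.loads(esearch_json)["esearchresult"]["idlist"]


def parse_articles(efetch_xml):
    """Extract PMID and abstract text from an EFetch XML response."""
    nodes = ET.fromstring(efetch_xml).findall(".//PubmedArticle")
    articles = []
    for rank, node in enumerate(nodes):
        pmid = node.findtext("MedlineCitation/PMID", default="")
        # Structured abstracts have several AbstractText sections; join them.
        parts = ["".join(p.itertext())
                 for p in node.findall(".//Abstract/AbstractText")]
        relevance = 1.0 - rank / max(len(nodes), 1)  # first hit scores 1.0
        articles.append(PubMedArticle(pmid, " ".join(parts).strip(), relevance))
    return articles


def fetch_pubmed_articles(query, max_results=5):
    with urllib.request.urlopen(esearch_url(query, max_results), timeout=10) as resp:
        pmids = parse_pmids(resp.read().decode("utf-8"))
    if not pmids:
        return []
    with urllib.request.urlopen(efetch_url(pmids), timeout=10) as resp:
        return parse_articles(resp.read())
```

Keeping URL construction and parsing separate from the network calls means the parsing logic can be tested against saved responses without contacting NCBI.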

Agent Invocation

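Each query runs inside a coalex_context, which scopes it as a separate trace. Here llm is the chat model for the provider chosen during setup:

import uuid

# Keep the ID in a variable so the evaluation step can reference the same request
request_id = str(uuid.uuid4())

with coalex.coalex_context(agent_id="medical-bot", request_id=request_id):
    docs = search_pubmed("side effects of metformin")
    response = llm.invoke(
        f"Based on these articles: {docs}\n\nQuestion: What are the side effects?"
    )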
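
The prompt above interpolates the document list directly, so the model sees the Python repr of each Document. One possible refinement, sketched below, is to render the documents as numbered, PMID-tagged excerpts first. The `Document` class here is a local stand-in with only the three fields used in this walkthrough (`content`, `id`, `score`), not the class from `coalex.ext.retrieval`, and `format_context` and `build_prompt` are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class Document:  # local stand-in; the real class may carry more fields
    content: str
    id: str
    score: float


def format_context(docs, max_chars=500):
    """Render documents as numbered, PMID-tagged excerpts for the prompt."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        excerpt = doc.content[:max_chars]  # cap each abstract to bound prompt size
        lines.append(f"[{i}] PMID {doc.id} (score {doc.score:.2f}): {excerpt}")
    return "\n".join(lines)


def build_prompt(question, docs):
    return f"Based on these articles:\n{format_context(docs)}\n\nQuestion: {question}"
```

With this in place, the call inside the context block could become `llm.invoke(build_prompt("What are the side effects?", docs))`, which also lets the model cite articles by number or PMID.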

Captured Trace

Trace: medical-bot / req-<uuid>
  ├── coalex_context (ROOT)
  │   ├── pubmed_search (RETRIEVER)
  │   │     query: "side effects of metformin"
  │   │     documents: 5
  │   └── ChatGoogleGenerativeAI (LLM)
  │         model: gemini-2.0-flash
  │         tokens_in: 1,245, tokens_out: 312
  └── evaluate (internal)

Evaluation Flow

The example also demonstrates the evaluate-resolve loop:

decision = coalex.evaluate(
    request_id=request_id,
    input={"question": "What are the side effects of metformin?"},
    output={"answer": response.content},
    metrics={"answer": ["semantic_similarity", "f1"]},
)

if decision.status == "escalated":
    print(f"Escalated for review: {decision.escalation_id}")
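
How Coalex computes the `f1` metric, and which reference answer it compares against, is not shown in this walkthrough. For intuition only, the sketch below implements token-level F1 as it is commonly defined for question-answering evaluation: the harmonic mean of precision and recall over the tokens shared between a predicted answer and a reference answer. The function names are hypothetical and the tokenizer is a deliberately simple assumption.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, comparing "nausea and diarrhea" with the reference "nausea, diarrhea" gives precision 2/3 and recall 1, so F1 is 0.8.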

Next Steps