Observability Metrics
Coalex automatically captures operational metrics for every trace — token usage, cost estimates, latency, and environmental impact.
Not to be confused with quality metrics
This page covers observability metrics (tokens, cost, latency). For quality metrics used in evaluations (F1, semantic similarity, etc.), see the Metrics Catalog.
Token Metrics
Captured automatically by auto-instrumentation for every LLM call.
| Metric | Unit | Description |
|---|---|---|
| `input_tokens` | count | Prompt/input token count |
| `output_tokens` | count | Completion/output token count |
| `total_tokens` | count | Sum of input + output tokens |
Cost Metrics
Estimated by the Transformer based on model pricing tables.
| Metric | Unit | Description |
|---|---|---|
| `cost_total` | USD | Estimated total cost for the LLM call |
| `cost_per_token` | USD | Average cost per token |
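
As a rough illustration of how a per-call estimate could be derived from a pricing table, here is a minimal sketch. The `PRICING_PER_1M` table, its prices, the model name, and the `estimate_cost` helper are all hypothetical. It also assumes that `cost_per_token` is `cost_total` divided by `total_tokens`, which this page does not confirm.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1M tokens; not real Coalex or vendor pricing.
PRICING_PER_1M = {
    "example-model": {"input": 1.00, "output": 3.00},
}


@dataclass
class CostEstimate:
    cost_total: float      # USD for the whole call
    cost_per_token: float  # USD, averaged over input + output tokens


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> CostEstimate:
    """Estimate the cost of one LLM call from its token counts (illustrative only)."""
    prices = PRICING_PER_1M[model]
    cost_total = (
        input_tokens * prices["input"] + output_tokens * prices["output"]
    ) / 1_000_000
    total_tokens = input_tokens + output_tokens
    # Assumption: the average is taken over all tokens; guard against empty calls.
    cost_per_token = cost_total / total_tokens if total_tokens else 0.0
    return CostEstimate(cost_total, cost_per_token)


estimate = estimate_cost("example-model", input_tokens=1_000, output_tokens=500)
print(f"{estimate.cost_total:.6f} USD total, {estimate.cost_per_token:.8f} USD/token")
```

Because input and output tokens are usually priced differently, the average cost per token shifts with the input/output mix even when the model stays the same.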
Performance Metrics
| Metric | Unit | Description |
|---|---|---|
| `latency` | ms | End-to-end span duration |
| `time_to_first_token` | ms | Time from request to first streamed token |
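
To make the two definitions concrete, the sketch below derives both values from a set of timestamps. The `SpanTimestamps` type and its field names are hypothetical and only mirror the descriptions in the table; how Coalex records these timestamps internally is not covered here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpanTimestamps:
    # Hypothetical timestamps in seconds, e.g. taken from time.monotonic().
    span_start: float
    request_sent: float
    first_token: Optional[float]  # None when the call is not streamed
    span_end: float


def latency_ms(ts: SpanTimestamps) -> float:
    """End-to-end span duration in milliseconds."""
    return (ts.span_end - ts.span_start) * 1000.0


def time_to_first_token_ms(ts: SpanTimestamps) -> Optional[float]:
    """Time from request to first streamed token, or None without streaming."""
    if ts.first_token is None:
        return None
    return (ts.first_token - ts.request_sent) * 1000.0


streamed = SpanTimestamps(
    span_start=10.0, request_sent=10.125, first_token=10.375, span_end=11.5
)
print(latency_ms(streamed), time_to_first_token_ms(streamed))
```

In this sketch a non-streamed call has no first-token timestamp, so `time_to_first_token_ms` returns `None` rather than reusing the full latency.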
Sustainability Metrics
Powered by the EcoLogits library (`ecologits`) and computed by the Transformer for every LLM call.
| Metric | Unit | Description |
|---|---|---|
| `energy` | kWh | Energy consumption |
| `gwp` | kgCO2eq | Global Warming Potential (carbon footprint) |
| `adpe` | kgSbeq | Abiotic Depletion Potential for Elements |
| `pe` | MJ | Primary Energy consumption |
Viewing Metrics
Dashboard
The admin dashboard displays metrics on the Agent Detail page:
- Token usage over time
- Cost breakdown by model
- Latency percentiles (P50, P95, P99)
- Sustainability impact summary
API
Query metrics for a specific agent:
curl "https://your-org.coalex.ai/v1/metrics?agent_id=support-bot" \
  -H "Authorization: Bearer $COALEX_API_KEY"
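
The same request could be issued from Python using only the standard library. This is a sketch under assumptions: the `build_metrics_request` and `fetch_metrics` helpers are hypothetical, and the response is assumed to be JSON, which this page does not state.

```python
import json
import os
import urllib.parse
import urllib.request


def build_metrics_request(base_url: str, agent_id: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for the metrics endpoint shown in the curl example."""
    query = urllib.parse.urlencode({"agent_id": agent_id})
    return urllib.request.Request(
        f"{base_url}/v1/metrics?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_metrics(request: urllib.request.Request):
    """Send the request and decode the body (assumes a JSON response)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


request = build_metrics_request(
    "https://your-org.coalex.ai",
    "support-bot",
    os.environ.get("COALEX_API_KEY", ""),
)
print(request.full_url)
# Call fetch_metrics(request) to perform the network request.
```

Building the query string with `urlencode` keeps agent IDs that contain special characters from breaking the URL.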
SQL (DuckLake)
SELECT
  metric_type,
  metric_id,
  AVG(value) AS avg_value,
  COUNT(*) AS samples
FROM metrics
WHERE agent_id = 'support-bot'
  AND created_at >= CURRENT_DATE - INTERVAL '7 days'
GROUP BY metric_type, metric_id
ORDER BY metric_type, metric_id;
Best Practices
- Monitor cost trends — Set up alerts when daily cost exceeds a threshold.
- Track token efficiency — Compare input/output token ratios across prompt versions.
- Use sustainability data — Report carbon footprint for EU AI Act compliance.
- Benchmark latency — Use P95 latency as your SLA metric, not average.
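
Two of the practices above can be sketched in a few lines: flagging days whose cost exceeds a threshold, and comparing P95 latency with the average. The helper names, sample values, and threshold are hypothetical, and the percentile uses the nearest-rank method, which may differ from how the dashboard computes its percentiles.

```python
import math
import statistics
from typing import List, Mapping, Sequence


def days_over_threshold(daily_cost_usd: Mapping[str, float], threshold_usd: float) -> List[str]:
    """Return the days (sorted) whose total cost exceeds the threshold."""
    return sorted(day for day, cost in daily_cost_usd.items() if cost > threshold_usd)


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical sample data for illustration only.
daily_cost_usd = {"day-1": 4.2, "day-2": 12.7, "day-3": 6.1}
latencies_ms = [120, 130, 125, 140, 135, 128, 132, 138, 122, 2500]

print("Days over budget:", days_over_threshold(daily_cost_usd, threshold_usd=10.0))
print("Mean latency:", statistics.mean(latencies_ms))
print("P50 latency:", percentile(latencies_ms, 50))
print("P95 latency:", percentile(latencies_ms, 95))
```

With one slow request in the sample, the mean sits well above the median yet far below the slowest call, while P95 reports the tail directly. That is the reason to prefer a percentile over the average for an SLA.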