Features
From real-time cost tracking to AI-driven optimization — one platform for your entire AI spend.
Analytics
Total cost, requests, latency, and token counts at a glance. Daily cost trend, cost by provider, budget status, and your top API keys by spend — all on one page.
Request and token volume over time, with a full breakdown by model or provider. See requests, input/output tokens, cost, and average latency for every model. Export to CSV or JSON.
AI-driven analysis of your usage patterns. Identifies model overkill, system prompt bloat, and cache opportunities, with estimated savings and one-click apply. Includes cost forecasting that compares the projected month against the previous month.
Full visibility into AI agent costs. See total tasks, total cost, average duration, and cost by agent type. Every task shows its type (Orchestrator, Standalone), request count, tokens, cost, and duration — with drill-down into the full agent tree.
Control
Three-level budget hierarchy: organization, team, and API key. Each level shows budget limit, spend this month, and usage percentage with a progress bar. Edit budgets inline — no separate settings page needed.
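As a rough illustration of how a usage percentage can be derived at each level of such a hierarchy, here is a minimal Python sketch. The `Budget` class, its field names, and the sample figures are hypothetical and are not taken from Wardis itself.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """One level of a hypothetical org -> team -> key hierarchy."""
    name: str
    limit_usd: float
    spent_usd: float

    def usage_pct(self) -> float:
        # Share of the monthly limit already spent; 0 when no limit is set.
        if self.limit_usd <= 0:
            return 0.0
        return 100.0 * self.spent_usd / self.limit_usd


def most_constrained(levels: list[Budget]) -> Budget:
    # The level with the highest usage percentage is the one closest to its limit.
    return max(levels, key=lambda b: b.usage_pct())


# Made-up sample data for the three levels.
levels = [
    Budget("org", 10_000.0, 4_200.0),
    Budget("team:search", 2_000.0, 1_500.0),
    Budget("key:prod-1", 500.0, 100.0),
]
tightest = most_constrained(levels)
print(tightest.name, round(tightest.usage_pct(), 1))
```

In this sample the team level is the tightest, at 75% of its limit, even though the organization as a whole is well under budget.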
Budget threshold alerts scoped to org, team, or individual key. Set custom thresholds, choose notification channels (email, Slack, webhook), and view alert history. Includes circuit breaker rules for automated protection.
Budget-triggered routing rules that kick in automatically as spend increases. Start with the best models at 0% of budget used, switch to cost-optimized models at 60%, force the cheapest models at 85%, and block non-critical requests at 95%.
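The tiering described above could be expressed as a simple threshold lookup. The sketch below is only an illustration: the tier names and the `routing_tier` function are hypothetical, and treating each threshold as inclusive is an assumption rather than documented behavior.

```python
# Thresholds come from the text above; tier names are illustrative only.
# Ordered from highest threshold to lowest so the first match wins.
TIERS = [
    (95.0, "block_non_critical"),
    (85.0, "cheapest"),
    (60.0, "cost_optimized"),
    (0.0, "best"),
]


def routing_tier(usage_pct: float) -> str:
    """Return the routing tier for a given budget usage percentage."""
    for threshold, tier in TIERS:
        # Assumption: a threshold applies as soon as usage reaches it.
        if usage_pct >= threshold:
            return tier
    # Negative input should not occur; fall back to the default tier.
    return "best"


for pct in (10.0, 60.0, 90.0, 97.5):
    print(pct, routing_tier(pct))
```

Ordering the table from the highest threshold down keeps the lookup to a single pass and makes it easy to add or adjust tiers.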
Gateway
Connect your AI providers through the dashboard. Add your API key for OpenAI, Anthropic, Google, or any other supported provider — Wardis handles tracking and cost management for all requests automatically.
Four routing strategies: cost optimized, latency optimized, quality optimized, and custom rules. Provider health monitoring from real proxy traffic. Add custom routing rules with fallback chains.
Reduce costs with intelligent request caching. Supports exact-match and semantic-similarity caching with a configurable similarity threshold. Set TTL, cache scope, and per-key overrides. Dashboard shows hit rate, total hits, and estimated savings.
Searchable pricing table for all supported models. Filter by provider and see input and output cost per million tokens. Pricing data is synced from the LiteLLM database.
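Per-million-token pricing converts to a per-request cost with simple arithmetic. The sketch below shows that calculation; the `request_cost` function is hypothetical and the prices in the example are made-up placeholders, not real pricing for any model.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: Decimal,
    output_price_per_m: Decimal,
) -> Decimal:
    """Cost in USD of one request, given prices per million tokens."""
    # Decimal avoids binary floating-point drift when summing many small costs.
    total = Decimal(input_tokens) * input_price_per_m + Decimal(output_tokens) * output_price_per_m
    return total / MILLION


# Placeholder prices: $2.50 per million input tokens, $10.00 per million output tokens.
cost = request_cost(1_200, 300, Decimal("2.50"), Decimal("10.00"))
print(cost)
```

With these placeholder prices, 1,200 input tokens and 300 output tokens each contribute $0.003, for a total of $0.006.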
Workspace
Create, edit, and revoke API keys. Each key belongs to a team and has its own budget limit and usage tracking. Keys are hashed with bcrypt — never stored in plain text.
Organize keys and users into teams. Each team has its own budget, usage tracking, and alert threshold. SaaS builders can map one team to each customer to track per-customer costs automatically.
Four roles — Owner, Admin, Developer, Viewer — each with specific permissions. See all members with their role, team assignments, and join date. Invite users by email. Full permission matrix for fine-grained access control.
Track every management action across your organization. Key operations, user changes, team changes — all logged with timestamp, actor, and details. Filter by time range.
Self-host Wardis in minutes. Free and open source.