Take control of your team's
AI coding spend
One self-hosted gateway in front of Claude Code, Cursor, and every OpenAI- or Anthropic-compatible tool. Per-seat budgets, model access grants, audit log, per-IDE cost attribution. Your providers, your infrastructure, your data.
30 minutes. Your providers. No slides.
Setup recipes shipped for the coding tools your team uses
- Claude Code
- Cursor
- Continue
- Zed
- aider
- opencode
- qwen-code
- codex
plus any OpenAI- or Anthropic-compatible SDK pointed at the gateway URL
Every control you need around AI coding spend
One closed-source binary on your infrastructure. Your providers, your data, your network.
Access grants
Grant specific models to users or groups. The gateway enforces the policy at request time, so an IDE never sees a model the user is not allowed to call.
Budget caps
Daily, weekly, and monthly USD caps per developer. Hard block when a window is exhausted. Email alerts at 50, 80, and 100 percent.
Rate limits
Cap requests, input tokens, output tokens, and concurrency per user or group. Group policies merge with user policies; the strictest cap wins.
Per-IDE and per-repo attribution
Cost broken out by coding tool and repo as built-in dashboard dimensions, not free-form tags. Filter by Claude Code on a specific project without custom queries.
Multi-provider
OpenAI, Anthropic, Azure, Bedrock, Google Generative AI, Vertex, Vertex Anthropic, Groq, Cohere, DeepSeek, plus any OpenAI-compatible endpoint. One gateway, your keys.
Secret blocking
Opt-in regex scan on prompt content. Catches AWS keys, GitHub tokens, Google API keys, and private-key blocks before they leave your network.
Audit log
Every admin mutation written synchronously. If the audit row cannot be written, the mutation fails. A complete trail for compliance reviews and per-team chargeback.
Runs on your infra
Runs on any cloud or on-prem, wherever you run containers. Three services: a database, the API, and a TLS reverse proxy. Your provider keys, your data, your network.
How it works
Four steps from a fresh Linux box to your first developer running Claude Code through the gateway.
Deploy on your infrastructure
Three small services on any cloud or on-prem: a database, the API, and a TLS reverse proxy. Pre-built container images, automatic Let's Encrypt cert. No outbound dependency, no data leaves your network.
Add your providers
Drop in keys for OpenAI, Anthropic, Azure, Bedrock, Google, Vertex, Groq, Cohere, DeepSeek, or any OpenAI-compatible endpoint. The seed catalog seeds pricing, so spend tracking starts on day one. Override with your negotiated rates anytime.
Grant access and set limits
Invite developers, group them, and grant model access by group. Set rate limits and budget caps per user or per group. Every change is written to the audit log synchronously.
Connect your coding tools
One CLI command wires Claude Code, opencode, qwen-code, and codex through the gateway. Per-seat key, gitignored config, hooks for repo attribution. Cursor and other GUI tools have copy-paste snippets on the same screen.
The developer CLI
One install, one sign-in, one command per coding tool. Idempotent and gitignore-aware.
Configures Claude Code, opencode, qwen-code, and codex. Cursor and other GUI-only tools get manual snippets on the Connect page.
Common questions
What is CodeVector?
CodeVector is a self-hosted AI coding governance platform for engineering teams. It proxies coding tools (Claude Code, Cursor, Continue, Zed, aider, opencode, qwen-code, codex) to LLM providers with per-developer access grants, budget caps, audit logging, secret-pattern blocking, and per-IDE / per-repo cost attribution.
How is CodeVector different from LiteLLM?
LiteLLM is open-source and free at the binary level; the total cost is your engineering team's time to operate and maintain it. CodeVector is closed-source commercial software with a contractual 48-hour provider-drift SLA, shipped setup recipes for eight coding tools, and built-in per-IDE / per-repo cost attribution that you can filter by in the dashboard without custom queries.
Can CodeVector run on our own infrastructure?
Yes. CodeVector is self-hosted by default. It runs on any cloud or on-prem, anywhere you run containers, as a small stack of three services: a database, the API, and a TLS reverse proxy. Provider keys, prompts, and usage data never leave your network.
Which coding tools does CodeVector support?
CodeVector ships setup recipes for Claude Code, Cursor, Continue, Zed, aider, opencode, qwen-code, and codex. Any tool that speaks the OpenAI chat-completions or Anthropic messages protocol works through the gateway, exposed at /gateway/openai/v1/* and /gateway/anthropic/v1/*.
How does CodeVector enforce budgets?
Each developer has daily, weekly, and monthly USD spend caps. The gateway tracks spend in real time. When a window's cap is exceeded, further requests are blocked until the window rolls over. Email alerts fire at 50%, 80%, and 100% thresholds.
Self-hosted, single tenant, your data
Runs on any cloud or on-prem, anywhere you run containers. Your provider keys stay encrypted in your database. No SaaS control plane, no third-party data egress.
48-hour provider-drift SLA
A new model from OpenAI, Anthropic, or Bedrock should not break your team's tools. We commit to a fix in 48 hours, backed by weekly contract tests against every supported provider.
Audit log, secret scanning, envelope-encrypted credentials
Every admin mutation is logged synchronously. Provider keys are wrapped per-row with a data key, then a key-encryption key. Rotation is a documented runbook.
See it on your own providers
A 30-minute walkthrough on your real providers and coding tools. We bring the deployment. You bring an OpenAI or Anthropic key.