Take control of your team's
AI coding spend

One self-hosted gateway in front of Claude Code, Cursor, and every OpenAI- or Anthropic-compatible tool. Per-seat budgets, model access grants, audit log, per-IDE cost attribution. Your providers, your infrastructure, your data.

Read the docs

30 minutes. Your providers. No slides.

CodeVector admin overview: network-wide spend, requests, token consumption charts, and recent admin activity

Setup recipes shipped for the coding tools your team uses

Claude Code
Cursor
Continue
Zed
aider
opencode
qwen-code
codex

plus any OpenAI- or Anthropic-compatible SDK pointed at the gateway URL

Provider integrations

200+

Models in seed catalog

10 min

From zero to first request

Coding tools with shipped setup

Every control you need around AI coding spend

One closed-source binary on your infrastructure. Your providers, your data, your network.

Access grants

Grant specific models to users or groups. The gateway enforces the policy at request time, so an IDE never sees a model the user is not allowed to call.

Budget caps

Daily, weekly, and monthly USD caps per developer. Hard block when a window is exhausted. Email alerts at 50, 80, and 100 percent.

Rate limits

Cap requests, input tokens, output tokens, and concurrency per user or group. Group policies merge with user policies; the strictest cap wins.

Per-IDE and per-repo attribution

Cost broken out by coding tool and repo as built-in dashboard dimensions, not free-form tags. Filter by Claude Code on a specific project without custom queries.

Multi-provider

OpenAI, Anthropic, Azure, Bedrock, Google Generative AI, Vertex, Vertex Anthropic, Groq, Cohere, DeepSeek, plus any OpenAI-compatible endpoint. One gateway, your keys.

Secret blocking

Opt-in regex scan on prompt content. Catches AWS keys, GitHub tokens, Google API keys, and private-key blocks before they leave your network.

Audit log

Every admin mutation written synchronously. If the audit row cannot be written, the mutation fails. A complete trail for compliance reviews and per-team chargeback.

Runs on your infra

Runs on any cloud or on-prem, wherever you run containers. Three services: a database, the API, and a TLS reverse proxy. Your provider keys, your data, your network.

How it works

Four steps from a fresh Linux box to your first developer running Claude Code through the gateway.

Step 1

Deploy on your infrastructure

Three small services on any cloud or on-prem: a database, the API, and a TLS reverse proxy. Pre-built container images, automatic Let's Encrypt cert. No outbound dependency, no data leaves your network.

# Three small container services on your infra: # 1. Pull the prebuilt images # 2. Run the database migration # 3. Start the stack # TLS auto-provisions on first request. # Runs on AWS, GCP, Azure, on-prem, # or any host with a container runtime.

Providers list in the admin console showing OpenAI, Anthropic, Moonshot, and a custom OpenAI-compatible provider, all marked active

Step 2

Add your providers

Drop in keys for OpenAI, Anthropic, Azure, Bedrock, Google, Vertex, Groq, Cohere, DeepSeek, or any OpenAI-compatible endpoint. The seed catalog seeds pricing, so spend tracking starts on day one. Override with your negotiated rates anytime.

Step 3

Grant access and set limits

Invite developers, group them, and grant model access by group. Set rate limits and budget caps per user or per group. Every change is written to the audit log synchronously.

Models list in the admin console showing claude-3-5-haiku, claude-3-5-sonnet, claude-3-7-sonnet, claude-haiku-4-5, and other Claude model facades

Connect page in the gateway console showing CLI install, sign-in, configure, and verify steps with copy buttons

Step 4

Connect your coding tools

One CLI command wires Claude Code, opencode, qwen-code, and codex through the gateway. Per-seat key, gitignored config, hooks for repo attribution. Cursor and other GUI tools have copy-paste snippets on the same screen.

The developer CLI

One install, one sign-in, one command per coding tool. Idempotent and gitignore-aware.

# Install the CLI (Node 20+) pnpm add -g @codevector/cli # Sign in to your gateway codevector auth login --gateway-url https://gateway.your-company.com # Wire every supported tool at once codevector configure --all --scope local # Confirm reachability and config parsing codevector doctor # Check your spend for the last week codevector usage --days 7

Configures Claude Code, opencode, qwen-code, and codex. Cursor and other GUI-only tools get manual snippets on the Connect page.

Common questions

What is CodeVector?

CodeVector is a self-hosted AI coding governance platform for engineering teams. It proxies coding tools (Claude Code, Cursor, Continue, Zed, aider, opencode, qwen-code, codex) to LLM providers with per-developer access grants, budget caps, audit logging, secret-pattern blocking, and per-IDE / per-repo cost attribution.

How is CodeVector different from LiteLLM?

LiteLLM is open-source and free at the binary level; the total cost is your engineering team's time to operate and maintain it. CodeVector is closed-source commercial software with a contractual 48-hour provider-drift SLA, shipped setup recipes for eight coding tools, and built-in per-IDE / per-repo cost attribution that you can filter by in the dashboard without custom queries.

Can CodeVector run on our own infrastructure?

Yes. CodeVector is self-hosted by default. It runs on any cloud or on-prem, anywhere you run containers, as a small stack of three services: a database, the API, and a TLS reverse proxy. Provider keys, prompts, and usage data never leave your network.

Which coding tools does CodeVector support?

CodeVector ships setup recipes for Claude Code, Cursor, Continue, Zed, aider, opencode, qwen-code, and codex. Any tool that speaks the OpenAI chat-completions or Anthropic messages protocol works through the gateway, exposed at /gateway/openai/v1/* and /gateway/anthropic/v1/*.

How does CodeVector enforce budgets?

Each developer has daily, weekly, and monthly USD spend caps. The gateway tracks spend in real time. When a window's cap is exceeded, further requests are blocked until the window rolls over. Email alerts fire at 50%, 80%, and 100% thresholds.

Runs on your infra

Self-hosted, single tenant, your data

Runs on any cloud or on-prem, anywhere you run containers. Your provider keys stay encrypted in your database. No SaaS control plane, no third-party data egress.

Built for procurement

48-hour provider-drift SLA

A new model from OpenAI, Anthropic, or Bedrock should not break your team's tools. We commit to a fix in 48 hours, backed by weekly contract tests against every supported provider.

Compliance-ready

Audit log, secret scanning, envelope-encrypted credentials

Every admin mutation is logged synchronously. Provider keys are wrapped per-row with a data key, then a key-encryption key. Rotation is a documented runbook.

See it on your own providers

A 30-minute walkthrough on your real providers and coding tools. We bring the deployment. You bring an OpenAI or Anthropic key.

Take control of your team's AI coding spend

Every control you need around AI coding spend

Access grants

Budget caps

Rate limits

Per-IDE and per-repo attribution

Multi-provider

Secret blocking

Audit log

Runs on your infra

How it works

Deploy on your infrastructure

Add your providers

Grant access and set limits

Connect your coding tools

The developer CLI

Common questions

Self-hosted, single tenant, your data

48-hour provider-drift SLA

Audit log, secret scanning, envelope-encrypted credentials

See it on your own providers

Take control of your team's
AI coding spend