This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
MaaS-Base is an open-source GPU cluster manager for AI model deployment. It orchestrates inference engines (vLLM, SGLang, TensorRT-LLM, etc.) across GPU clusters, providing multi-cluster management, load balancing, monitoring, and access control.
Tech stack: Python 3.10–3.12, FastAPI, SQLModel, Pydantic, uv (package manager), hatchling (build), Alembic (migrations), pytest, Higress (API gateway).
gpustack/
├── api/ # REST API layer (auth, middlewares, tenant, OpenAI extensions)
├── client/ # Generated + custom HTTP clients for server/worker communication
├── cloud_providers/ # Cloud provider integrations (DigitalOcean, etc.)
├── cmd/ # CLI subcommands (version, db migration, admin reset, etc.)
├── codegen/ # OpenAPI client code generation
├── config/ # Configuration and registration logic
├── detectors/ # GPU/device detection (fastfetch, runtime, custom)
├── envs/ # Environment variable management
├── exporter/ # Prometheus metrics exporting
├── gateway/ # Higress AI gateway integration (routing, plugins, k8s CRDs)
├── http_proxy/ # Load balancing and proxy strategies
├── k8s/ # Kubernetes manifest templates
├── migrations/ # Alembic database migrations
├── mixins/ # SQLAlchemy mixins (active record, timestamps)
├── policies/ # Scheduling policies (resource fit selectors for various backends)
├── routes/ # HTTP route handlers
├── schemas/ # Database models / SQLModel schemas
├── server/ # Server components (scheduler, controllers, API server)
├── worker/ # Worker components (runtime, serving manager, metric exporter)
├── websocket_proxy/ # WebSocket proxying
├── main.py # Entry point (`gpustack` CLI command)
└── security.py # Security utilities
Key components:
`gpustack/migrations/`.

uv package manager (auto-installed via `make install`)

| Command | Description |
|---|---|
| `make install` | Install uv, sync dependencies, set up pre-commit hooks |
| `make deps` | Sync and lock dependencies with uv |
| `make generate` | Generate code (OpenAPI client, etc.) |
| `make lint` | Run pre-commit checks (flake8, black, etc.) |
| `make test` | Run pytest |
| `make build` | Build wheel package (outputs to `dist/`) |
| `make build-docs` | Build documentation (Linux/macOS only) |
| `make serve-docs` | Serve documentation locally (Linux/macOS only) |
| `make package` | Build container images (Linux/macOS only) |
| `make ci` | Full CI pipeline: install → deps → lint → test → build |
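
The stage order listed for `make ci` can be sketched as a fail-fast sequence. This is only an illustration, assuming each stage maps to the make target of the same name and that a failing stage stops the rest; `CI_TARGETS` and `run_ci` are hypothetical names, not part of the repository, and the real Makefile may wire the stages differently.

```python
import subprocess
from typing import Callable, List, Sequence

# Order taken from the `make ci` description: install → deps → lint → test → build.
CI_TARGETS: List[str] = ["install", "deps", "lint", "test", "build"]


def run_ci(
    targets: Sequence[str] = CI_TARGETS,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Run each make target in order and stop at the first failure.

    Returns the targets that completed successfully. `runner` is injectable
    so the sequence can be exercised without actually invoking make.
    """
    completed: List[str] = []
    for target in targets:
        # A non-zero exit code aborts the remaining stages (fail-fast).
        if runner(["make", target]) != 0:
            break
        completed.append(target)
    return completed
```

Calling `run_ci()` with the default runner shells out to `make` for each stage, so a lint failure means `test` and `build` are never attempted.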
# Start in disabled gateway mode for development
uv run gpustack start --database-url postgresql://postgres:mysecretpassword@localhost:5432/postgres --gateway-mode disabled --api-port 80
uv add <package> # runtime dependency
uv add --dev <package> # dev/test dependency
uv run pytest tests/path/to/test_file.py -k test_name
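
To illustrate how `-k` narrows a run, here is a minimal sketch of a test module. The helper `parse_port` and both test names are hypothetical and exist only for this example; `tests/path/to/test_file.py` is the placeholder path from the command above.

```python
# tests/path/to/test_file.py  (placeholder path, not a real file in the repo)


def parse_port(value: str) -> int:
    """Hypothetical helper under test: parse a TCP port from a string."""
    port = int(value)  # int() tolerates surrounding whitespace
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def test_parse_port_valid():
    assert parse_port("80") == 80


def test_parse_port_strips_whitespace():
    assert parse_port(" 8080 ") == 8080
```

With this file, `uv run pytest tests/path/to/test_file.py -k valid` would select only `test_parse_port_valid`, because `-k` matches its expression against test names as a substring.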
- uv for dependency management (not pip directly). `pyproject.toml` is the source of truth.
- `gpustack/migrations/versions/`. Use Alembic for schema changes.
- `hack/windows/*.ps1` scripts, but worker nodes require Linux.
- `gpustack/community-inference-backends` repo during `make install`.