1 week ago · efd04b3d99
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
															 ## Project Overview
														
 
															-MASS-Base is an open-source GPU cluster manager for AI model deployment. It orchestrates inference engines (vLLM, SGLang, TensorRT-LLM, etc.) across GPU clusters, providing multi-cluster management, load balancing, monitoring, and access control.
														
 
															+MaaS-Base is an open-source GPU cluster manager for AI model deployment. It orchestrates inference engines (vLLM, SGLang, TensorRT-LLM, etc.) across GPU clusters, providing multi-cluster management, load balancing, monitoring, and access control.
														
 
															 **Tech stack:** Python 3.10–3.12, FastAPI, SQLModel, Pydantic, uv (package manager), hatchling (build), Alembic (migrations), pytest, Higress (API gateway).
														
@@ -38,7 +38,7 @@ gpustack/
 
															 **Key components:**
														
 
															 - **Server:** API Server (FastAPI) + Scheduler + Controllers. Handles model instance assignment and resource state management.
														
 
															-- **Worker:** MASS-Base Runtime + Serving Manager + Metric Exporter. Manages model instance lifecycle on GPU nodes.
														
 
															+- **Worker:** MaaS-Base Runtime + Serving Manager + Metric Exporter. Manages model instance lifecycle on GPU nodes.
														
 
															 - **AI Gateway:** Uses Higress for API routing and load balancing.
														
 
															 - **Database:** Embedded PostgreSQL by default; external PostgreSQL/MySQL supported. Alembic for migrations under `gpustack/migrations/`.
														
--- a/docs/api-reference.md
+++ b/docs/api-reference.md
@@ -1,5 +1,5 @@
 
															 # API Reference
														
 
															-MASS-Base provides a built-in Swagger UI. You can access it by navigating to `<gpustack-server-url>/docs` in your browser to view and interact with the APIs.
														
 
															+MaaS-Base provides a built-in Swagger UI. You can access it by navigating to `<gpustack-server-url>/docs` in your browser to view and interact with the APIs.
														
 
															 ![Swagger UI](assets/swagger-ui.png)
														
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -1,6 +1,6 @@
 
															 # Architecture
														
 
															-The diagram below provides a high-level view of the MASS-Base architecture.
														
 
															+The diagram below provides a high-level view of the MaaS-Base architecture.
														
 
															 ![gpustack-v2-architecture](assets/gpustack-v2-architecture.png)
														
@@ -10,7 +10,7 @@ The diagram below details the internal components and their interactions.
 
															 ### Server
														
 
															-The MASS-Base server consists of the following components:
														
 
															+The MaaS-Base server consists of the following components:
														
 
															 - **API Server:** Provides a RESTful interface for clients to interact with the system. It handles authentication and authorization.
														
 
															 - **Scheduler:** Responsible for assigning model instances to workers.
														
@@ -18,24 +18,24 @@ The MASS-Base server consists of the following components:
 
															 ### Worker
														
 
															-The MASS-Base worker consists of the following components:
														
 
															+The MaaS-Base worker consists of the following components:
														
 
															-- **MASS-Base Runtime:** Detects GPU devices and interacts with the container runtime to deploy model instances.
														
 
															+- **MaaS-Base Runtime:** Detects GPU devices and interacts with the container runtime to deploy model instances.
														
 
															 - **Serving Manager:** Manages the lifecycle of model instances on the worker.
														
 
															 - **Metric Exporter:** Exports metrics about the model instances and their performance.
														
 
															 ### AI Gateway
														
 
															-The AI Gateway handles incoming API requests from clients. It routes requests to the appropriate model instances based on the requested model. MASS-Base uses [Higress](https://github.com/alibaba/higress) for API routing and load balancing.
														
 
															+The AI Gateway handles incoming API requests from clients. It routes requests to the appropriate model instances based on the requested model. MaaS-Base uses [Higress](https://github.com/alibaba/higress) for API routing and load balancing.
														
 
															 ### SQL Database
														
 
															-The MASS-Base server connects to a SQL database as the datastore. MASS-Base uses an Embedded PostgreSQL by default, but you can configure it to use an external PostgreSQL or MySQL as well.
														
 
															+The MaaS-Base server connects to a SQL database as the datastore. MaaS-Base uses an Embedded PostgreSQL by default, but you can configure it to use an external PostgreSQL or MySQL as well.
														
 
															 ### Inference Server
														
 
															-Inference servers are the backends that perform the inference tasks. MASS-Base supports [vLLM](https://github.com/vllm-project/vllm), [SGLang](https://github.com/sgl-project/sglang), [Ascend MindIE](https://www.hiascend.com/en/software/mindie) and [VoxBox](https://github.com/gpustack/vox-box) as the built-in inference server. You can also add custom inference backends.
														
 
															+Inference servers are the backends that perform the inference tasks. MaaS-Base supports [vLLM](https://github.com/vllm-project/vllm), [SGLang](https://github.com/sgl-project/sglang), [Ascend MindIE](https://www.hiascend.com/en/software/mindie) and [VoxBox](https://github.com/gpustack/vox-box) as the built-in inference server. You can also add custom inference backends.
														
 
															 ### Ray
														
 
															-[Ray](https://ray.io) is a distributed computing framework that MASS-Base utilizes to run distributed vLLM. MASS-Base bootstraps Ray cluster on-demand to run distributed vLLM across multiple workers.
														
 
															+[Ray](https://ray.io) is a distributed computing framework that MaaS-Base utilizes to run distributed vLLM. MaaS-Base bootstraps Ray cluster on-demand to run distributed vLLM across multiple workers.
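The AI Gateway paragraph in `docs/architecture.md` says requests are routed to model instances based on the requested model, with load balancing. The sketch below illustrates that idea with plain round-robin selection. It is not how Higress works internally; the `ModelRouter` class, the model names, and the addresses are all hypothetical and used only for illustration.

```python
from itertools import cycle


class ModelRouter:
    """Toy router: picks an instance for a requested model in round-robin order."""

    def __init__(self, instances_by_model):
        # Map each model name to an endless round-robin iterator over its instances.
        # Models with no instances are skipped, so routing to them fails loudly.
        self._cycles = {
            model: cycle(instances)
            for model, instances in instances_by_model.items()
            if instances
        }

    def route(self, model):
        """Return the next instance address for `model`, or raise LookupError."""
        try:
            return next(self._cycles[model])
        except KeyError:
            raise LookupError(f"no running instance for model {model!r}") from None


# Hypothetical models and instance addresses.
router = ModelRouter({
    "model-a": ["10.0.0.1:8000", "10.0.0.2:8000"],
    "model-b": ["10.0.0.3:8000"],
})
picks = [router.route("model-a") for _ in range(3)]
```

A real gateway would also account for instance health and load; round-robin is simply the smallest policy that shows per-model routing.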
														
--- a/docs/cli-reference/reload-config.md
+++ b/docs/cli-reference/reload-config.md
@@ -19,5 +19,5 @@ gpustack reload-config [OPTIONS]
 
															 | `--file` value                      | (empty)                                | Load configuration from a YAML file. Only whitelisted fields are applied. Keys are normalized to snake_case. Values provided via `--set` override those from the file.                                                                                                                                                                         |
														
 
															 | `--list`                            | `False`                                | Show whitelisted fields and values. When present, other options are ignored.                                                                                                                                                                                                                                                                   |
														
 
															 | `--api-key` value                   | (empty)                                | When force-auth-localhost is enabled, provide an API key for server-side authentication as an admin user.                                                                                                                                                                                                                                      |
														
 
															-| `--server-port` value               | `30080`                                | Target port of the MASS-Base API server for applying or listing runtime config. When omitted, defaults to `GPUSTACK_API_PORT` if set, otherwise `30080`.                                                                                                                                                                                        |
														
 
															-| `--worker-port` value               | `10150`                                | Target port of the MASS-Base worker for applying or listing runtime config. When omitted, defaults to `GPUSTACK_WORKER_PORT` if set, otherwise `10150`.                                                                                                                                                                                         |
														
 
															+| `--server-port` value               | `30080`                                | Target port of the MaaS-Base API server for applying or listing runtime config. When omitted, defaults to `GPUSTACK_API_PORT` if set, otherwise `30080`.                                                                                                                                                                                        |
														
 
															+| `--worker-port` value               | `10150`                                | Target port of the MaaS-Base worker for applying or listing runtime config. When omitted, defaults to `GPUSTACK_WORKER_PORT` if set, otherwise `10150`.                                                                                                                                                                                         |
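The two rows above describe a three-step fallback: an explicit option, then an environment variable, then a fixed default. A minimal sketch of that precedence follows. The helper name `resolve_port` is hypothetical, and treating an empty variable as "not set" is an assumption; only the variable names and default ports come from the table.

```python
import os


def resolve_port(cli_value, env_name, fallback, environ=None):
    """Pick a port: explicit CLI value, then the environment variable, then the fallback."""
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return int(cli_value)
    env_value = environ.get(env_name)
    if env_value:  # assumption: an empty variable counts as "not set"
        return int(env_value)
    return fallback


# Nothing passed, nothing in the environment: the documented default applies.
server_port = resolve_port(None, "GPUSTACK_API_PORT", 30080, environ={})
# Nothing passed, but the environment variable is set: it takes effect.
worker_port = resolve_port(
    None, "GPUSTACK_WORKER_PORT", 10150, environ={"GPUSTACK_WORKER_PORT": "10151"}
)
```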
														
--- a/docs/cli-reference/start.md
+++ b/docs/cli-reference/start.md
@@ -5,7 +5,7 @@ hide:
 
															 # gpustack start
														
 
															-Run MASS-Base server or worker.
														
 
															+Run MaaS-Base server or worker.
														
 
															 ```bash
														
 
															 gpustack start [OPTIONS]
														
@@ -44,9 +44,9 @@ gpustack start [OPTIONS]
 
															 | `--huggingface-token` value                 | (empty)                                | User Access Token to authenticate to the Hugging Face Hub. Can also be configured via the `HF_TOKEN` environment variable.            |
														
 
															 | `--bin-dir` value                           | (empty)                                | Directory to store additional binaries, e.g., versioned backend executables.                                                          |
														
 
															 | `--pipx-path` value                         | (empty)                                | Path to the pipx executable, used to install versioned backends.                                                                      |
														
 
															-| `--system-default-container-registry` value | `docker.io`                            | Default container registry for MASS-Base to pull system and inference images.                                                          |
														
 
															-| `--image-name-override` value               | (empty)                                | Override the default image name for the MASS-Base container.                                                                           |
														
 
															-| `--image-repo` value                        | `gpustack/gpustack`                    | Override the default image repository for the MASS-Base container.                                                                     |
														
 
															+| `--system-default-container-registry` value | `docker.io`                            | Default container registry for MaaS-Base to pull system and inference images.                                                          |
														
 
															+| `--image-name-override` value               | (empty)                                | Override the default image name for the MaaS-Base container.                                                                           |
														
 
															+| `--image-repo` value                        | `gpustack/gpustack`                    | Override the default image repository for the MaaS-Base container.                                                                     |
														
 
															 | `--gateway-mode` value                      | `auto`                                 | Gateway running mode. Options: embedded, in-cluster, external, disabled, or auto (default).                                           |
														
 
															 | `--gateway-kubeconfig` value                | (empty)                                | Path to the kubeconfig file for gateway. Only useful for external gateway-mode.                                                       |
														
 
															 | `--gateway-namespace` value                 | `higress-system`                       | The namespace where the gateway component is deployed.                                                                                |
														
@@ -59,13 +59,13 @@ gpustack start [OPTIONS]
 
															 | ------------------------------------------------ | ------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
														
 
															 | `--port` value                                   | `80`                                             | Port to bind the server to.                                                                                                                                                                                                                                                                                                                   |
														
 
															 | `--tls-port` value                               | `443`                                            | Port to bind the TLS server to.                                                                                                                                                                                                                                                                                                               |
														
 
															-| `--api-port` value                               | `30080`                                          | Port to bind the MASS-Base API server to.                                                                                                                                                                                                                                                                                                      |
														
 
															-| `--proxy-port` value                             | `30079`                                          | Port to bind the MASS-Base proxy server to.                                                                                                                                                                                                                                                                                                    |
														
 
															+| `--api-port` value                               | `30080`                                          | Port to bind the MaaS-Base API server to.                                                                                                                                                                                                                                                                                                      |
														
 
															+| `--proxy-port` value                             | `30079`                                          | Port to bind the MaaS-Base proxy server to.                                                                                                                                                                                                                                                                                                    |
														
 
															 | `--database-port` value                          | `5432`                                           | Port of the embedded PostgresSQL database.                                                                                                                                                                                                                                                                                                    |
														
 
															 | `--metrics-port` value                           | `10161`                                          | Port to expose server metrics.                                                                                                                                                                                                                                                                                                                |
														
 
															 | `--disable-metrics`                              | `False`                                          | Disable server metrics.                                                                                                                                                                                                                                                                                                                       |
														
 
															-| `--disable-worker`                               | (empty)                                          | (DEPRECATED) Disable the embedded worker for the MASS-Base server. New installations will not have the embedded worker by default. Use '--enable-worker' to enable the embedded worker if needed. If neither flag is set, for backward compatibility, the embedded worker will be enabled by default for legacy installations prior to v2.0.1. |
														
 
															-| `--enable-worker`                                | `False`                                          | Enable the embedded worker for the MASS-Base server.                                                                                                                                                                                                                                                                                           |
														
 
															+| `--disable-worker`                               | (empty)                                          | (DEPRECATED) Disable the embedded worker for the MaaS-Base server. New installations will not have the embedded worker by default. Use '--enable-worker' to enable the embedded worker if needed. If neither flag is set, for backward compatibility, the embedded worker will be enabled by default for legacy installations prior to v2.0.1. |
														
 
															+| `--enable-worker`                                | `False`                                          | Enable the embedded worker for the MaaS-Base server.                                                                                                                                                                                                                                                                                           |
														
 
															 | `--bootstrap-password` value                     | Auto-generated.                                  | Initial password for the default admin user.                                                                                                                                                                                                                                                                                                  |
														
 
															 | `--database-url` value                           | Embedded PostgreSQL.                             | URL of the database. Supports PostgreSQL 13.0+, and MySQL 8.0.36+. Example: postgresql://user:password@host:port/db_name or mysql://user:password@host:port/db_name                                                                                                                                                                           |
														
 
															 | `--ssl-keyfile` value                            | (empty)                                          | Path to the SSL key file.                                                                                                                                                                                                                                                                                                                     |
														
@@ -129,7 +129,7 @@ gpustack start [OPTIONS]
 
															 | `--enable-hf-xet`                        | `False`                                | [Deprecated] Enable downloading model files using Hugging Face Xet.                                                                                                                             |
														
 
															 | `--worker-ifname` value                  | (empty)                                | Network interface name of the worker node. Auto-detected by default.                                                                                                                            |
														
 
															 | `--proxy-mode` value                     | (empty)                                | Proxy mode for server accessing model instances: direct (server connects directly) or worker (via worker proxy). Default value is direct for embedded worker, and worker for standalone worker. |
														
 
															-| `--benchmark-image-repo` value           | `gpustack/benchmark-runner`            | Override the default benchmark image repo for the MASS-Base benchmark container.                                                                                                                 |
														
 
															+| `--benchmark-image-repo` value           | `gpustack/benchmark-runner`            | Override the default benchmark image repo for the MaaS-Base benchmark container.                                                                                                                 |
														
 
															 | `--benchmark-dir` value                  | `<data-dir>/benchmarks`                | Directory to store benchmark results.                                                                                                                                                           |
														
 
															 | `--benchmark-max-duration-seconds` value | (empty)                                | Max duration for a benchmark before timeout. Disabled when empty.                                                                                                                               |
														
@@ -141,7 +141,7 @@ For environment variables beyond the command-line parameters mentioned above, pl
 
															 ## Config File
														
 
															-You can configure start options using a YAML-format config file when starting MASS-Base server or worker. Here is a complete example:
														
 
															+You can configure start options using a YAML-format config file when starting MaaS-Base server or worker. Here is a complete example:
														
 
															 ```yaml
														
 
															 # Common Options
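The `--disable-worker` / `--enable-worker` rows in `docs/cli-reference/start.md` describe a small decision rule that is easy to misread. The sketch below restates it. The function name is hypothetical, how a "legacy installation prior to v2.0.1" is detected is not shown in this diff (so it is passed in as a boolean), and rejecting both flags together is an assumption rather than documented behaviour.

```python
def embedded_worker_enabled(enable_worker=False, disable_worker=False, legacy_install=False):
    """Decide whether the server should start its embedded worker."""
    if enable_worker and disable_worker:
        # Assumption: the docs do not say which flag wins, so refuse to guess.
        raise ValueError("--enable-worker and --disable-worker are mutually exclusive")
    if disable_worker:  # deprecated flag, still honoured
        return False
    if enable_worker:
        return True
    # Neither flag set: only legacy installations keep the embedded worker.
    return legacy_install


new_install = embedded_worker_enabled()
legacy_default = embedded_worker_enabled(legacy_install=True)
legacy_disabled = embedded_worker_enabled(disable_worker=True, legacy_install=True)
```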
														
--- a/docs/contributing.md
+++ b/docs/contributing.md
@@ -1,6 +1,6 @@
 
															-# Contributing to MASS-Base
														
 
															+# Contributing to MaaS-Base
														
 
															-Thanks for taking the time to contribute to MASS-Base!
														
 
															+Thanks for taking the time to contribute to MaaS-Base!
														
 
															 Please review and follow the [Code of Conduct](./code-of-conduct.md).
														
@@ -10,7 +10,7 @@ If you find any bugs or are having any trouble, please search the reported issue
 
															 If you can't find anything related to your issue, contact us by filing an issue. To help us diagnose and resolve, please include as much information as possible, including:
														
 
															-- Software: MASS-Base version, installation method, operating system info, etc.
														
 
															+- Software: MaaS-Base version, installation method, operating system info, etc.
														
 
															 - Hardware: Node info, GPU info, etc.
														
 
															 - Steps to reproduce: Provide as much detail on how you got into the reported situation.
														
 
															 - Logs: Please include any relevant logs, such as server logs, worker logs, etc.
														
--- a/docs/deployment/docker-compose.md
+++ b/docs/deployment/docker-compose.md
@@ -1,6 +1,6 @@
 
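															 # Docker Compose Deployment Guide
														
 
															-This document describes how to deploy the MASS-Base platform with Docker Compose. Docker Compose is suited to single-machine deployments, development and testing, and small teams.
														
 
															+This document describes how to deploy the MaaS-Base platform with Docker Compose. Docker Compose is suited to single-machine deployments, development and testing, and small teams.
														
 
															 ## Prerequisites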
														
@@ -41,8 +41,8 @@ docker compose -f docker-compose.server.yaml up -d
 
															 This command starts the following two containers:
														
 
															-- **`mass-base-db`**: PostgreSQL 16 database
														
 
															-- **`mass-base-server`**: MASS-Base Server (image built automatically from `pack/Dockerfile`)
														
 
															+- **`maas-base-db`**: PostgreSQL 16 database
														
 
															+- **`maas-base-server`**: MaaS-Base Server (image built automatically from `pack/Dockerfile`)
														
 
															 ### 3. Verify the deployment
														
@@ -51,13 +51,13 @@ docker compose -f docker-compose.server.yaml up -d
 
															 docker compose -f docker-compose.server.yaml ps
														
 
															 # View Server logs
														
 
															-docker compose -f docker-compose.server.yaml logs -f mass-base-server
														
 
															+docker compose -f docker-compose.server.yaml logs -f maas-base-server
														
 
															 ```
														
 
															 ### 4. Get the initial admin password
														
 
															 ```bash
														
 
															-docker exec mass-base-server cat /var/lib/mass-base/initial_admin_password
														
 
															+docker exec maas-base-server cat /var/lib/maas-base/initial_admin_password
														
 
															 ```
														
 
															 ### 5. Access the UI
														
@@ -83,23 +83,23 @@ docker compose -f docker-compose.external-observability.yaml up -d
 
															 This command starts the following four containers:
														
 
															-- **`mass-base-db`**: PostgreSQL 16 database
														
 
															-- **`mass-base-server`**: MASS-Base Server
														
 
															-- **`mass-base-prometheus`**: Prometheus metrics collection
														
 
															-- **`mass-base-grafana`**: Grafana monitoring dashboards
														
 
															+- **`maas-base-db`**: PostgreSQL 16 database
														
 
															+- **`maas-base-server`**: MaaS-Base Server
														
 
															+- **`maas-base-prometheus`**: Prometheus metrics collection
														
 
															+- **`maas-base-grafana`**: Grafana monitoring dashboards
														
 
															 ### 3. Access the services
														
 
															 | Service | URL | Default credentials |
														
 
															 |------|------|---------|
														
 
															-| MASS-Base UI | `http://<SERVER_IP>:80` | admin / initial password |
														
 
															+| MaaS-Base UI | `http://<SERVER_IP>:80` | admin / initial password |
														
 
															 | Prometheus | `http://<SERVER_IP>:9090` | None |
														
 
															 | Grafana | `http://<SERVER_IP>:3000` | admin / grafana |
														
 
															 ### 4. Get the initial admin password
														
 
															 ```bash
														
 
															-docker exec mass-base-server cat /var/lib/mass-base/initial_admin_password
														
 
															+docker exec maas-base-server cat /var/lib/maas-base/initial_admin_password
														
 
															 ```
														
 
															 ---
														
@@ -196,7 +196,7 @@ docker compose -f docker-compose.server.yaml down -v
 
															 docker compose -f docker-compose.server.yaml logs -f
														
 
															 # View Server logs only
														
 
															-docker compose -f docker-compose.server.yaml logs -f mass-base-server
														
 
															+docker compose -f docker-compose.server.yaml logs -f maas-base-server
														
 
															 ```
														
 
															 ---
														
--- a/docs/deployment/kubernetes.md
+++ b/docs/deployment/kubernetes.md
@@ -1,6 +1,6 @@
 
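															 # Kubernetes (Helm) Deployment Guide
														
 
															-This document describes how to deploy the MASS-Base platform on a Kubernetes cluster with Helm. Helm is suited to production environments and large-scale deployments.
														
 
															+This document describes how to deploy the MaaS-Base platform on a Kubernetes cluster with Helm. Helm is suited to production environments and large-scale deployments.
														
 
															 > **Note:** In Kubernetes deployments, the embedded Higress gateway is currently experimental; see the [Limitations](#限制) section.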
														
@@ -54,7 +54,7 @@ cd maas-base/charts
 
															 ---
														
 
															-## 4. Deploy MASS-Base
														
 
															+## 4. Deploy MaaS-Base
														
 
															 ### 4.1 Default deployment (embedded Higress gateway)
														
@@ -102,7 +102,7 @@ helm install higress higress.io/higress-core -n higress-system --create-namespac
 
															 ---
														
 
															-## 5. Access MASS-Base
														
 
															+## 5. Access MaaS-Base
														
 
															 ### 5.1 Get the initial admin password
														
@@ -184,7 +184,7 @@ helm install -n gpustack-system gpustack ./gpustack --create-namespace \
 
															 ```bash
														
 
															 helm install -n gpustack-system gpustack ./gpustack --create-namespace \
														
 
															-  --set image.repository=my-registry/mass-base \
														
 
															+  --set image.repository=my-registry/maas-base \
														
 
															   --set image.tag=v2.2.0 \
														
 
															   --set image.pullPolicy=Always
														
 
															 ```
														
--- a/docs/deployment/worker.md
+++ b/docs/deployment/worker.md
@@ -1,10 +1,10 @@
 
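															 # Worker Node Deployment Guide
														
 
															-This document describes how to deploy MASS-Base Worker nodes, which provide GPU inference capacity for the platform.
														
 
															+This document describes how to deploy MaaS-Base Worker nodes, which provide GPU inference capacity for the platform.
														
 
															 ## Overview
														
 
															-The Worker is the unit of MASS-Base that actually runs inference. It is responsible for:
														
 
															+The Worker is the unit of MaaS-Base that actually runs inference. It is responsible for:
														
 
															 - Detecting GPU devices and reporting resource information
														
 
															 - Managing the lifecycle of model instances (start, stop, restart)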
														
@@ -48,13 +48,13 @@ The Worker must run on a **Linux node**, and that node must be equipped with GPU/NPU or other
 
															 ```bash
														
 
															 # If the Server was deployed with Docker Compose
														
 
															-docker exec mass-base-server cat /var/lib/gpustack/registration_token
														
 
															+docker exec maas-base-server cat /var/lib/gpustack/registration_token
														
 
															 ```
														
 
															 ### 2. Start on the Worker Node
														
 
															 ```bash
														
 
															-docker run -d --name mass-base-worker \
														
 
															+docker run -d --name maas-base-worker \
														
 
															     --restart unless-stopped \
														
 
															     --privileged \
														
 
															     --network host \
														
@@ -84,7 +84,7 @@ docker run -d --name mass-base-worker \
 
															 ```bash
														
 
															 # View the Worker logs
														
 
															-docker logs -f mass-base-worker
														
 
															+docker logs -f maas-base-worker
														
 
															 # Check in the Server UI whether the node is online
														
 
															 # Visit http://<SERVER_IP> -> Clusters page
														
@@ -97,18 +97,18 @@ docker logs -f mass-base-worker
 
															 For a single-machine deployment where the node itself has a GPU, you can enable Worker mode directly in the Server container:
														
 
															 ```bash
														
 
															-docker run -d --name mass-base \
														
 
															+docker run -d --name maas-base \
														
 
															     --restart unless-stopped \
														
 
															     --privileged \
														
 
															     --network host \
														
 
															     --ipc host \
														
 
															     -v /var/run/docker.sock:/var/run/docker.sock \
														
 
															     -v /var/run/cdi:/var/run/cdi \
														
 
															-    -v mass-base-data:/var/lib/mass-base \
														
 
															+    -v maas-base-data:/var/lib/maas-base \
														
 
															     -v /var/lib/kubelet/device-plugins:/var/lib/kubelet/device-plugins \
														
 
															     -e NVIDIA_VISIBLE_DEVICES=all \
														
 
															     -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
														
 
															-    mass-base/mass-base \
														
 
															+    maas-base/maas-base \
														
 
															     gpustack start \
														
 
															     --gateway-mode disabled \
														
 
															     --api-port 80
														
@@ -123,9 +123,9 @@ docker run -d --name mass-base \
 
															 You can append a Worker service to the `docker-compose` file:
														
 
															 ```yaml
														
 
															-  mass-base-worker:
														
 
															+  maas-base-worker:
														
 
															     image: gpustack/gpustack:latest
														
 
															-    container_name: mass-base-worker
														
 
															+    container_name: maas-base-worker
														
 
															     restart: unless-stopped
														
 
															     privileged: true
														
 
															     network_mode: host
														
--- a/docs/environment-variables.md
+++ b/docs/environment-variables.md
@@ -1,6 +1,6 @@
 
															 # Environment Variables
														
 
															-MASS-Base supports various environment variables for configuration.
														
 
															+MaaS-Base supports various environment variables for configuration.
														
 
															 Most command line parameters can also be set via environment variables with the `GPUSTACK_` prefix and in uppercase format (e.g., `--data-dir` can be set via `GPUSTACK_DATA_DIR`).
														
@@ -17,14 +17,14 @@ Configuration values are applied in the following priority order (highest to low
 
															 This means that command line arguments will always override environment variables, and environment variables will override values in the configuration file.
														
 
															-## MASS-Base Core Environment Variables
														
 
															+## MaaS-Base Core Environment Variables
														
 
															 These environment variables are typically used for third-party service integrations.
														
 
															 The **Applies to** column indicates where the environment variable should be set:
														
 
															-- **Server** - Applies to the MASS-Base server.
														
 
															-- **Worker** - Applies to MASS-Base workers.
														
 
															+- **Server** - Applies to the MaaS-Base server.
														
 
															+- **Worker** - Applies to MaaS-Base workers.
														
 
															 - **Model** - Applies to model deployment configurations.
														
 
															 ### Proxy Configuration
														
@@ -76,8 +76,8 @@ The **Applies to** column indicates where the environment variable should be set
 
															 | Variable                                              | Description                                                                                                                        | Default                              | Applies to |
														
 
															 | ----------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------ | ---------- |
														
 
															 | `GPUSTACK_HIGRESS_EXT_AUTH_TIMEOUT_MS`                | Higress external authentication timeout in milliseconds.                                                                           | `30000`                              | Server     |
														
 
															-| `GPUSTACK_GATEWAY_PORT_CHECK_INTERVAL`                | The interval in seconds of MASS-Base Server checking embedded gateway listening port                                                | `2`                                  | Server     |
														
 
															-| `GPUSTACK_GATEWAY_PORT_CHECK_RETRY_COUNT`             | The retry count of MASS-Base Server checking embedded gateway listening port                                                        | `300`                                | Server     |
														
 
															+| `GPUSTACK_GATEWAY_PORT_CHECK_INTERVAL`                | The interval in seconds of MaaS-Base Server checking embedded gateway listening port                                                | `2`                                  | Server     |
														
 
															+| `GPUSTACK_GATEWAY_PORT_CHECK_RETRY_COUNT`             | The retry count of MaaS-Base Server checking embedded gateway listening port                                                        | `300`                                | Server     |
														
 
															 | `GPUSTACK_GATEWAY_AI_STATISTICS_PLUGIN_CONTENT_TYPES` | Comma-separated list of content-types to be monitored by the ai-statistics plugin. Each value should be a valid HTTP Content-Type. | `application/json,text/event-stream` | Server     |
														
 
															 ### Usage Tracking Configuration
														
@@ -135,7 +135,7 @@ The **Applies to** column indicates where the environment variable should be set
 
															 !!! note
														
 
															-    These environment variables are **not** set when starting MASS-Base. Instead, they should be configured in the **Advanced Options > Environment Variables** section when deploying a model. They are used to customize the model serving behavior.
														
 
															+    These environment variables are **not** set when starting MaaS-Base. Instead, they should be configured in the **Advanced Options > Environment Variables** section when deploying a model. They are used to customize the model serving behavior.
														
 
															 | <div style="width:180px">Variable</div>                   | Description                                                                                                                                                                                    | Default | Applies to |
														
 
															 | --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- | ---------- |
														
@@ -150,7 +150,7 @@ The **Applies to** column indicates where the environment variable should be set
 
															 | `GPUSTACK_MODEL_RUNTIME_UID`                              | Control the user permissions of processes running inside the container.                                                                                                                        | (empty) | Model      |
														
 
															 | `GPUSTACK_MODEL_RUNTIME_GID`                              | Control the group permissions of processes running inside the container.                                                                                                                       | (empty) | Model      |
														
 
															 | `GPUSTACK_MODEL_RUNTIME_SHM_SIZE_GIB`                     | Shared memory size for the container in GiB.                                                                                                                                                   | `10.0`  | Model      |
														
 
															-| `GPUSTACK_MODEL_INFERENCE_HEALTH_CHECK_ENABLED`           | Enable inference health check for this model. When enabled, MASS-Base periodically sends minimal inference requests to verify the model is responding correctly.                                | `false` | Model      |
														
 
															+| `GPUSTACK_MODEL_INFERENCE_HEALTH_CHECK_ENABLED`           | Enable inference health check for this model. When enabled, MaaS-Base periodically sends minimal inference requests to verify the model is responding correctly.                                | `false` | Model      |
														
 
															 | `GPUSTACK_MODEL_INFERENCE_HEALTH_CHECK_INTERVAL`          | Inference health check interval in seconds (minimum `60`). If recent successful inference traffic is observed within this interval, the active check is skipped.                               | `300`   | Model      |
														
 
															 | `GPUSTACK_MODEL_INFERENCE_HEALTH_CHECK_TIMEOUT`           | Timeout in seconds for each inference health check request.                                                                                                                                    | `15`    | Model      |
														
 
															 | `GPUSTACK_MODEL_INFERENCE_HEALTH_CHECK_FAILURE_THRESHOLD` | Number of consecutive inference health check failures before marking the instance as unhealthy.                                                                                                | `3`     | Model      |
														
@@ -173,9 +173,9 @@ The serving command script automatically handles:
 
															 - Supporting both `uv pip` and `pip` for package installation
														
 
															 - Handling custom PyPI indices via `PIP_INDEX_URL` and `PIP_EXTRA_INDEX_URL`
														
 
															-## MASS-Base Runtime Environment Variables
														
 
															+## MaaS-Base Runtime Environment Variables
														
 
															-These environment variables are used by MASS-Base runtime. Commonly used to adjust the behavior of inference backends running in Docker/Kubernetes.
														
 
															+These environment variables are used by MaaS-Base runtime. Commonly used to adjust the behavior of inference backends running in Docker/Kubernetes.
														
 
															 They are only usable within workers. Please set the environment variables in the workers’ containers to ensure they take effect properly.
														
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -4,7 +4,7 @@
 
															 ### Hybrid Cluster Support
														
 
															-MASS-Base supports heterogeneous clusters spanning NVIDIA, AMD, Ascend NPUs, Hygon DCUs, Moore Threads, Iluvatar, MetaX, Cambricon and T-head PPUs, and works across both AMD64 and ARM64 architectures.
														
 
															+MaaS-Base supports heterogeneous clusters spanning NVIDIA, AMD, Ascend NPUs, Hygon DCUs, Moore Threads, Iluvatar, MetaX, Cambricon and T-head PPUs, and works across both AMD64 and ARM64 architectures.
														
 
															 ### Distributed Inference Support
														
@@ -34,7 +34,7 @@ MASS-Base supports heterogeneous clusters spanning NVIDIA, AMD, Ascend NPUs, Hyg
 
															 ### How can I change the registered worker name?
														
 
															-You can set it to a custom name using the `--worker-name` flag when running MASS-Base:
														
 
															+You can set it to a custom name using the `--worker-name` flag when running MaaS-Base:
														
 
															 ```diff
														
 
															 sudo docker run -d --name gpustack \
														
@@ -45,7 +45,7 @@ sudo docker run -d --name gpustack \
 
															 ### How can I change the registered worker IP?
														
 
															-You can set it to a custom IP using the `--worker-ip` flag when running MASS-Base:
														
 
															+You can set it to a custom IP using the `--worker-ip` flag when running MaaS-Base:
														
 
															 ```diff
														
 
															 sudo docker run -d --name gpustack \
														
@@ -54,9 +54,9 @@ sudo docker run -d --name gpustack \
 
															 +    --worker-ip xx.xx.xx.xx
														
 
															 ```
														
 
															-### Where are MASS-Base's data stored?
														
 
															+### Where are MaaS-Base's data stored?
														
 
															-When running the MASS-Base container, the Docker volume is mounted using `--volume/-v` parameter. The default data path is under the Docker data directory, specifically in the volumes subdirectory, and the default path is:
														
 
															+When running the MaaS-Base container, the Docker volume is mounted using `--volume/-v` parameter. The default data path is under the Docker data directory, specifically in the volumes subdirectory, and the default path is:
														
 
															 ```bash
														
 
															 /var/lib/docker/volumes/gpustack-data/_data
														
@@ -83,7 +83,7 @@ sudo docker run -d --name gpustack \
 
															 ### Where are model files stored?
														
 
															-When running the MASS-Base container, the Docker volume is mounted using `--volume/-v` parameter. The default cache path is under the Docker data directory, specifically in the volumes subdirectory, and the default path is:
														
 
															+When running the MaaS-Base container, the Docker volume is mounted using `--volume/-v` parameter. The default cache path is under the Docker data directory, specifically in the volumes subdirectory, and the default path is:
														
 
															 ```bash
														
 
															 /var/lib/docker/volumes/gpustack-data/_data/cache
														
@@ -164,7 +164,7 @@ If the allocatable GPU memory is less than 90%, but you are sure the model can r
 
															 **Note**: If the model encounters an error after running and the logs show `CUDA: out of memory`, it means the allocated GPU memory is insufficient. You will need to further adjust `--gpu-memory-utilization`, add more resources, or deploy a smaller model.
														
 
															-The context size for the model also affects the required GPU memory. You can adjust the `--max-model-len` parameter to set a smaller context. In MASS-Base, if this parameter is not set, its default value is 8192. If it is specified in the backend parameters, the actual setting will take effect.
														
 
															+The context size for the model also affects the required GPU memory. You can adjust the `--max-model-len` parameter to set a smaller context. In MaaS-Base, if this parameter is not set, its default value is 8192. If it is specified in the backend parameters, the actual setting will take effect.
														
 
															 You can adjust it to a smaller context as needed, for example, `--max-model-len=2048`. However, keep in mind that the max tokens for each inference request cannot exceed the value of `--max-model-len`. Therefore, setting a very small context may cause inference truncation.
														
@@ -176,7 +176,7 @@ The `--enforce-eager` parameter also helps reduce GPU memory usage. However, thi
 
															 ### What should I do if the model is stuck in `Scheduled` state?
														
 
															-Try restarting the MASS-Base container where the model is scheduled. If the issue persists, check the worker logs [here](troubleshooting.md#view-gpustack-logs) to analyze the cause.
														
 
															+Try restarting the MaaS-Base container where the model is scheduled. If the issue persists, check the worker logs [here](troubleshooting.md#view-gpustack-logs) to analyze the cause.
														
 
															 ### What should I do if the model is stuck in `Error` state?
														
@@ -214,13 +214,13 @@ This is a limitation of vLLM. You can adjust the `--limit-mm-per-prompt` paramet
 
															 ---
														
 
															-## Managing MASS-Base
														
 
															+## Managing MaaS-Base
														
 
															-### How do I use MASS-Base behind a proxy?
														
 
															+### How do I use MaaS-Base behind a proxy?
														
 
															-We recommend passing standard proxy environment variables when running MASS-Base.
														
 
															+We recommend passing standard proxy environment variables when running MaaS-Base.
														
 
															-The following case demonstrates how to configure MASS-Base to forward all requests to the target proxy, except for requests to addresses specified in the NO_PROXY environment variable.
														
 
															+The following case demonstrates how to configure MaaS-Base to forward all requests to the target proxy, except for requests to addresses specified in the NO_PROXY environment variable.
														
 
															 ```bash
														
 
															 docker run -d --name gpustack \
														
--- a/docs/installation/air-gapped.md
+++ b/docs/installation/air-gapped.md
@@ -1,6 +1,6 @@
 
															 # Air-Gapped Installation
														
 
															-MASS-Base can be installed in an air-gapped (offline) environment with no internet access.
														
 
															+MaaS-Base can be installed in an air-gapped (offline) environment with no internet access.
														
 
															 ## Prerequisites
														
@@ -18,7 +18,7 @@ If your system supports a container toolkit, install and configure it as needed
 
															 ### Container Images
														
 
															-MASS-Base offers an [Image Selector](https://docs.gpustack.ai/latest/image-selector/) site to help users easily pick the images they want to download. For more advanced or automated syncing, MASS-Base also provides image management commands:
														
 
															+MaaS-Base offers an [Image Selector](https://docs.gpustack.ai/latest/image-selector/) site to help users easily pick the images they want to download. For more advanced or automated syncing, MaaS-Base also provides image management commands:
														
 
															 - `gpustack copy-images`: Sync images from one registry to another
														
 
															 - `gpustack save-images`: Download images and save them locally
														
@@ -29,9 +29,9 @@ Below are the details on how to use these CLI commands.
 
															 - **Copy Images**
														
 
															-MASS-Base provides various container images for different components and inference backends, available on [Docker Hub](https://hub.docker.com/u/gpustack) and [Quay.io](https://quay.io/user/gpustack/).
														
 
															+MaaS-Base provides various container images for different components and inference backends, available on [Docker Hub](https://hub.docker.com/u/gpustack) and [Quay.io](https://quay.io/user/gpustack/).
														
 
															-To transfer the required container images to your internal registry from a machine with internet access, use the MASS-Base `copy-images` command:
														
 
															+To transfer the required container images to your internal registry from a machine with internet access, use the MaaS-Base `copy-images` command:
														
 
															 ```bash
														
 
															 sudo docker run --rm -it --entrypoint "" gpustack/gpustack \
														
@@ -78,7 +78,7 @@ The displayed image list includes all supported accelerators, inference backends
 
															 If your target environment is air-gapped or does not have internet access, you can first download the required images on a machine with internet connectivity, then transfer and load them into the offline environment.
														
 
															-MASS-Base provides the `save-images` and `load-images` commands for this workflow.
														
 
															+MaaS-Base provides the `save-images` and `load-images` commands for this workflow.
														
 
															 **Copy Images**
														
@@ -126,7 +126,7 @@ sudo docker run --rm -it --entrypoint "" \
 
															     /gpustack-air-gapped
														
 
															 ```
														
 
															-This command imports all image packages from the specified directory into the local Docker daemon, making them available for MASS-Base.
														
 
															+This command imports all image packages from the specified directory into the local Docker daemon, making them available for MaaS-Base.
														
 
															 !!! note
														
@@ -136,7 +136,7 @@ For more details on `load-images`, see the [CLI Reference](../cli-reference/load
 
															 ## Installation
														
 
															-After preparing the internal container registry with the required images, you can install MASS-Base in the air-gapped environment.
														
 
															+After preparing the internal container registry with the required images, you can install MaaS-Base in the air-gapped environment.
														
 
															 ```diff
														
 
															  sudo docker run -d --name gpustack \
														
@@ -152,7 +152,7 @@ After preparing the internal container registry with the required images, you ca
 
															 ### Pulling Inference Backend Images from a Secure Registry
														
 
															 If your internal container registry requires authentication,  
														
 
															-set the following environment variables when starting the MASS-Base worker to allow it to pull the runner image.
														
 
															+set the following environment variables when starting the MaaS-Base worker to allow it to pull the runner image.
														
 
															 ```diff
														
 
															  sudo docker run -d --name gpustack \
														
@@ -167,7 +167,7 @@ set the following environment variables when starting the MASS-Base worker to al
 
															 ### Pulling Inference Backend Images from non-default Namespace
														
 
															 If your internal container registry uses a different namespace than the default `gpustack`,  
														
 
															-set the following environment variable when starting the MASS-Base worker to allow it to pull the runner image.
														
 
															+set the following environment variable when starting the MaaS-Base worker to allow it to pull the runner image.
														
 
															 ```diff
														
 
															  sudo docker run -d --name gpustack \
														
--- a/docs/installation/installation.md
+++ b/docs/installation/installation.md
@@ -2,19 +2,19 @@
 
															 ## Prerequisites
														
 
															-**MASS-Base server:**
														
 
															+**MaaS-Base server:**
														
 
															 - [Docker](https://docs.docker.com/engine/install/) must be installed. Docker Desktop (Windows and macOS) is also supported.
														
 
															-**MASS-Base workers:**
														
 
															+**MaaS-Base workers:**
														
 
															 - [Docker](https://docs.docker.com/engine/install/) must be installed. Docker Desktop is **not** supported.
														
 
															-- Only Linux is supported for MASS-Base worker nodes. If you use Windows, consider using WSL2 and avoid using Docker Desktop. macOS is not supported for MASS-Base worker nodes.
														
 
															+- Only Linux is supported for MaaS-Base worker nodes. If you use Windows, consider using WSL2 and avoid using Docker Desktop. macOS is not supported for MaaS-Base worker nodes.
														
 
															 - Ensure the appropriate GPU drivers and container toolkits are installed for your hardware. See the [Installation Requirements](./requirements.md) for details.
														
 
															-## Install MASS-Base Server
														
 
															+## Install MaaS-Base Server
														
 
															-Run the following command to install and start the MASS-Base server using Docker:
														
 
															+Run the following command to install and start the MaaS-Base server using Docker:
														
 
															 ```bash
														
 
															 sudo docker run -d --name gpustack \
														
@@ -26,17 +26,17 @@ sudo docker run -d --name gpustack \
 
															 !!! note
														
 
															-    MASS-Base v2 uses a single unified container image for all GPU device types.
														
 
															+    MaaS-Base v2 uses a single unified container image for all GPU device types.
														
 
															 ## Startup
														
 
															-Check the MASS-Base container logs:
														
 
															+Check the MaaS-Base container logs:
														
 
															 ```bash
														
 
															 sudo docker logs -f gpustack
														
 
															 ```
														
 
															-If everything is normal, open `http://your_host_ip` in a browser to access the MASS-Base UI.
														
 
															+If everything is normal, open `http://your_host_ip` in a browser to access the MaaS-Base UI.
														
 
															 Log in with username `admin` and the default password. Retrieve the initial password with:
														
@@ -51,7 +51,7 @@ Please follow the UI instructions on the `Clusters` and `Workers` pages to add G
 
															 ## Custom Configuration
														
 
															-The following sections describe examples of custom configuration options when starting the MASS-Base server container. For a full list of available options, refer to the [CLI Reference](../cli-reference/start.md).
														
 
															+The following sections describe examples of custom configuration options when starting the MaaS-Base server container. For a full list of available options, refer to the [CLI Reference](../cli-reference/start.md).
														
 
															 ### Enable HTTPS with Custom Certificate
														
@@ -71,7 +71,7 @@ The following sections describe examples of custom configuration options when st
 
															 ### Using an External Database
														
 
															-By default, MASS-Base uses an embedded PostgreSQL database. To use an external database such as PostgreSQL or MySQL, set the `GPUSTACK_DATABASE_URL` environment variable or use the `--database-url` argument when starting the MASS-Base container:
														
 
															+By default, MaaS-Base uses an embedded PostgreSQL database. To use an external database such as PostgreSQL or MySQL, set the `GPUSTACK_DATABASE_URL` environment variable or use the `--database-url` argument when starting the MaaS-Base container:
														
 
															 ```diff
														
 
															  sudo docker run -d --name gpustack \
														
@@ -96,7 +96,7 @@ sudo docker run -d --name gpustack \
 
															 ### Additional Trusted CAs
														
 
															-If MASS-Base needs to communicate with services that use certificates issued by a private or corporate CA (e.g., a self-hosted Identity Provider, a Hugging Face mirror, or an internal API endpoint), mount the CA certificate into the container under `/usr/local/share/ca-certificates/`. MASS-Base will automatically import the mounted CA certificates during startup and add them to the system trust store.
														
 
															+If MaaS-Base needs to communicate with services that use certificates issued by a private or corporate CA (e.g., a self-hosted Identity Provider, a Hugging Face mirror, or an internal API endpoint), mount the CA certificate into the container under `/usr/local/share/ca-certificates/`. MaaS-Base will automatically import the mounted CA certificates during startup and add them to the system trust store.
														
 
															 ```diff
														
 
															  sudo docker run -d --name gpustack \
														
@@ -137,13 +137,13 @@ git clone -b "$LATEST_TAG" https://github.com/gpustack/gpustack.git
 
															 cd gpustack/docker-compose
														
 
															 ```
														
 
															-Start the MASS-Base server:
														
 
															+Start the MaaS-Base server:
														
 
															 ```bash
														
 
															 sudo docker compose -f docker-compose.server.yaml up -d
														
 
															 ```
														
 
															-If everything is normal, open `http://your_host_ip` in a browser to access the MASS-Base UI.
														
 
															+If everything is normal, open `http://your_host_ip` in a browser to access the MaaS-Base UI.
														
 
															 Log in with username `admin` and the default password. Retrieve the initial password with:
														
--- a/docs/installation/requirements.md
+++ b/docs/installation/requirements.md
@@ -1,10 +1,10 @@
 
															 # Installation Requirements
														
 
															-This page outlines the software and networking requirements for nodes running MASS-Base.
														
 
															+This page outlines the software and networking requirements for nodes running MaaS-Base.
														
 
															 ## Operating System Requirements
														
 
															-MASS-Base supports most modern Linux distributions on **AMD64** and **ARM64** architectures.
														
 
															+MaaS-Base supports most modern Linux distributions on **AMD64** and **ARM64** architectures.
														
 
															 !!! note
														
@@ -13,7 +13,7 @@ MASS-Base supports most modern Linux distributions on **AMD64** and **ARM64** ar
 
															 ## Accelerator Runtime Requirements
														
 
															-MASS-Base supports a variety of General-Purpose Accelerators as inference backends, including:
														
 
															+MaaS-Base supports a variety of General-Purpose Accelerators as inference backends, including:
														
 
															 - [x] NVIDIA GPU
														
 
															 - [x] AMD GPU
														
@@ -25,7 +25,7 @@ MASS-Base supports a variety of General-Purpose Accelerators as inference backen
 
															 - [x] Cambricon MLU (Experimental)
														
 
															 - [x] T-Head PPU (Experimental)
														
 
															-Ensure all required drivers and toolkits are installed before running MASS-Base.
														
 
															+Ensure all required drivers and toolkits are installed before running MaaS-Base.
														
 
															 ### NVIDIA GPU
														
@@ -235,7 +235,7 @@ sudo ppu-smi
 
															 ### Connectivity Requirements
														
 
															-The following network connectivity is required for MASS-Base to function properly:
														
 
															+The following network connectivity is required for MaaS-Base to function properly:
														
 
															 **Server-to-Worker:** The server must be able to reach workers to proxy inference requests.
														
@@ -245,23 +245,23 @@ The following network connectivity is required for MASS-Base to function properl
 
															 ### Port Requirements
														
 
															-MASS-Base uses these ports for communication:
														
 
															+MaaS-Base uses these ports for communication:
														
 
															 #### Server Ports
														
 
															 | Port      | Description                                                  |
														
 
															 | --------- | ------------------------------------------------------------ |
														
 
															-| TCP 80    | Default port for MASS-Base UI and API endpoints               |
														
 
															-| TCP 443   | Default port for MASS-Base UI and API endpoints (TLS enabled) |
														
 
															+| TCP 80    | Default port for MaaS-Base UI and API endpoints               |
														
 
															+| TCP 443   | Default port for MaaS-Base UI and API endpoints (TLS enabled) |
														
 
															 | TCP 10161 | Default port for server metrics endpoint                     |
														
 
															-| TCP 30080 | Default port for MASS-Base server internal API                |
														
 
															+| TCP 30080 | Default port for MaaS-Base server internal API                |
														
 
															 | TCP 5432  | Default port for embedded Postgres Database                  |
														
 
															 #### Worker Ports
														
 
															 | Port            | Description                                                    |
														
 
															 | --------------- | -------------------------------------------------------------- |
														
 
															-| TCP 10150       | Default port for MASS-Base worker                               |
														
 
															+| TCP 10150       | Default port for MaaS-Base worker                               |
														
 
															 | TCP 10151       | Default port for worker metrics endpoint                       |
														
 
															 | TCP 40000-40063 | Port range for inference services                              |
														
 
															 | TCP 41000-41999 | Port range for Ray services(vLLM distributed deployment using) |
														
--- a/docs/installation/uninstallation.md
+++ b/docs/installation/uninstallation.md
@@ -1,9 +1,9 @@
 
															 # Uninstallation
														
 
															-MASS-Base is typically installed using containerization, 
														
 
															+MaaS-Base is typically installed using containerization, 
														
 
															 so uninstallation mainly involves removing the container and any associated data volumes.
														
 
															-For example, if MASS-Base is running in a Docker container named `gpustack`, run:
														
 
															+For example, if MaaS-Base is running in a Docker container named `gpustack`, run:
														
 
															 ```bash
														
 
															 docker rm -f gpustack
														
--- a/docs/integrations/inference-apis.md
+++ b/docs/integrations/inference-apis.md
@@ -2,9 +2,9 @@
 
															 ## OpenAI-Compatible APIs
														
 
															-MASS-Base provides [OpenAI-compatible APIs](https://platform.openai.com/docs/api-reference) at the `/v1` endpoint.
														
 
															+MaaS-Base provides [OpenAI-compatible APIs](https://platform.openai.com/docs/api-reference) at the `/v1` endpoint.
														
 
															-You can integrate and use models deployed on MASS-Base with any application or framework that supports the OpenAI-compatible API, simply by pointing it to MASS-Base's OpenAI-compatible endpoint.
														
 
															+You can integrate and use models deployed on MaaS-Base with any application or framework that supports the OpenAI-compatible API, simply by pointing it to MaaS-Base's OpenAI-compatible endpoint.
														
 
															 ### Supported Endpoints
														
@@ -47,7 +47,7 @@ curl http://your_gpustack_server_url/v1/chat/completions \
 
															 ## Anthropic-Compatible APIs
														
 
															-MASS-Base provides the Anthropic-compatible [`/v1/messages` API](https://platform.claude.com/docs/en/api/messages/create).
														
 
															+MaaS-Base provides the Anthropic-compatible [`/v1/messages` API](https://platform.claude.com/docs/en/api/messages/create).
														
 
															 ### Usage
														
@@ -72,7 +72,7 @@ curl http://your_gpustack_server_url/v1/messages \
 
															 In the context of Retrieval-Augmented Generation (RAG), reranking refers to the process of selecting the most relevant information from retrieved documents or knowledge sources before presenting them to the user or utilizing them for answer generation.
														
 
															-Note that the OpenAI-compatible APIs **do not** provide a `rerank` endpoint. To fill this gap, MASS-Base provides a [Jina-compatible Rerank API](https://jina.ai/reranker/) at the `/v1/rerank` path.
														
 
															+Note that the OpenAI-compatible APIs **do not** provide a `rerank` endpoint. To fill this gap, MaaS-Base provides a [Jina-compatible Rerank API](https://jina.ai/reranker/) at the `/v1/rerank` path.
														
 
															 ### Usage
														
@@ -133,7 +133,7 @@ Example output:
 
															 ## Other APIs
														
 
															-For other API types, MASS-Base allows you to enable the **Generic Proxy** feature when deploying a model.
														
 
															+For other API types, MaaS-Base allows you to enable the **Generic Proxy** feature when deploying a model.
														
 
															 Once enabled, there are two ways to address the target model:
														
--- a/docs/integrations/integrate-with-cherrystudio.md
+++ b/docs/integrations/integrate-with-cherrystudio.md
@@ -1,10 +1,10 @@
 
															 # Integrate with CherryStudio
														
 
															-CherryStudio integrates with MASS-Base to leverage locally hosted LLMs, embeddings and reranking capabilities.
														
 
															+CherryStudio integrates with MaaS-Base to leverage locally hosted LLMs, embeddings and reranking capabilities.
														
 
															 ## Deploying Models
														
 
															-1. In MASS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															+1. In the MaaS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															     - qwen3-instruct-2507
														
 
															     - qwen2.5-vl-7b
														
@@ -25,9 +25,9 @@ CherryStudio integrates with MASS-Base to leverage locally hosted LLMs, embeddin
 
															 3. Copy the API key and save it for later use.
														
 
															-## Integrating MASS-Base into CherryStudio
														
 
															+## Integrating MaaS-Base into CherryStudio
														
 
															-1. Open CherryStudio, go to `Settings` → `Model Provider`, find MASS-Base, enable it, and configure it as shown:
														
 
															+1. Open CherryStudio, go to `Settings` → `Model Provider`, find MaaS-Base, enable it, and configure it as shown:
														
 
															     - `API Key`: Input the API key you copied from previous steps.
														
--- a/docs/integrations/integrate-with-claude-code.md
+++ b/docs/integrations/integrate-with-claude-code.md
@@ -1,11 +1,11 @@
 
															 # Integrate with Claude Code
														
 
															-Claude Code is an agentic coding tool from Anthropic. Since model deployments on MASS-Base are compatible with the Anthropic API, you can easily connect Claude Code to your MASS-Base deployment and use it for code generation tasks. In this guide, we will walk through the steps to integrate Claude Code with MASS-Base and test the integration by asking Claude to create a Flappy Bird game.
														
 
															+Claude Code is an agentic coding tool from Anthropic. Since model deployments on MaaS-Base are compatible with the Anthropic API, you can easily connect Claude Code to your MaaS-Base deployment and use it for code generation tasks. In this guide, we will walk through the steps to integrate Claude Code with MaaS-Base and test the integration by asking Claude to create a Flappy Bird game.
														
 
															 ## Prerequisites
														
 
															 - One or more GPUs with at least 100 GB of VRAM in total
														
 
															-- MASS-Base installed and running
														
 
															+- MaaS-Base installed and running
														
 
															 - Access to Hugging Face or ModelScope to download model files
														
 
															 !!! note
														
@@ -14,7 +14,7 @@ Claude Code is an agentic coding tool from Anthropic. Since model deployments on
 
															 ## Deploy the Model
														
 
															-1. In the MASS-Base UI, navigate to the **Model Catalog** page.
														
 
															+1. In the MaaS-Base UI, navigate to the **Model Catalog** page.
														
 
															 2. Search for `Qwen3-Coder-Next` and deploy the model using the default configuration.
														
@@ -44,12 +44,12 @@ To easily switch between different model providers, you can use CC-Switch or sim
 
															 Install [CC-Switch](https://github.com/farion1231/cc-switch) following its documentation.
														
 
															-## Configure Claude Code with MASS-Base
														
 
															+## Configure Claude Code with MaaS-Base
														
 
															 1. Open CC-Switch and add a custom provider with the following settings:
														
 
															-   - **Provider Name**: `MASS-Base`
														
 
															-   - **API Endpoint**: Your MASS-Base server URL
														
 
															+   - **Provider Name**: `MaaS-Base`
														
 
															+   - **API Endpoint**: Your MaaS-Base server URL
														
 
															    - **API Key**: The API key you created earlier
														
 
															 2. Configure all models to use `qwen3-coder-next`.
														
@@ -81,4 +81,4 @@ Install [CC-Switch](https://github.com/farion1231/cc-switch) following its docum
 
															 ## Conclusion
														
 
															-In this guide, we successfully integrated Claude Code with MASS-Base and used it to generate a Flappy Bird game. You can now explore more complex coding tasks with Claude Code and leverage the power of MASS-Base for efficient model serving.
														
 
															+In this guide, we successfully integrated Claude Code with MaaS-Base and used it to generate a Flappy Bird game. You can now explore more complex coding tasks with Claude Code and leverage the power of MaaS-Base for efficient model serving.
														
--- a/docs/integrations/integrate-with-dify.md
+++ b/docs/integrations/integrate-with-dify.md
@@ -1,10 +1,10 @@
 
															 # Integrate with Dify
														
 
															-Dify can integrate with MASS-Base to leverage locally deployed LLMs, embeddings, reranking, image generation, Speech-to-Text and Text-to-Speech capabilities.
														
 
															+Dify can integrate with MaaS-Base to leverage locally deployed LLMs, embeddings, reranking, image generation, Speech-to-Text and Text-to-Speech capabilities.
														
 
															 ## Deploying Models
														
 
															-1. In MASS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															+1. In the MaaS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															 - qwen3-8b
														
 
															 - qwen2.5-vl-3b-instruct
														
@@ -25,9 +25,9 @@ Dify can integrate with MASS-Base to leverage locally deployed LLMs, embeddings,
 
															 3. Copy the API key and save it for later use.
														
 
															-## Integrating MASS-Base into Dify
														
 
															+## Integrating MaaS-Base into Dify
														
 
															-1. Access the Dify UI, go to the top right corner and click on `PLUGINS`, select `Install from Marketplace`, search for the MASS-Base plugin, and choose to install it.
														
 
															+1. Access the Dify UI, go to the top right corner and click on `PLUGINS`, select `Install from Marketplace`, search for the MaaS-Base plugin, and choose to install it.
														
 
															 ![dify-install-gpustack-plugin](../assets/integrations/integration-dify-install-gpustack-plugin.png)
														
@@ -35,7 +35,7 @@ Dify can integrate with MASS-Base to leverage locally deployed LLMs, embeddings,
 
															 - Model Type: Select the model type based on the model.
														
 
															-- Model Name: The name must match the model name deployed on MASS-Base.
														
 
															+- Model Name: The name must match the model name deployed on MaaS-Base.
														
 
															 - Server URL: `http://your-gpustack-url`, do not use `localhost`, as it refers to the container’s internal network. If you’re using a custom port, make sure to include it. Also, ensure the URL is accessible from inside the Dify container (you can test this with `curl`).
														
--- a/docs/integrations/integrate-with-maxkb.md
+++ b/docs/integrations/integrate-with-maxkb.md
@@ -1,10 +1,10 @@
 
															 # Integrate with MaxKB
														
 
															-MaxKB can integrate with MASS-Base to leverage locally deployed **LLMs, embedding models, and reranking models** for building knowledge-based AI assistants.
														
 
															+MaxKB can integrate with MaaS-Base to leverage locally deployed **LLMs, embedding models, and reranking models** for building knowledge-based AI assistants.
														
 
															 ## Deploying Models
														
 
															-1. In MASS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															+1. In the MaaS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															 * `qwen3.5-35b-a3b`
														
@@ -29,7 +29,7 @@ MaxKB can integrate with MASS-Base to leverage locally deployed **LLMs, embeddin
 
															 ## Obtain Model Access Information
														
 
															-1. In the MASS-Base sidebar, open the **Routes** page.
														
 
															+1. In the MaaS-Base sidebar, open the **Routes** page.
														
 
															 2. Click the **More actions menu** next to the route and select **API Access Info**.
														
@@ -85,7 +85,7 @@ admin / MaxKB@123..
 
															 After logging in for the first time, follow the prompt to change the password.
														
 
															-## Integrating MASS-Base into MaxKB
														
 
															+## Integrating MaaS-Base into MaxKB
														
 
															 1. In the MaxKB UI, navigate to **Model** in the top navigation bar.
														
@@ -99,9 +99,9 @@ After logging in for the first time, follow the prompt to change the password.
 
															 When configuring the model:
														
 
															-* **Base Model**: Must match the model name deployed in MASS-Base.
														
 
															+* **Base Model**: Must match the model name deployed in MaaS-Base.
														
 
															 * **API URL**: `http://your-gpustack-url/v1`
														
 
															-* **API Key**: The API key created in MASS-Base.
														
 
															+* **API Key**: The API key created in MaaS-Base.
														
 
															 !!! note
														
@@ -173,4 +173,4 @@ Open the chat interface to start interacting with the assistant.
 
															 ![](../assets/integrations/maxkb/65.png)
														
 
															 ![](../assets/integrations/maxkb/66.png)
														
 
															-The assistant can now answer questions based on the connected knowledge base and models deployed on MASS-Base.
														
 
															+The assistant can now answer questions based on the connected knowledge base and models deployed on MaaS-Base.
														
--- a/docs/integrations/integrate-with-n8n.md
+++ b/docs/integrations/integrate-with-n8n.md
@@ -4,11 +4,11 @@
 
															 ## Deploy the Model
														
 
															-Please refer to the **[Model Deployment](../user-guide/model-deployment-management.md#deploy-model)** section in the MASS-Base documentation to complete model deployment.
														
 
															+Please refer to the **[Model Deployment](../user-guide/model-deployment-management.md#deploy-model)** section in the MaaS-Base documentation to complete model deployment.
														
 
															 ## API Access Info
														
 
															-1. Log in to the MASS-Base Web UI
														
 
															+1. Log in to the MaaS-Base Web UI
														
 
															 2. Navigate to the **Routes** page
														
 
															 3. From the menu on the right side of the target model, select **API Access Info**
														
@@ -25,9 +25,9 @@ Record the following information (if an API Key has not been created yet, follow
 
															 Follow the official n8n documentation to complete a self-hosted installation, or use the n8n Cloud service directly:
														
 
															 [https://docs.n8n.io/hosting/](https://docs.n8n.io/hosting/)
														
 
															-## Integrating MASS-Base in n8n
														
 
															+## Integrating MaaS-Base in n8n
														
 
															-Since MASS-Base provides an OpenAI-compatible API, you can directly use the OpenAI nodes in n8n for configuration:
														
 
															+Since MaaS-Base provides an OpenAI-compatible API, you can directly use the OpenAI nodes in n8n for configuration:
														
 
															 1. Add a **Credential** in n8n
														
@@ -39,7 +39,7 @@ Since MASS-Base provides an OpenAI-compatible API, you can directly use the Open
 
															    ![](../assets/integrations/n8n-05-02.png)
														
 
															-2. Use the MASS-Base Credential
														
 
															+2. Use the MaaS-Base Credential
														
 
															    ![](../assets/integrations/n8n-06.png)
														
 
															    ![](../assets/integrations/n8n-07.png)
														
--- a/docs/integrations/integrate-with-openclaw.md
+++ b/docs/integrations/integrate-with-openclaw.md
@@ -4,11 +4,11 @@
 
															 ## Deploy the Model
														
 
															-Please refer to the [**Model Deployment**](../user-guide/model-deployment-management.md#deploy-model) section in the MASS-Base documentation to complete model deployment.
														
 
															+Please refer to the [**Model Deployment**](../user-guide/model-deployment-management.md#deploy-model) section in the MaaS-Base documentation to complete model deployment.
														
 
															 ## API Access Info
														
 
															-1. Log in to the MASS-Base Web UI
														
 
															+1. Log in to the MaaS-Base Web UI
														
 
															 2. Navigate to the **Routes** page
														
 
															 3. From the menu on the right side of the target model, select **API Access Info**
														
@@ -27,7 +27,7 @@ Record the following information (if an API Key has not been created yet, follow
 
															 Follow the official OpenClaw documentation to complete the installation:
														
 
															 [https://docs.openclaw.ai/install](https://docs.openclaw.ai/install)
														
 
															-## Configure MASS-Base in OpenClaw
														
 
															+## Configure MaaS-Base in OpenClaw
														
 
															 1. Start the interactive configuration wizard:
														
@@ -39,7 +39,7 @@ Follow the official OpenClaw documentation to complete the installation:
 
															    ![](../assets/integrations/openclaw-03.png)
														
 
															-3. Fill in the information provided by MASS-Base as prompted:
														
 
															+3. Fill in the information provided by MaaS-Base as prompted:
														
 
															     * **API Base URL**: Access URL
														
 
															     * **API Key**: API Key
														
@@ -47,7 +47,7 @@ Follow the official OpenClaw documentation to complete the installation:
 
															    ![](../assets/integrations/openclaw-04.png)
														
 
															-After completing these steps, OpenClaw will use MASS-Base to invoke the corresponding model for inference.
														
 
															+After completing these steps, OpenClaw will use MaaS-Base to invoke the corresponding model for inference.
														
 
															 ## Configure Channels
														
--- a/docs/integrations/integrate-with-ragflow.md
+++ b/docs/integrations/integrate-with-ragflow.md
@@ -1,10 +1,10 @@
 
															 # Integrate with RAGFlow
														
 
															-RAGFlow can integrate with MASS-Base to leverage locally deployed LLMs, embeddings, reranking, Speech-to-Text and Text-to-Speech capabilities.
														
 
															+RAGFlow can integrate with MaaS-Base to leverage locally deployed LLMs, embeddings, reranking, Speech-to-Text and Text-to-Speech capabilities.
														
 
															 ## Deploying Models
														
 
															-1. In MASS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															+1. In the MaaS-Base UI, navigate to the `Deployments` page and click on `Deploy Model` to deploy the models you need. Here are some example models:
														
 
															 - qwen3-8b
														
 
															 - qwen2.5-vl-3b-instruct
														
@@ -25,13 +25,13 @@ RAGFlow can integrate with MASS-Base to leverage locally deployed LLMs, embeddin
 
															 3. Copy the API key and save it for later use.
														
 
															-## Integrating MASS-Base into RAGFlow
														
 
															+## Integrating MaaS-Base into RAGFlow
														
 
															 1. Access the RAGFlow UI, go to the top right corner and click the avatar, select `Model Providers > GPUStack`, then select `Add the model` and fill in:
														
 
															 - Model type: Select the model type based on the model.
														
 
															-- Model name: The name must match the model name deployed on MASS-Base.
														
 
															+- Model name: The name must match the model name deployed on MaaS-Base.
														
 
															 - Base URL: `http://your-gpustack-url/v1`, the URL should not include the path and do not use `localhost`, as it refers to the container’s internal network. If you’re using a custom port, make sure to include it. Also, ensure the URL is accessible from inside the RAGFlow container (you can test this with `curl`).
														
--- a/docs/migration.md
+++ b/docs/migration.md
@@ -2,9 +2,9 @@
 
															 !!! note
														
 
															-    Since v2.0.0, MASS-Base Worker officially supports only Linux. If you are using Windows or macOS, please move your data directory to a Linux system to perform the migration.
														
 
															+    Since v2.0.0, MaaS-Base Worker officially supports only Linux. If you are using Windows or macOS, please move your data directory to a Linux system to perform the migration.
														
 
															-    On Windows and macOS, MASS-Base Server (without the embedded worker) can still be run using Docker Desktop.
														
 
															+    On Windows and macOS, MaaS-Base Server (without the embedded worker) can still be run using Docker Desktop.
														
 
															 ## Before Migration
														
@@ -12,7 +12,7 @@
 
															 #### 1. Removal of Ollama Model Source (since v0.7.x)
														
 
															-- **Change:** Starting from version 0.7, MASS-Base no longer supports `ollama` as a model source.
														
 
															+- **Change:** Starting from version 0.7, MaaS-Base no longer supports `ollama` as a model source.
														
 
															 - **Impact:** Models, Model Files, and Model Instances whose source is `ollama` will not be preserved during the upgrade process.
														
 
															 - **Action Required:**  If you are upgrading from a version earlier than v0.7 and currently have models deployed from the `ollama` source, you must migrate these models manually before upgrading.  
														
 
															   We recommend re-deploying affected models using one of the supported sources:
														
@@ -20,7 +20,7 @@
 
															     - ModelScope
														
 
															     - Local path
														
 
															-    You can perform this migration by re-deploying the models through the **MASS-Base UI** before initiating the upgrade.
														
 
															+    You can perform this migration by re-deploying the models through the **MaaS-Base UI** before initiating the upgrade.
														
 
															 ### Backup Your Data
														
@@ -28,7 +28,7 @@
 
															       **Backup First:** Before starting the server migration, it’s strongly recommended to back up your database.
														
 
															-      For default installations on v0.7 or earlier, stop the MASS-Base server and create a backup of data dir located inside the container at:
														
 
															+      For default installations on v0.7 or earlier, stop the MaaS-Base server and create a backup of data dir located inside the container at:
														
 
															       ```
														
 
															       /var/lib/gpustack
														
@@ -36,15 +36,15 @@
 
															 Please go through to the [Installation Requirements](./installation/requirements.md) before starting the migration.
														
 
															-If you used MASS-Base **without Docker** in versions prior to v0.7.1(for example, via pip install or an installation script), please install Docker by following the Docker Engine [Installation Guide](https://docs.docker.com/engine/install/) before proceeding with the migration.
														
 
															+If you used MaaS-Base **without Docker** in versions prior to v0.7.1 (for example, via pip install or an installation script), please install Docker by following the Docker Engine [Installation Guide](https://docs.docker.com/engine/install/) before proceeding with the migration.
														
 
															-If you used GPU acceleration for inference in MASS-Base prior to v0.7.1, please check whether you need to install the corresponding accelerator runtime’s Container Toolkit or Container Runtime after installing Docker. You can follow the steps in the **Installation Requirements** to check and install them.
														
 
															+If you used GPU acceleration for inference in MaaS-Base prior to v0.7.1, please check whether you need to install the corresponding accelerator runtime’s Container Toolkit or Container Runtime after installing Docker. You can follow the steps in the **Installation Requirements** to check and install them.
														
 
															 ## Migration Steps
														
 
															 ### Identify Your Legacy Data Directory
														
 
															-Locate the data directory used by your previous MASS-Base installation. The default path is:
														
 
															+Locate the data directory used by your previous MaaS-Base installation. The default path is:
														
 
															 ```
														
 
															 /var/lib/gpustack
														
@@ -70,9 +70,9 @@ Since v2.0.0, you no longer need to specify the GPU computing platform or versio
 
															 #### Embedded Database Migration (SQLite → PostgreSQL)
														
 
															-In v0.7 and earlier, MASS-Base used an embedded SQLite database by default to store management data. Starting from v2.0.0, MASS-Base dropped SQLite support and now uses an embedded PostgreSQL database by default for improved performance and scalability.
														
 
															+In v0.7 and earlier, MaaS-Base used an embedded SQLite database by default to store management data. Starting from v2.0.0, MaaS-Base dropped SQLite support and now uses an embedded PostgreSQL database by default for improved performance and scalability.
														
 
															-Start the MASS-Base with the `GPUSTACK_DATA_MIGRATION=true` to enable the embedded database migration. Replace `${your-data-dir}` with your legacy data directory containing the original SQLite database and related files:
														
 
															+Start MaaS-Base with the environment variable `GPUSTACK_DATA_MIGRATION=true` to enable the embedded database migration. Replace `${your-data-dir}` with your legacy data directory containing the original SQLite database and related files:
														
 
															 ```bash
														
 
															 sudo docker run -d --name gpustack \
														
@@ -101,7 +101,7 @@ Also customizing the `--data-dir`, `GPUSTACK_DATA_DIR` is also supported in data
 
															 #### External Database Migration
														
 
															-MASS-Base supports using an external database to store the management data. If you previously deployed MASS-Base with an external database, start the server with the following command:
														
 
															+MaaS-Base supports using an external database to store the management data. If you previously deployed MaaS-Base with an external database, start the server with the following command:
														
 
															 ```bash
														
 
															 sudo docker run -d --name gpustack-server \
														
@@ -134,7 +134,7 @@ sudo docker run -d --name gpustack-worker \
 
															 Please make sure both `--volume /var/run/docker.sock:/var/run/docker.sock` and `--runtime nvidia` are added to the docker command. Those are not required for previous version. For different accelerator runtime, Refer to [Other GPU Architectures](#other-gpu-architectures) to use different option from `--runtime nvidia`.
														
 
															-This will launch the MASS-Base worker using your existing data and connect it to the specified server.
														
 
															+This will launch the MaaS-Base worker using your existing data and connect it to the specified server.
														
 
															 ### Other GPU Architectures
														
--- a/docs/overview.md
+++ b/docs/overview.md
@@ -27,7 +27,7 @@
 
															 ## Overview
														
 
															-MASS-Base is an open-source GPU cluster manager designed for efficient AI model deployment. It configures and orchestrates inference engines — vLLM, SGLang, TensorRT-LLM, or your own — to optimize performance across GPU clusters.
														
 
															+MaaS-Base is an open-source GPU cluster manager designed for efficient AI model deployment. It configures and orchestrates inference engines — vLLM, SGLang, TensorRT-LLM, or your own — to optimize performance across GPU clusters.
														
 
															 <div class="grid cards" markdown>
														
@@ -47,7 +47,7 @@ MASS-Base is an open-source GPU cluster manager designed for efficient AI model
 
															     ---
														
 
															-    MASS-Base's pluggable engine architecture enables you to deploy new models on the day they are released.
														
 
															+    MaaS-Base's pluggable engine architecture enables you to deploy new models on the day they are released.
														
 
															 -   :material-speedometer:{ .lg .middle .icon-red } __Performance-Optimized__
														
@@ -65,15 +65,15 @@ MASS-Base is an open-source GPU cluster manager designed for efficient AI model
 
															 ## Architecture
														
 
															-MASS-Base enables development teams, IT organizations, and service providers to deliver Model-as-a-Service at scale. It supports industry-standard APIs for LLM, voice, image, and video models. The platform includes built-in user authentication and access control, real-time monitoring of GPU performance and utilization, and detailed metering of token usage and API request rates.
														
 
															+MaaS-Base enables development teams, IT organizations, and service providers to deliver Model-as-a-Service at scale. It supports industry-standard APIs for LLM, voice, image, and video models. The platform includes built-in user authentication and access control, real-time monitoring of GPU performance and utilization, and detailed metering of token usage and API request rates.
														
 
															-The figure below illustrates how a single MASS-Base server can manage multiple GPU clusters across both on-premises and cloud environments. The MASS-Base scheduler allocates GPUs to maximize resource utilization and selects the appropriate inference engines for optimal performance. Administrators also gain full visibility into system health and metrics through integrated Grafana and Prometheus dashboards.
														
 
															+The figure below illustrates how a single MaaS-Base server can manage multiple GPU clusters across both on-premises and cloud environments. The MaaS-Base scheduler allocates GPUs to maximize resource utilization and selects the appropriate inference engines for optimal performance. Administrators also gain full visibility into system health and metrics through integrated Grafana and Prometheus dashboards.
														
 
															 ![gpustack-v2-architecture](assets/gpustack-v2-architecture.png)
														
 
															 ## Optimized Inference Performance
														
 
															-MASS-Base's automated engine selection and parameter optimization deliver strong inference performance out of the box. The following figure shows throughput improvements over default vLLM configurations:
														
 
															+MaaS-Base's automated engine selection and parameter optimization deliver strong inference performance out of the box. The following figure shows throughput improvements over default vLLM configurations:
														
 
															 ![a100-throughput-comparison](assets/a100-throughput-comparison.png)
														
@@ -81,7 +81,7 @@ For detailed benchmarking methods and results, visit our [Inference Performance
 
															 ## Supported Accelerators
														
 
															-MASS-Base supports a wide range of accelerators for AI inference:
														
 
															+MaaS-Base supports a wide range of accelerators for AI inference:
														
 
															 <div class="logo-tile-grid">
														
 
															     <div class="logo-tile" data-tooltip="NVIDIA GPU">
														
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -1,17 +1,17 @@
 
															 # Quickstart
														
 
															-This guide will walk you through running MASS-Base on your own self-hosted GPU servers. To use [cloud GPUs](./tutorials/adding-gpucluster-using-digitalocean.md), or integrating with an [existing Kubernetes cluster](./tutorials/adding-gpucluster-using-kubernetes.md), see the relevant tutorials.
														
 
															+This guide walks you through running MaaS-Base on your own self-hosted GPU servers. To use [cloud GPUs](./tutorials/adding-gpucluster-using-digitalocean.md) or integrate with an [existing Kubernetes cluster](./tutorials/adding-gpucluster-using-kubernetes.md), see the relevant tutorials.
														
 
															 !!! info "Prerequisites"
														
 
															-    1. A node with at least one NVIDIA GPU. For other GPU types, please check the guidelines in the MASS-Base UI when adding a worker, or refer to the [Installation documentation](./installation/requirements.md) for more details.
														
 
															+    1. A node with at least one NVIDIA GPU. For other GPU types, please check the guidelines in the MaaS-Base UI when adding a worker, or refer to the [Installation documentation](./installation/requirements.md) for more details.
														
 
															     2. Ensure the NVIDIA driver, [Docker](https://docs.docker.com/engine/install/) and [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) are installed on the worker node.
														
 
															-    3. **(Optional)** A CPU node for hosting the MASS-Base server. The MASS-Base server does not require a GPU and can run on a CPU-only machine. [Docker](https://docs.docker.com/engine/install/) must be installed. Docker Desktop (for Windows and macOS) is also supported. If no dedicated CPU node is available, the MASS-Base server can be installed on the same machine as a GPU worker node.
														
 
															-    4. Only Linux is supported for MASS-Base worker nodes. If you use Windows, consider using WSL2 and avoid using Docker Desktop. macOS is not supported for MASS-Base worker nodes.
														
 
															+    3. **(Optional)** A CPU node for hosting the MaaS-Base server. The MaaS-Base server does not require a GPU and can run on a CPU-only machine. [Docker](https://docs.docker.com/engine/install/) must be installed. Docker Desktop (for Windows and macOS) is also supported. If no dedicated CPU node is available, the MaaS-Base server can be installed on the same machine as a GPU worker node.
														
 
															+    4. Only Linux is supported for MaaS-Base worker nodes. If you use Windows, consider using WSL2 and avoid using Docker Desktop. macOS is not supported for MaaS-Base worker nodes.
														
 
															-## Install MASS-Base
														
 
															+## Install MaaS-Base
														
 
															-Run the following command to install and start the MASS-Base server using [Docker](https://docs.docker.com/engine/install/):
														
 
															+Run the following command to install and start the MaaS-Base server using [Docker](https://docs.docker.com/engine/install/):
														
 
															 ```bash
														
 
															 sudo docker run -d --name gpustack \
														
@@ -34,23 +34,23 @@ sudo docker run -d --name gpustack \
 
															         --system-default-container-registry quay.io
														
 
															     ```
														
 
															-Check the MASS-Base startup logs:
														
 
															+Check the MaaS-Base startup logs:
														
 
															 ```bash
														
 
															 sudo docker logs -f gpustack
														
 
															 ```
														
 
															-After MASS-Base starts, run the following command to get the default admin password:
														
 
															+After MaaS-Base starts, run the following command to get the default admin password:
														
 
															 ```bash
														
 
															 sudo docker exec gpustack cat /var/lib/gpustack/initial_admin_password
														
 
															 ```
														
 
															-Open your browser and navigate to `http://your_host_ip` to access the MASS-Base UI. Use the default username `admin` and the password you retrieved above to log in.
														
 
															+Open your browser and navigate to `http://your_host_ip` to access the MaaS-Base UI. Use the default username `admin` and the password you retrieved above to log in.
														
 
															 ## Set Up a GPU Cluster
														
 
															-1. On the MASS-Base UI, navigate to the `Clusters` page.
														
 
															+1. On the MaaS-Base UI, navigate to the `Clusters` page.
														
 
															 2. Click the `Add Cluster` button.
														
@@ -74,13 +74,13 @@ sudo docker run -d --name gpustack-worker \
 
															       --advertise-address 192.168.1.2
														
 
															 ```
														
 
															-6. Execute the command on the worker node to connect it to the MASS-Base server.
														
 
															+6. Execute the command on the worker node to connect it to the MaaS-Base server.
														
 
															-7. After the worker node connects successfully, it will appear on the `Workers` page in the MASS-Base UI.
														
 
															+7. After the worker node connects successfully, it will appear on the `Workers` page in the MaaS-Base UI.
														
 
															 ## Deploy a Model
														
 
															-1. Navigate to the `Catalog` page in the MASS-Base UI.
														
 
															+1. Navigate to the `Catalog` page in the MaaS-Base UI.
														
 
															 2. Select the `Qwen3-0.6B` model from the list of available models.
														
@@ -92,7 +92,7 @@ sudo docker run -d --name gpustack-worker \
 
															 !!! note
														
 
															-    MASS-Base uses containers to run models. The first-time model deployment may take some time to download the model files and container images. You can click `View Logs` in the UI to monitor the deployment progress.
														
 
															+    MaaS-Base uses containers to run models. The first-time model deployment may take some time to download the model files and container images. You can click `View Logs` in the UI to monitor the deployment progress.
														
 
															 ![model is running](assets/quick-start/model-running.png)
														
@@ -108,7 +108,7 @@ sudo docker run -d --name gpustack-worker \
 
															 3. Copy the generated API key and save it somewhere safe. Please note that you can only see it once on creation.
														
 
															-4. You can now use the API key to access the OpenAI-compatible API endpoints provided by MASS-Base. For example, use curl as the following:
														
 
															+4. You can now use the API key to access the OpenAI-compatible API endpoints provided by MaaS-Base. For example, using curl:
														
 
															 ```bash
														
 
															 # Replace `your_api_key` and `your_gpustack_server_url`
														
@@ -135,4 +135,4 @@ curl http://your_gpustack_server_url/v1/chat/completions \
 
															 ## Cleanup
														
 
															-After you complete using the deployed model, you can go to the `Deployments` page in the MASS-Base UI and delete the model to free up resources.
														
 
															+After you finish using the deployed model, go to the `Deployments` page in the MaaS-Base UI and delete the model to free up resources.
														
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -1,8 +1,8 @@
 
															 # Troubleshooting
														
 
															-## View MASS-Base Logs
														
 
															+## View MaaS-Base Logs
														
 
															-You can view MASS-Base logs with the following commands for the default setup:
														
 
															+You can view MaaS-Base logs with the following commands for the default setup:
														
 
															 ```bash
														
 
															 docker logs -f gpustack
														
@@ -10,7 +10,7 @@ docker logs -f gpustack
 
															 ## Enable Debug Mode
														
 
															-You can enable the `DEBUG` mode by setting the `--debug` flag when running MASS-Base:
														
 
															+You can enable the `DEBUG` mode by setting the `--debug` flag when running MaaS-Base:
														
 
															 ```diff
														
 
															 sudo docker run -d --name gpustack \
														
@@ -20,7 +20,7 @@ sudo docker run -d --name gpustack \
 
															     ...
														
 
															 ```
														
 
															-You can also enable MASS-Base's debug mode at runtime by running the following command inside the **server container**:
														
 
															+You can also enable MaaS-Base's debug mode at runtime by running the following command inside the **server container**:
														
 
															 ```bash
														
 
															 gpustack reload-config --set debug=true
														
@@ -28,13 +28,13 @@ gpustack reload-config --set debug=true
 
															 ## Configure Log Level
														
 
															-You can configure log level of the MASS-Base server at runtime by running the following command inside the **server container**:
														
 
															+You can configure the log level of the MaaS-Base server at runtime by running the following command inside the **server container**:
														
 
															 ```bash
														
 
															 curl -X PUT http://localhost/debug/log_level -d "debug"
														
 
															 ```
														
 
															-The same applies to MASS-Base workers:
														
 
															+The same applies to MaaS-Base workers:
														
 
															 ```bash
														
 
															 curl -X PUT http://localhost:10150/debug/log_level -d "debug"
														
@@ -50,7 +50,7 @@ In case you forgot the admin password, you can reset it by running the following
 
															 gpustack reset-admin-password
														
 
															 ```
														
 
															-If you changed the default port using `--port` when starting MASS-Base, specify the MASS-Base URL using the `--server-url` parameter. It must be run locally on the server and accessed via `localhost`:
														
 
															+If you changed the default port using `--port` when starting MaaS-Base, specify the MaaS-Base URL using the `--server-url` parameter. It must be run locally on the server and accessed via `localhost`:
														
 
															 ```bash
														
 
															 gpustack reset-admin-password --server-url http://localhost:9090
														
@@ -58,9 +58,9 @@ gpustack reset-admin-password --server-url http://localhost:9090
 
															 ## Assist in Accelerators Detection Diagnosis
														
 
															-After successfully deploying the MASS-Base Worker as described in the [installation guide](./installation/requirements.md),  
														
 
															+After successfully deploying the MaaS-Base Worker as described in the [installation guide](./installation/requirements.md),  
														
 
															 if the Worker fails to detect any devices,  
														
 
															-please enter the corresponding Worker container, run the following command, and report the results to [MASS-Base](https://github.com/gpustack/gpustack/issues).
														
 
															+please enter the corresponding Worker container, run the following command, and report the results to [MaaS-Base](https://github.com/gpustack/gpustack/issues).
														
 
															 ```bash
														
 
															 time GPUSTACK_RUNTIME_LOG_LEVEL=debug GPUSTACK_RUNTIME_LOG_EXCEPTION=1 gpustack-runtime detect --format json
														
@@ -69,7 +69,7 @@ time GPUSTACK_RUNTIME_LOG_LEVEL=debug GPUSTACK_RUNTIME_LOG_EXCEPTION=1 gpustack-
 
															 ## Assist in Model Deployment Diagnosis
														
 
															 If you experience issues after deploying a model, 
														
 
															-PLEASE enter the corresponding Worker container, run the following command, and report the results to [MASS-Base](https://github.com/gpustack/gpustack/issues).
														
 
															+please enter the corresponding Worker container, run the following command, and report the results to [MaaS-Base](https://github.com/gpustack/gpustack/issues).
														
 
															 ```bash
														
 
															 gpustack-runtime inspect <model instance name>
														
--- a/docs/tutorials/adding-gpucluster-using-digitalocean.md
+++ b/docs/tutorials/adding-gpucluster-using-digitalocean.md
@@ -10,7 +10,7 @@ You need to sign up for a DigitalOcean account and create a Personal Access Toke
 
															 > Note: The token scope must be set to Full Access. If you select permissions using Custom Scopes, you may encounter issues deleting droplets.
														
 
															-When starting the MASS-Base Server, you need to specify the `--server-external-url` parameter. This parameter is used to configure the worker's `--server-url` after the droplet is created and the worker is started. If your server is running behind a proxy, please set the proxy address to ensure that droplets running on the public network can access the MASS-Base Server API using this address after startup.
														
 
															+When starting the MaaS-Base Server, you need to specify the `--server-external-url` parameter. This parameter is used to configure the worker's `--server-url` after the droplet is created and the worker is started. If your server is running behind a proxy, please set the proxy address to ensure that droplets running on the public network can access the MaaS-Base Server API using this address after startup.
														
 
															 ## Create DigitalOcean Cluster
														
--- a/docs/tutorials/inference-on-cpus.md
+++ b/docs/tutorials/inference-on-cpus.md
@@ -1,6 +1,6 @@
 
															 # Running Inference on CPUs
														
 
															-MASS-Base supports inference on CPUs, offering flexibility when GPU resources are limited or when model sizes exceed allocatable GPU memory. The following CPU inference modes are available:
														
 
															+MaaS-Base supports inference on CPUs, offering flexibility when GPU resources are limited or when model sizes exceed allocatable GPU memory. The following CPU inference modes are available:
														
 
															 - **Hybrid CPU+GPU Inference**: Enables partial acceleration by offloading portions of large models to the CPU when VRAM capacity is insufficient.
														
 
															 - **Full CPU Inference**: Runs entirely on CPU when no GPU resources are available.
														
@@ -9,7 +9,7 @@ MASS-Base supports inference on CPUs, offering flexibility when GPU resources ar
 
															     Available for custom backends only.
														
 
															-    When CPU offloading is enabled, MASS-Base will allocate CPU memory if GPU resources are insufficient. You must correctly configure the inference backend to use hybrid CPU+GPU or full CPU inference.
														
 
															+    When CPU offloading is enabled, MaaS-Base will allocate CPU memory if GPU resources are insufficient. You must correctly configure the inference backend to use hybrid CPU+GPU or full CPU inference.
														
 
															     It is strongly recommended to use CPU inference only on CPU workers.
														
@@ -31,9 +31,9 @@ Execution Command: `--model-id BAAI/bge-large-en-v1.5 --huggingface-hub-cache /v
 
															     `ghcr.io/huggingface/text-embeddings-inference:cpu-1.8` is the CPU inference image for TEI. See: [TEI Supported Hardware](https://huggingface.co/docs/text-embeddings-inference/supported_models#supported-hardware).
														
 
															-    `--huggingface-hub-cache /var/lib/gpustack/cache/huggingface` sets the location of the HuggingFace Hub cache for TEI to the path where MASS-Base stores downloaded HuggingFace models. The default path is `/var/lib/gpustack/cache/huggingface`. See: [TEI CLI Arguments](https://huggingface.co/docs/text-embeddings-inference/cli_arguments).
														
 
															+    `--huggingface-hub-cache /var/lib/gpustack/cache/huggingface` sets the location of the HuggingFace Hub cache for TEI to the path where MaaS-Base stores downloaded HuggingFace models. The default path is `/var/lib/gpustack/cache/huggingface`. See: [TEI CLI Arguments](https://huggingface.co/docs/text-embeddings-inference/cli_arguments).
														
 
															-    `{{port}}` is a placeholder that represents the port automatically assigned by MASS-Base.
														
 
															+    `{{port}}` is a placeholder that represents the port automatically assigned by MaaS-Base.
														
 
															 ![TEI CPU Inference](../assets/tutorials/inference-on-cpus/tei-cpu-inference.png)
														
--- a/docs/tutorials/inference-with-tool-calling.md
+++ b/docs/tutorials/inference-with-tool-calling.md
@@ -2,7 +2,7 @@
 
															 Tool calling allows you to connect models to external tools and systems. This is useful for many things such as empowering AI assistants with capabilities, or building deep integrations between your applications and the models.
														
 
															-In this tutorial, you’ll learn how to set up and use tool calling within MASS-Base to extend your AI’s capabilities.
														
 
															+In this tutorial, you’ll learn how to set up and use tool calling within MaaS-Base to extend your AI’s capabilities.
														
 
															 !!! note
														
@@ -13,7 +13,7 @@ In this tutorial, you’ll learn how to set up and use tool calling within MASS-
 
															 Before proceeding, ensure the following:
														
 
															-- MASS-Base is installed and running.
														
 
															+- MaaS-Base is installed and running.
														
 
															 - A Linux worker node with a GPU is available. We'll use [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) as the model for this tutorial. The model requires a GPU with at least 18GB VRAM.
														
 
															 - Access to Hugging Face for downloading the model files.
														
@@ -27,7 +27,7 @@ LLMs that support tool calling are marked with the `tools` capability in the cat
 
															 When you deploy GGUF models using llama-box, tool calling is enabled by default for models that support it.
														
 
															-1. Navigate to the `Deployments` page in the MASS-Base UI and click the `Deploy Model` button. In the dropdown, select `Hugging Face` as the source for your model.
														
 
															+1. Navigate to the `Deployments` page in the MaaS-Base UI and click the `Deploy Model` button. In the dropdown, select `Hugging Face` as the source for your model.
														
 
															 2. Enable the `GGUF` checkbox to filter models by GGUF format.
														
 
															 3. Use the search bar to find the `Qwen/Qwen2.5-7B-Instruct-GGUF` model.
														
 
															 4. Click the `Save` button to deploy the model.
														
@@ -38,7 +38,7 @@ When you deploy GGUF models using llama-box, tool calling is enabled by default
 
															 When you deploy models using vLLM, you need to enable tool calling with additional parameters.
														
 
															-1. Navigate to the `Deployments` page in the MASS-Base UI and click the `Deploy Model` button. In the dropdown, select `Hugging Face` as the source for your model.
														
 
															+1. Navigate to the `Deployments` page in the MaaS-Base UI and click the `Deploy Model` button. In the dropdown, select `Hugging Face` as the source for your model.
														
 
															 2. Use the search bar to find the `Qwen/Qwen2.5-7B-Instruct` model.
														
 
															 3. Expand the `Advanced` section in configurations and scroll down to the `Backend Parameters` section.
														
 
															 4. Click on the `Add Parameter` button and add the following parameters:
														
@@ -54,7 +54,7 @@ After deployment, you can monitor the model's status on the `Deployments` page.
 
															 ## Step 2: Generate an API Key
														
 
															-We will use the MASS-Base API to interact with the model. To do this, you need to generate an API key:
														
 
															+We will use the MaaS-Base API to interact with the model. To do this, you need to generate an API key:
														
 
															 1. Hover over the user avatar and navigate to the `API Keys` page.
														
 
															 2. Click the `New API Key` button.
														
@@ -63,7 +63,7 @@ We will use the MASS-Base API to interact with the model. To do this, you need t
 
															 ## Step 3: Do Inference
														
 
															-With the model deployed and an API key, you can call the model via the MASS-Base API. Here is an example script using `curl` (replace `<your-server-url>` with your GPUStack server URL and `<your-api-key>` with the API key generated in the previous step):
														
 
															+With the model deployed and an API key, you can call the model via the MaaS-Base API. Here is an example script using `curl` (replace `<your-server-url>` with your GPUStack server URL and `<your-api-key>` with the API key generated in the previous step):
														
 
															 ```bash
														
 
															 export GPUSTACK_SERVER_URL=<your-server-url>
														
--- a/docs/tutorials/running-deepseek-r1-671b-with-distributed-vllm.md
+++ b/docs/tutorials/running-deepseek-r1-671b-with-distributed-vllm.md
@@ -2,7 +2,7 @@
 
															 This tutorial guides you through the process of configuring and running the original **DeepSeek R1 671B** using **Distributed vLLM** on a GPUStack cluster. Due to the extremely large size of the model, distributed inference across multiple workers is usually required.
														
 
															-MASS-Base enables easy setup and orchestration of distributed inference using vLLM, making it possible to run massive models like DeepSeek R1 with minimal manual configuration.
														
 
															+MaaS-Base enables easy setup and orchestration of distributed inference using vLLM, making it possible to run massive models like DeepSeek R1 with minimal manual configuration.
														
 
															 ## Prerequisites
														
@@ -20,7 +20,7 @@ Before you begin, make sure the following requirements are met:
 
															 </div>
														
 
															 - High-speed interconnects such as NVLink or InfiniBand are recommended for optimal performance.
														
 
															-- Model files should be downloaded to the same path on each node. While MASS-Base supports on-the-fly model downloading, pre-downloading is recommended as it can be time consuming depending on the network speed.
														
 
															+- Model files should be downloaded to the same path on each node. While MaaS-Base supports on-the-fly model downloading, pre-downloading is recommended because downloading can be time-consuming, depending on network speed.
														
 
															 !!! note
														
@@ -29,7 +29,7 @@ Before you begin, make sure the following requirements are met:
 
															 ## Step 1: Install GPUStack Server
														
 
															-According to the [Installation](../installation/installation.md), you can use the following command to start the MASS-Base server:
														
 
															+According to the [Installation](../installation/installation.md), you can use the following command to start the MaaS-Base server:
														
 
															 ```bash
														
 
															 sudo docker run -d --name gpustack \
														
@@ -45,7 +45,7 @@ sudo docker run -d --name gpustack \
 
															     - Replace `/path/to/your/model` with the actual path.
														
 
															-After MASS-Base server is up and running, run the following commands to get the initial admin password:
														
 
															+After the MaaS-Base server is up and running, run the following command to get the initial admin password:
														
 
															 ```bash
														
 
															 sudo docker exec gpustack \
														
@@ -53,19 +53,19 @@ sudo docker exec gpustack \
 
															 ```
														
 
															-## Step 2: Access MASS-Base UI
														
 
															+## Step 2: Access MaaS-Base UI
														
 
															-Login to the MASS-Base UI using the `admin` user and the obtained password.
														
 
															+Log in to the MaaS-Base UI using the `admin` user and the obtained password.
														
 
															 ```
														
 
															 http://your_gpustack_server_ip_or_hostname
														
 
															 ```
														
 
															-## Step 3: Install MASS-Base Workers
														
 
															+## Step 3: Install MaaS-Base Workers
														
 
															-Navigate to the `Workers` page in the MASS-Base UI, click `Add Worker` button to get the command for adding workers.
														
 
															+Navigate to the `Workers` page in the MaaS-Base UI and click the `Add Worker` button to get the command for adding workers.
														
 
															-And then on **each worker node**, run the worker adding command to start a MASS-Base worker:
														
 
															+Then, on **each worker node**, run that command to start a MaaS-Base worker:
														
 
															 ```bash
														
 
															 sudo docker run -d --name gpustack \
														
@@ -87,7 +87,7 @@ sudo docker run -d --name gpustack \
 
															     - Replace the placeholder paths, IP address/hostname, and cluster token accordingly.
														
 
															     - Replace `/path/to/your/model` with the actual path on your system where the DeepSeek R1 model files are stored.
														
 
															-After all workers are added, return to the MASS-Base UI.
														
 
															+After all workers are added, return to the MaaS-Base UI.
														
 
															 Navigate to the `Workers` page to verify that all workers are in the Ready state and their GPUs are listed.
														
@@ -117,7 +117,7 @@ After the model is running, navigate to the `Workers` page to check GPU utilizat
 
															 ## Step 6: Run Inference via Playground
														
 
															-Once the model is deployed and running, you can test it using the MASS-Base Playground.
														
 
															+Once the model is deployed and running, you can test it using the MaaS-Base Playground.
														
 
															 1. Navigate to the `Playground` -> `Chat`.
														
 
															 2. If only one model is deployed, it will be selected by default. Otherwise, use the dropdown menu to choose `DeepSeek-R1`.
														
@@ -129,6 +129,6 @@ You can also use the `Compare` tab to test concurrent inference scenarios.
 
															 ![playground-compare](../assets/tutorials/running-deepseek-r1-671b-with-distributed-vllm/playground-compare.png)
														
 
															-You have now successfully deployed and run DeepSeek R1 671B using Distributed vLLM on a MASS-Base cluster. Explore the model’s performance and capabilities in your own applications.
														
 
															+You have now successfully deployed and run DeepSeek R1 671B using Distributed vLLM on a MaaS-Base cluster. Explore the model’s performance and capabilities in your own applications.
														
 
															-For further assistance, feel free to reach out to the MASS-Base community or support team.
														
 
															+For further assistance, feel free to reach out to the MaaS-Base community or support team.
														
--- a/docs/tutorials/using-custom-backends.md
+++ b/docs/tutorials/using-custom-backends.md
@@ -1,14 +1,14 @@
 
															 # Using Custom Inference Backends
														
 
															-This guide explains how to add custom inference backends in MASS-Base, including using verified community configurations and creating your own from scratch.
														
 
															+This guide explains how to add custom inference backends in MaaS-Base, including using verified community configurations and creating your own from scratch.
														
 
															 For parameter descriptions, see the [User Guide](../user-guide/inference-backend-management.md).
														
 
															 ## Backend Types
														
 
															-MASS-Base supports three types of inference backends:
														
 
															+MaaS-Base supports three types of inference backends:
														
 
															-- **Built-in**: Pre-configured backends (vLLM, MindIE, VoxBox, SGLang...) maintained by MASS-Base, automatically optimized for different hardware.
														
 
															+- **Built-in**: Pre-configured backends (vLLM, MindIE, VoxBox, SGLang...) maintained by MaaS-Base, automatically optimized for different hardware.
														
 
															 - **Community**: Pre-verified custom backend configurations. These are essentially CustomBackends labeled "community" to simplify manual setup.
														
 
															 - **Custom**: Backends you configure yourself with custom Docker images and commands.
														
--- a/docs/upgrade.md
+++ b/docs/upgrade.md
@@ -1,14 +1,14 @@
 
															 # Upgrade
														
 
															-You can upgrade MASS-Base by pulling and running a newer Docker image.
														
 
															+You can upgrade MaaS-Base by pulling and running a newer Docker image.
														
 
															-The following upgrade instructions apply only to MASS-Base v2.0 and later.
														
 
															+The following upgrade instructions apply only to MaaS-Base v2.0 and later.
														
 
															 For installations prior to v0.7, please refer to the [migration guide](migration.md).
														
 
															 !!! note
														
 
															-    1. When upgrading, upgrade the MASS-Base server first, then upgrade the workers.
														
 
															+    1. When upgrading, upgrade the MaaS-Base server first, then upgrade the workers.
														
 
															     2. Please **DO NOT** upgrade from/to the main(dev) version or a release candidate(rc) version, as they may contain breaking changes. Use a fresh installation if you want to try the main or rc versions.
														
@@ -16,7 +16,7 @@ For installations prior to v0.7, please refer to the [migration guide](migration
 
															     **Backup First:** Before proceeding with an upgrade, it’s strongly recommended to back up your database.
														
 
															-    For default installations, stop the MASS-Base server and create a backup of the PostgreSQL database directory located inside the container at:
														
 
															+    For default installations, stop the MaaS-Base server and create a backup of the PostgreSQL database directory located inside the container at:
														
 
															     ```
														
 
															     /var/lib/gpustack/postgresql/data
														
--- a/docs/user-guide/benchmarking.md
+++ b/docs/user-guide/benchmarking.md
@@ -1,6 +1,6 @@
 
															 # Benchmarking
														
 
															-MASS-Base can run benchmarks against running model instances. Benchmarks are executed by workers in a dedicated benchmark container image, with results and logs stored on the worker.
														
 
															+MaaS-Base can run benchmarks against running model instances. Benchmarks are executed by workers in a dedicated benchmark container image, with results and logs stored on the worker.
														
 
															 ## Prerequisites
														
--- a/docs/user-guide/built-in-inference-backends.md
+++ b/docs/user-guide/built-in-inference-backends.md
@@ -1,6 +1,6 @@
 
															 # Built-in Inference Backends
														
 
															-MASS-Base supports the following inference backends:
														
 
															+MaaS-Base supports the following inference backends:
														
 
															 - [vLLM](#vllm)
														
 
															 - [SGLang](#sglang)
														
@@ -31,13 +31,13 @@ vLLM seamlessly supports most state-of-the-art open-source models, including:
 
															 - Embedding Models (e.g. `Qwen3-Embedding`)
														
 
															 - Reranker Models (e.g. `Qwen3-Reranker`)
														
 
															-By default, MASS-Base estimates the VRAM requirement for the model instance based on the model's metadata.
														
 
															+By default, MaaS-Base estimates the VRAM requirement for the model instance based on the model's metadata.
														
 
															 You can customize the parameters to fit your needs. The following vLLM parameters might be useful:
														
 
															 - `--gpu-memory-utilization` (default: 0.9): The fraction of GPU memory to use for the model instance.
														
 
															-- `--max-model-len`: Model context length. For large-context models, MASS-Base automatically sets this parameter to `8192` to simplify model deployment, especially in resource constrained environments. You can customize this parameter to fit your needs.
														
 
															-- `--tensor-parallel-size`: Number of tensor parallel replicas. By default, MASS-Base sets this parameter given the GPU resources available and the estimation of the model's memory requirement. You can customize this parameter to fit your needs.
														
 
															+- `--max-model-len`: Model context length. For large-context models, MaaS-Base automatically sets this parameter to `8192` to simplify model deployment, especially in resource-constrained environments. You can customize this parameter to fit your needs.
														
 
															+- `--tensor-parallel-size`: Number of tensor parallel replicas. By default, MaaS-Base sets this parameter given the GPU resources available and the estimation of the model's memory requirement. You can customize this parameter to fit your needs.
														
 
															 For more details, please refer to [vLLM CLI Reference](https://docs.vllm.ai/en/stable/cli/serve/).
														
@@ -56,11 +56,11 @@ Please refer to the vLLM [documentation](https://docs.vllm.ai/en/stable/models/s
 
															 - **Video Tasks**: Video generation and editing (e.g., `Wan2.2`)
														
 
															 - **Audio Tasks**: Speech synthesis, voice cloning, and more (e.g., `Qwen3-TTS`)
														
 
															-MASS-Base integrates with vLLM-Omni to deliver a seamless experience for deploying and managing omni-modal models. When a model is deployed via the vLLM backend, GPUStack automatically detects whether it is omni-modal based on its metadata and sets the required parameters for vLLM-Omni.
														
 
															+MaaS-Base integrates with vLLM-Omni to deliver a seamless experience for deploying and managing omni-modal models. When a model is deployed via the vLLM backend, MaaS-Base automatically detects whether it is omni-modal based on its metadata and sets the required parameters for vLLM-Omni.
														
 
															 #### Distributed Inference Across Workers (Experimental)
														
 
															-vLLM supports distributed inference across multiple workers using [Ray](https://ray.io). You can enable a Ray cluster in MASS-Base by checking the `Allow Distributed Inference Across Workers` option when deploying a model. This allows vLLM to run distributed inference across multiple workers.
														
 
															+vLLM supports distributed inference across multiple workers using [Ray](https://ray.io). You can enable a Ray cluster in MaaS-Base by checking the `Allow Distributed Inference Across Workers` option when deploying a model. This allows vLLM to run distributed inference across multiple workers.
														
 
															 !!! warning "Known Limitations"
														
@@ -86,15 +86,15 @@ See the full list of supported parameters for vLLM [here](https://docs.vllm.ai/e
 
															 It is designed to deliver low-latency and high-throughput inference across a wide range of setups, from a single GPU to large distributed clusters.
														
 
															-By default, MASS-Base estimates the VRAM requirement for the model instance based on model metadata.
														
 
															+By default, MaaS-Base estimates the VRAM requirement for the model instance based on model metadata.
														
 
															-When needed, MASS-Base also sets several parameters automatically for large-context models. Common SGLang parameters include:
														
 
															+When needed, MaaS-Base also sets several parameters automatically for large-context models. Common SGLang parameters include:
														
 
															 - `--mem-fraction-static` (default: `0.9`): The per-GPU allocatable VRAM fraction. The scheduler uses this value for resource matching and candidate selection. You can override it via the model's `backend_parameters`.
														
 
															-- `--context-length`: Model context length. For large-context models, if the automatically estimated context length exceeds device capability, MASS-Base sets this parameter to `8192` to simplify deployment in resource-constrained environments. You can customize this parameter as needed.
														
 
															-- `--tp-size`: Tensor parallel size. When not explicitly provided, MASS-Base infers and sets this parameter based on the selected GPUs.
														
 
															-- `--pp-size`: Pipeline parallel size. In multi-node deployments, MASS-Base determines a combination of `--tp-size` and `--pp-size` according to the model and cluster configuration.
														
 
															-- Multi-node arguments: `--nnodes`, `--node-rank`, `--dist-init-addr`. When distributed inference is enabled, MASS-Base injects these arguments to initialize multi-node communication.
														
 
															+- `--context-length`: Model context length. For large-context models, if the automatically estimated context length exceeds device capability, MaaS-Base sets this parameter to `8192` to simplify deployment in resource-constrained environments. You can customize this parameter as needed.
														
 
															+- `--tp-size`: Tensor parallel size. When not explicitly provided, MaaS-Base infers and sets this parameter based on the selected GPUs.
														
 
															+- `--pp-size`: Pipeline parallel size. In multi-node deployments, MaaS-Base determines a combination of `--tp-size` and `--pp-size` according to the model and cluster configuration.
														
 
															+- Multi-node arguments: `--nnodes`, `--node-rank`, `--dist-init-addr`. When distributed inference is enabled, MaaS-Base injects these arguments to initialize multi-node communication.
														
 
															 For more details, please refer to [SGLang documentation](https://docs.sglang.ai/index.html).
														
@@ -108,7 +108,7 @@ SGLang also supports image models. The ones we have verified include: Qwen-Image
 
															 #### Distributed Inference Across Workers (Experimental)
														
 
															-You can enable distributed SGLang inference across multiple workers in MASS-Base.
														
 
															+You can enable distributed SGLang inference across multiple workers in MaaS-Base.
														
 
															 !!! warning "Known Limitations"
														
@@ -151,7 +151,7 @@ See the full list of supported parameters for SGLang [here](https://docs.sglang.
 
															 MindIE supports various models listed [here](https://www.hiascend.com/software/mindie/modellist).
														
 
															-Within MASS-Base, support [large language models (LLMs)](https://www.hiascend.com/software/mindie/modellist) and [multimodal language models (VLMs)](https://www.hiascend.com/software/mindie/modellist).
														
 
															+Within MaaS-Base, MindIE supports [large language models (LLMs)](https://www.hiascend.com/software/mindie/modellist) and [multimodal language models (VLMs)](https://www.hiascend.com/software/mindie/modellist).
														
 
															 However, _embedding models_ and _multimodal generation models_ are not supported yet.
														
@@ -159,7 +159,7 @@ However, _embedding models_ and _multimodal generation models_ are not supported
 
															 MindIE owns a variety of features outlined [here](https://www.hiascend.com/document/detail/zh/mindie/22RC1/mindiellm/llmdev/mindie_llm0001.html).
														
 
															-At present, MASS-Base supports a subset of these capabilities, including
														
 
															+At present, MaaS-Base supports a subset of these capabilities, including
														
 
															 [Quantization](https://www.hiascend.com/document/detail/zh/mindie/22RC1/mindiellm/llmdev/mindie_llm0279.html),
														
 
															 [Extending Context Size](https://www.hiascend.com/document/detail/zh/mindie/22RC1/mindiellm/llmdev/mindie_llm0295.html),
														
 
															 [Distributed Inference](https://www.hiascend.com/document/detail/zh/mindie/22RC1/mindiellm/llmdev/mindie_llm0296.html),
														
@@ -189,7 +189,7 @@ At present, MASS-Base supports a subset of these capabilities, including
 
															 MindIE has configurable [parameters](https://www.hiascend.com/document/detail/zh/mindie/22RC1/mindiellm/llmdev/mindie_service0285.html) and [environment variables](https://www.hiascend.com/document/detail/zh/mindie/22RC1/mindiellm/llmdev/mindie_llm0416.html).
														
 
															-To avoid directly configuring JSON, MASS-Base provides a set of command line parameters as below.
														
 
															+To avoid directly configuring JSON, MaaS-Base provides a set of command line parameters as below.
														
 
															 | Parameter                                            | Default | Range                    | Scope                                  | Description                                                                                                                                                                                                                                                                     |
														
 
															 |------------------------------------------------------|---------|--------------------------|----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
														
@@ -253,9 +253,9 @@ To avoid directly configuring JSON, MASS-Base provides a set of command line par
 
															 !!! note
														
 
															-    MASS-Base allows users to inject custom environment variables during model deployment, however, some variables may be conflicted with MASS-Base managment.
														
 
															+    MaaS-Base allows users to inject custom environment variables during model deployment; however, some variables may conflict with MaaS-Base management.
														
 
															-    Hence, MASS-Base will override/prevent those variables. Please compare the model instance logs' output with your expectations.
														
 
															+    Hence, MaaS-Base will override/prevent those variables. Please compare the model instance logs' output with your expectations.
														
 
															 ## VoxBox
														
--- a/docs/user-guide/cloud-credential-management.md
+++ b/docs/user-guide/cloud-credential-management.md
@@ -1,6 +1,6 @@
 
															 # Cloud Credential Management
														
 
															-MASS-Base supports cloud credential management, allowing secure connections to external cloud providers. Cloud credentials contain provider information, keys, and options required for API access.
														
 
															+MaaS-Base supports cloud credential management, allowing secure connections to external cloud providers. Cloud credentials contain provider information, keys, and options required for API access.
														
 
															 ## Supported Providers
														
--- a/docs/user-guide/cluster-management.md
+++ b/docs/user-guide/cluster-management.md
@@ -1,6 +1,6 @@
 
															 # Cluster Management
														
 
															-MASS-Base supports cluster-based worker management and provides multiple cluster types. You can provision a cluster through a `Cloud Provider` such as `DigitalOcean`, or create a self-hosted cluster and add workers using `Docker` run commands. Alternatively, you can register all nodes in a self-hosted `Kubernetes` cluster as MASS-Base workers.
														
 
															+MaaS-Base supports cluster-based worker management and provides multiple cluster types. You can provision a cluster through a `Cloud Provider` such as `DigitalOcean`, or create a self-hosted cluster and add workers using `Docker` run commands. Alternatively, you can register all nodes in a self-hosted `Kubernetes` cluster as MaaS-Base workers.
														
 
															 ## Create Cluster
														
@@ -47,7 +47,7 @@ The kubernetes can be registerred after the cluster is created.
 
															 ### Creating DigitalOcean Cluster
														
 
															-1. In the `Basic Configuration` step, the `Name` field is required and `Description` is optional. Create or select a Cloud Credential for communicating with the DigitalOcean API. Select a Region that supports GPU Droplets. You must also configure the `MASS-Base Server URL`, which will be accessible from the newly created DigitalOcean Droplets.
														
 
															+1. In the `Basic Configuration` step, the `Name` field is required and `Description` is optional. Create or select a Cloud Credential for communicating with the DigitalOcean API. Select a Region that supports GPU Droplets. You must also configure the `MaaS-Base Server URL`, which will be accessible from the newly created DigitalOcean Droplets.
														
 
															 2. Click `Next`.
														
 
															 3. Adding one or more `Worker Pools`. For each pool, `Name`, `Instance Type`, `OS Image`, `Replicas`, `Batch Size`, `Labels` and `Volumes` can be specified.
														
 
															 4. Click `Save` after the worker pools are configured.
														
@@ -120,4 +120,4 @@ huggingface_token: xxxxxx
 
															 enable_hf_transfer: false
														
 
															 ```
														
 
															-The above YAML lists all currently supported options for the `Worker Configuration YAML`. For the meaning of each option, refer to the full MASS-Base [config file documentation](../cli-reference/start.md#config-file).
														
 
															+The above YAML lists all currently supported options for the `Worker Configuration YAML`. For the meaning of each option, refer to the full MaaS-Base [config file documentation](../cli-reference/start.md#config-file).
														
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,9 +1,9 @@
 
															 # Project information
														
 
															-site_name: MASS-Base
														
 
															+site_name: MaaS-Base
														
 
															 site_url: https://docs.gpustack.ai
														
 
															-site_author: MASS-Base
														
 
															+site_author: MaaS-Base
														
 
															 site_description: >-
														
 
															-  MASS-Base is an open-source GPU cluster manager designed for efficient AI model deployment.
														
 
															+  MaaS-Base is an open-source GPU cluster manager designed for efficient AI model deployment.
														
 
															   It lets you run models efficiently on your own GPU hardware by choosing the best inference engines,
														
 
															   scheduling GPU resources, analyzing model architectures, and automatically configuring deployment parameters.