RAGFlow can integrate with MaaS-Base to leverage locally deployed LLMs, embeddings, reranking, Speech-to-Text and Text-to-Speech capabilities.
Deployments page and click on Deploy Model to deploy the models you need. Here are some example models:API Access Info to see how to integrate with this model.Hover over the user avatar and navigate to the API Keys page, then click on New API Key.
Fill in the name, then click Save.
Copy the API key and save it for later use.
Model Providers > GPUStack, then select Add the model and fill in:Model type: Select the model type based on the model.
Model name: The name must match the model name deployed on MaaS-Base.
Base URL: http://your-gpustack-url/v1, the URL should not include the path and do not use localhost, as it refers to the container’s internal network. If you’re using a custom port, make sure to include it. Also, ensure the URL is accessible from inside the RAGFlow container (you can test this with curl).
API-Key: Input the API key you copied from previous steps.
Max Tokens: Input the max tokens supported by current model configuration.
Click OK to add the model:
Set default models and save:You can now use the models in the Chat and Knowledge Base, here is a simple case:
Knowledge base to create a new knowledge base and add your file:Retrieval testing and set the rerank model to bge-reranker-v2-m3:Chat, create an assistant, link the previously created knowledge base, and select a chat model:qwen2.5-vl-3b-instruct. After saving, create a new chat and upload an image to enable multimodal input: