
Add MiniMax as alternative LLM provider (M3 default)#428

Open
octo-patch wants to merge 2 commits into NVIDIA:main from octo-patch:feature/add-minimax-provider

Conversation

@octo-patch

@octo-patch octo-patch commented Mar 22, 2026

Summary

  • Add MiniMax as an alternative LLM provider in the RAG chain server get_llm() factory
  • MiniMax provides an OpenAI-compatible API. This PR makes MiniMax-M3 (512K context window, 128K max output, supports image input) the default model and keeps MiniMax-M2.7 and MiniMax-M2.7-highspeed as alternatives
  • Use ChatOpenAI from langchain-openai with the MiniMax base URL, so no new proprietary SDK is required
  • Clamp temperature to the [0, 1] range for MiniMax API compatibility
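
The points above can be illustrated with a minimal sketch. It is not the code in this PR: the helper names `clamp_temperature`, `minimax_chat_kwargs`, and `make_minimax_llm` are hypothetical, and the default temperature of 0.2 is an assumption. The base URL is the one given in the commit message, and `ChatOpenAI` is imported lazily so the helpers can be exercised without `langchain-openai` installed.

```python
import os

# Endpoint named in the commit message of this PR.
MINIMAX_BASE_URL = "https://api.minimax.io/v1"
DEFAULT_MINIMAX_MODEL = "MiniMax-M3"


def clamp_temperature(temperature):
    """Clamp a sampling temperature into the [0, 1] range."""
    return max(0.0, min(1.0, float(temperature)))


def minimax_chat_kwargs(model_name=None, temperature=0.2, api_key=None, **extra):
    """Build keyword arguments for ChatOpenAI (hypothetical helper)."""
    key = api_key or os.environ.get("MINIMAX_API_KEY")
    if not key:
        raise ValueError("MINIMAX_API_KEY is not set")
    kwargs = {
        "model": model_name or DEFAULT_MINIMAX_MODEL,
        "temperature": clamp_temperature(temperature),
        "openai_api_key": key,
        "openai_api_base": MINIMAX_BASE_URL,
    }
    # Forward any additional parameters (for example max_tokens) unchanged.
    kwargs.update(extra)
    return kwargs


def make_minimax_llm(**kwargs):
    """Create the chat model; requires langchain-openai at call time."""
    from langchain_openai import ChatOpenAI  # lazy import

    return ChatOpenAI(**minimax_chat_kwargs(**kwargs))
```

Separating the keyword-argument construction from the `ChatOpenAI` call keeps the clamping and default-model logic testable without network access or an API key in the environment.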

Changes

| File | Change |
| --- | --- |
| RAG/src/chain_server/utils.py | Add minimax branch in get_llm() factory using ChatOpenAI, default to MiniMax-M3 |
| RAG/src/chain_server/configuration.py | Update model_engine help text to list minimax |
| RAG/src/chain_server/requirements.txt | Add langchain-openai>=0.0.6 dependency |
| docs/change-model.md | Add MiniMax usage documentation with env vars (M3 default, M2.7 / M2.7-highspeed alternatives) |
| RAG/src/chain_server/tests/ | Add 15 unit tests + 3 integration tests, all asserting M3 as the default |

Available models

  • MiniMax-M3 — Latest flagship model, 512K context window, 128K max output, supports image input (default)
  • MiniMax-M2.7 — Previous generation flagship
  • MiniMax-M2.7-highspeed — Previous generation, optimized for speed

Usage

APP_LLM_MODELENGINE='minimax' \
APP_LLM_MODELNAME='MiniMax-M3' \
MINIMAX_API_KEY='your-minimax-api-key' \
docker compose up -d --build
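
As a rough preflight check before running the command above, the three variables could be validated with a small script. This is a sketch under assumptions: `resolve_llm_settings` is a hypothetical helper, not part of the chain server, and the fallback engine name `nvidia-ai-endpoints` is assumed rather than taken from this PR.

```python
import os


def resolve_llm_settings(env=None):
    """Resolve engine, model, and key from environment-style variables."""
    env = os.environ if env is None else env
    # Assumed fallback engine name; the real default lives in configuration.py.
    engine = env.get("APP_LLM_MODELENGINE") or "nvidia-ai-endpoints"
    settings = {
        "model_engine": engine,
        "model_name": env.get("APP_LLM_MODELNAME") or None,
    }
    if engine == "minimax":
        # MiniMax-M3 is the default when no model name is given.
        settings["model_name"] = settings["model_name"] or "MiniMax-M3"
        if not env.get("MINIMAX_API_KEY"):
            raise ValueError("MINIMAX_API_KEY must be set for the minimax engine")
    return settings


if __name__ == "__main__":
    print(resolve_llm_settings())
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without exporting anything in the shell.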

Test plan

  • 15 unit tests covering MiniMax provider creation, temperature clamping, parameter forwarding, config defaults, error handling, and M3 default selection
  • 2 unit tests verifying NVIDIA AI endpoints path is unchanged
  • 3 integration tests verifying real API calls (M3 chat completion, M3 streaming, M2.7-highspeed)
  • Docker compose integration test with APP_LLM_MODELENGINE=minimax
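
The unit-test categories listed above (default selection, temperature clamping, parameter forwarding, error handling) might look roughly like the following. This is an illustrative sketch, not the tests in this PR: the `get_llm` shown here is a hypothetical stand-in that takes the chat class as an argument so that a mock can replace it, and the `RuntimeError` for unknown engines is an assumption.

```python
import unittest
from unittest import mock


def get_llm(engine, chat_cls, model_name=None, temperature=0.2, **kwargs):
    """Hypothetical stand-in for the factory; chat_cls is injected for testing."""
    if engine != "minimax":
        raise RuntimeError(f"Unsupported model engine: {engine}")
    return chat_cls(
        model=model_name or "MiniMax-M3",
        temperature=max(0.0, min(1.0, temperature)),
        **kwargs,
    )


class MiniMaxFactoryTest(unittest.TestCase):
    def test_defaults_to_m3(self):
        chat_cls = mock.MagicMock()
        get_llm("minimax", chat_cls)
        self.assertEqual(chat_cls.call_args.kwargs["model"], "MiniMax-M3")

    def test_clamps_temperature(self):
        chat_cls = mock.MagicMock()
        get_llm("minimax", chat_cls, temperature=1.8)
        self.assertEqual(chat_cls.call_args.kwargs["temperature"], 1.0)

    def test_forwards_extra_parameters(self):
        chat_cls = mock.MagicMock()
        get_llm("minimax", chat_cls, max_tokens=256)
        self.assertEqual(chat_cls.call_args.kwargs["max_tokens"], 256)

    def test_rejects_unknown_engine(self):
        with self.assertRaises(RuntimeError):
            get_llm("unknown", mock.MagicMock())
```

Because the chat class is mocked, these tests make no network calls; the integration tests in the plan cover the real API separately.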

PR Bot and others added 2 commits March 22, 2026 20:14
Add MiniMax Cloud API (https://api.minimax.io/v1) as an alternative LLM
provider alongside NVIDIA AI endpoints. MiniMax offers an OpenAI-compatible
API with models including MiniMax-M2.7 (1M context) and MiniMax-M2.5-highspeed
(204K context, speed-optimized).

Changes:
- Add 'minimax' model_engine branch in get_llm() factory (utils.py)
- Use ChatOpenAI from langchain-openai with MiniMax base_url
- Temperature clamping to [0, 1] range for MiniMax API compatibility
- Auto-detect MINIMAX_API_KEY environment variable
- Add langchain-openai dependency to requirements.txt
- Update LLMConfig help text to mention minimax
- Add MiniMax usage documentation in docs/change-model.md
- Add 15 unit tests and 3 integration tests

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
- Set MiniMax-M3 as the default model in get_llm() minimax branch
- Update docs to list M3 (512K context, 128K max output) as the
  recommended option and keep M2.7 / M2.7-highspeed as alternatives
- Drop older M2.5 / M2.5-highspeed references from docs and tests
- Update unit + integration tests to assert M3 default and exercise
  M2.7-highspeed as the kept legacy model
@octo-patch octo-patch changed the title Add MiniMax as alternative LLM provider Add MiniMax as alternative LLM provider (M3 default) Jun 7, 2026