Add MiniMax as alternative LLM provider (M3 default) by octo-patch · Pull Request #428 · NVIDIA/GenerativeAIExamples

octo-patch · 2026-03-22T12:15:22Z

Summary

- Add MiniMax as an alternative LLM provider in the RAG chain server `get_llm()` factory
- MiniMax provides an OpenAI-compatible API. This PR makes MiniMax-M3 (512K context window, 128K max output, supports image input) the default, while keeping MiniMax-M2.7 and MiniMax-M2.7-highspeed as alternatives
- Use `ChatOpenAI` from `langchain-openai` with the MiniMax base URL, so no new proprietary SDK is required
- Clamp temperature to the [0, 1] range for MiniMax API compatibility
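
For reviewers who want a feel for the shape of the new branch, here is a minimal sketch of the clamping and parameter assembly described above. It is not the code from `utils.py`: the helper names (`clamp_temperature`, `minimax_chat_kwargs`, `make_minimax_llm`) and the default temperature are assumptions made for illustration. The base URL and default model name are taken from this PR. The `api_key` and `base_url` keyword names are the aliases `ChatOpenAI` accepts for `openai_api_key` and `openai_api_base`; check them against the installed `langchain-openai` version.

```python
from typing import Any, Dict, Optional

MINIMAX_BASE_URL = "https://api.minimax.io/v1"  # endpoint named in this PR
DEFAULT_MINIMAX_MODEL = "MiniMax-M3"            # default model named in this PR


def clamp_temperature(temperature: float) -> float:
    """Force the temperature into the [0, 1] range."""
    return max(0.0, min(1.0, float(temperature)))


def minimax_chat_kwargs(
    api_key: str,
    model: Optional[str] = None,
    temperature: float = 0.2,  # illustrative default, not taken from the PR
    max_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for an OpenAI-compatible chat client."""
    kwargs: Dict[str, Any] = {
        "model": model or DEFAULT_MINIMAX_MODEL,
        "temperature": clamp_temperature(temperature),
        "api_key": api_key,
        "base_url": MINIMAX_BASE_URL,
    }
    if max_tokens is not None:
        kwargs["max_tokens"] = max_tokens
    return kwargs


def make_minimax_llm(**settings: Any):
    """Build the client; the import is lazy so the helpers above work without it."""
    from langchain_openai import ChatOpenAI

    return ChatOpenAI(**minimax_chat_kwargs(**settings))
```

Keeping the argument assembly separate from the client construction is a choice made for this sketch: it lets the clamping and defaulting logic be checked without network access or the `langchain-openai` package installed.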

Changes

| File | Change |
| --- | --- |
| `RAG/src/chain_server/utils.py` | Add `minimax` branch in `get_llm()` factory using `ChatOpenAI`, default to `MiniMax-M3` |
| `RAG/src/chain_server/configuration.py` | Update `model_engine` help text to list `minimax` |
| `RAG/src/chain_server/requirements.txt` | Add `langchain-openai>=0.0.6` dependency |
| `docs/change-model.md` | Add MiniMax usage documentation with env vars (M3 default, M2.7 / M2.7-highspeed alternatives) |
| `RAG/src/chain_server/tests/` | Add 15 unit tests + 3 integration tests, all asserting M3 as the default |

Available models

- MiniMax-M3: latest flagship model, 512K context window, 128K max output, supports image input (default)
- MiniMax-M2.7: previous-generation flagship
- MiniMax-M2.7-highspeed: previous generation, optimized for speed

Usage

APP_LLM_MODELENGINE='minimax' \
APP_LLM_MODELNAME='MiniMax-M3' \
MINIMAX_API_KEY='your-minimax-api-key' \
docker compose up -d --build
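
The sketch below shows one way the three environment variables above could be resolved into provider settings, including the fallback to `MiniMax-M3` when no model name is given. The chain server reads these values through its own configuration layer, so `MiniMaxSettings` and `load_minimax_settings` are hypothetical names used only for illustration, and the error behaviour is an assumption.

```python
import os
from typing import Mapping, NamedTuple


class MiniMaxSettings(NamedTuple):
    engine: str
    model: str
    api_key: str


def load_minimax_settings(env: Mapping[str, str] = os.environ) -> MiniMaxSettings:
    """Resolve MiniMax settings from environment-style key/value pairs."""
    engine = env.get("APP_LLM_MODELENGINE", "")
    if engine != "minimax":
        raise ValueError(f"expected APP_LLM_MODELENGINE='minimax', got {engine!r}")

    api_key = env.get("MINIMAX_API_KEY", "")
    if not api_key:
        raise ValueError("MINIMAX_API_KEY is not set")

    # Fall back to the PR's default model when the name is unset or empty.
    model = env.get("APP_LLM_MODELNAME") or "MiniMax-M3"
    return MiniMaxSettings(engine=engine, model=model, api_key=api_key)
```

Passing the mapping in explicitly, rather than reading `os.environ` inside the function body, makes the resolution easy to exercise with a plain dictionary.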

Test plan

- 15 unit tests covering MiniMax provider creation, temperature clamping, parameter forwarding, config defaults, error handling, and M3 default selection
- 2 unit tests verifying NVIDIA AI endpoints path is unchanged
- 3 integration tests verifying real API calls (M3 chat completion, M3 streaming, M2.7-highspeed)
- Docker compose integration test with `APP_LLM_MODELENGINE=minimax`
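
As an illustration of how unit tests like those in the plan can run without real API calls, the sketch below injects a mock in place of the chat client and inspects the arguments it receives. The `get_llm` shown here is a simplified stand-in with an assumed signature, not the factory from `utils.py`, and the test names are hypothetical.

```python
import unittest
from unittest import mock


def get_llm(engine, model=None, temperature=0.2, chat_cls=None):
    """Simplified stand-in for the factory under test (assumed signature)."""
    if engine != "minimax":
        raise ValueError(f"unsupported engine: {engine}")
    return chat_cls(
        model=model or "MiniMax-M3",
        temperature=max(0.0, min(1.0, temperature)),
    )


class MiniMaxProviderTest(unittest.TestCase):
    def test_defaults_to_m3(self):
        fake_cls = mock.Mock()
        get_llm("minimax", chat_cls=fake_cls)
        self.assertEqual(fake_cls.call_args.kwargs["model"], "MiniMax-M3")

    def test_temperature_is_clamped(self):
        fake_cls = mock.Mock()
        get_llm("minimax", temperature=1.8, chat_cls=fake_cls)
        self.assertEqual(fake_cls.call_args.kwargs["temperature"], 1.0)

    def test_unknown_engine_raises(self):
        with self.assertRaises(ValueError):
            get_llm("not-a-provider", chat_cls=mock.Mock())


def run_suite() -> unittest.TestResult:
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MiniMaxProviderTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The integration tests in the plan need a real `MINIMAX_API_KEY` and network access, so they are a separate concern from this pattern.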

Add MiniMax Cloud API (https://api.minimax.io/v1) as an alternative LLM provider alongside NVIDIA AI endpoints. MiniMax offers an OpenAI-compatible API with models including MiniMax-M2.7 (1M context) and MiniMax-M2.5-highspeed (204K context, speed-optimized).

Changes:
- Add 'minimax' model_engine branch in get_llm() factory (utils.py)
- Use ChatOpenAI from langchain-openai with MiniMax base_url
- Temperature clamping to [0, 1] range for MiniMax API compatibility
- Auto-detect MINIMAX_API_KEY environment variable
- Add langchain-openai dependency to requirements.txt
- Update LLMConfig help text to mention minimax
- Add MiniMax usage documentation in docs/change-model.md
- Add 15 unit tests and 3 integration tests

Co-Authored-By: Octopus <liyuan851277048@icloud.com>

- Set MiniMax-M3 as the default model in get_llm() minimax branch
- Update docs to list M3 (512K context, 128K max output) as the recommended option and keep M2.7 / M2.7-highspeed as alternatives
- Drop older M2.5 / M2.5-highspeed references from docs and tests
- Update unit + integration tests to assert M3 default and exercise M2.7-highspeed as the kept legacy model

PR Bot and others added 2 commits March 22, 2026 20:14

octo-patch changed the title ~~Add MiniMax as alternative LLM provider~~ Add MiniMax as alternative LLM provider (M3 default) Jun 7, 2026

octo-patch wants to merge 2 commits into NVIDIA:main from octo-patch:feature/add-minimax-provider
