Pull requests: SemiAnalysisAI/InferenceX
Pull requests list
dsv4-fp4-gb300-dynamo-trt: STP + MTP disagg trtllm recipes on GB300
full-sweep-enabled
#1689
opened Jun 8, 2026 by
Ankur-singh
Collaborator
dsr1-fp4-b200-dynamo-sglang-mtp: 8k1k 6-variant MTP sweep on local split recipes
full-sweep-enabled
#1688
opened Jun 8, 2026 by
Ankur-singh
Collaborator
[NV] DeepSeek-V4-Pro trtllm disagg recipes for STP and MTP
#1687
opened Jun 8, 2026 by
richardhuo-nv
[DO NOT MERGE] Run-only: gb300 dsr1 measured power+temp validation
sweep-enabled
#1686
opened Jun 8, 2026 by
arygupt
Collaborator
[NV] Speedbench-al: fix --chat-template-kwargs default quoting so thinking-on cells run
#1685
opened Jun 8, 2026 by
qiching
Collaborator
[AMD] dsv4-fp4-mi355x-atom-disagg, add multi-node ATOM/mooncake disaggregation support
AMD
#1683
opened Jun 8, 2026 by
seungrokj
Collaborator
5 tasks
dsv4-fp4-b300-sglang: align env vars to GB300
full-sweep-enabled
#1682
opened Jun 8, 2026 by
yhyang201
Collaborator
[AMD][MI35X] Qwen3.5-fp4 SGLang single-node benchmark
AMD
full-sweep-enabled
#1680
opened Jun 8, 2026 by
1am9trash
Collaborator
Bump actions/checkout from 6.0.2 to 6.0.3 in the github-actions group
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#1679
opened Jun 8, 2026 by
dependabot
Bot
Add DSv4-Pro FP4 GB200 SGLang disagg + MTP config
full-sweep-enabled
#1676
opened Jun 5, 2026 by
Ankur-singh
Collaborator
Add DSv4-Pro FP4 GB200 SGLang disagg config
full-sweep-enabled
#1675
opened Jun 5, 2026 by
Ankur-singh
Collaborator
[AMD][MI355X] Bump qwen3.5-bf16 single-node SGLang image to v0.5.12.post1
#1673
opened Jun 5, 2026 by
ChangLiu0709
Collaborator
2 of 3 tasks
chore(sweep): re-run MiniMax-M2.5 vLLM sweeps for monitoring
full-sweep-enabled
#1666
opened Jun 4, 2026 by
arygupt
Collaborator
[WIP] Initial work to add llm-d-vllm framework with H200
#1660
opened Jun 4, 2026 by
ezrasilvera
Collaborator
Throwaway: conc-64 gsm8k eval for DEP8+MTP3 dispatch token bug
non-canary-full-sweep-enabled
Run the full sweep without the canary gate (full search space, no trim)
#1659
opened Jun 3, 2026 by
Oseltamivir
Collaborator
[NV] Add Kimi K2.5 FP4 B200/B300 EP sweep
full-sweep-enabled
#1658
opened Jun 3, 2026 by
jasonlizhengjian
Collaborator
Draft
AMD - gpt-oss vllm mxfp4: AITER tuning + n-gram spec decode + server …
AMD
#1657
opened Jun 3, 2026 by
nehaprakriya
Collaborator
Draft
[WIP] Update Dsv4 B300 configs
full-sweep-enabled
#1656
opened Jun 3, 2026 by
wzhao18
Collaborator
fix(power): classify zero-decode-GPU multinode runs as aggregated
#1646
opened Jun 2, 2026 by
arygupt
Collaborator
dsr1-fp4-mi355x-sglang: bump ROCm 7.0->7.2 image + add TP4 search-space
AMD
#1645
opened Jun 2, 2026 by
JohnQinAMD
Collaborator
Use official TRT-LLM image (1.3.0rc15.post1) for DSv4 B300 TRT (non-MTP + MTP)
full-sweep-enabled
#1636
opened Jun 1, 2026 by
Oseltamivir
Collaborator
feat(power): vendor-agnostic GPU power/telemetry aggregation core
#1635
opened Jun 1, 2026 by
arygupt
Collaborator
2 of 3 tasks