Pull requests: SemiAnalysisAI/InferenceX
Pull requests list
[Klaud Cold] DSV4 MI355X vLLM disagg smoke test (8k1k conc=32) / DSV4 MI355X vLLM 分离式冒烟测试(8k1k conc=32)
full-sweep-fail-fast
#2081
opened Jul 4, 2026 by
functionstackx
Collaborator
[Klaud Cold] Update kimik2.5-int4-mi355x-vllm vLLM ROCm image to v0.24.0 / 将 kimik2.5-int4-mi355x-vllm 的 vLLM ROCm 镜像 升级至 v0.24.0
full-sweep-fail-fast
#2077
opened Jul 4, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update kimik2.5-fp4-mi355x-vllm vLLM ROCm image to v0.24.0 / 将 kimik2.5-fp4-mi355x-vllm 的 vLLM ROCm 镜像 升级至 v0.24.0
full-sweep-fail-fast
#2074
opened Jul 4, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update kimik2.5-int4-mi325x-vllm vLLM ROCm image to v0.24.0 / 将 kimik2.5-int4-mi325x-vllm 的 vLLM ROCm 镜像 升级至 v0.24.0
full-sweep-fail-fast
#2071
opened Jul 4, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update qwen3.5-fp4-mi355x-sglang (+mtp) SGLang ROCm image to v0.5.14-rocm720-mi35x / 将 qwen3.5-fp4-mi355x-sglang(+mtp) 的 SGLang ROCm 镜像 升级至 v0.5.14-rocm720-mi35x
full-sweep-fail-fast
#2068
opened Jul 4, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update qwen3.5-fp8-mi355x-sglang (+mtp) SGLang ROCm image to v0.5.14-rocm720-mi35x / 将 qwen3.5-fp8-mi355x-sglang(+mtp) 的 SGLang ROCm 镜像 升级至 v0.5.14-rocm720-mi35x
full-sweep-fail-fast
#2067
opened Jul 4, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] Update qwen3.5-bf16-mi355x-sglang (+mtp) SGLang ROCm image to v0.5.14-rocm720-mi35x / 将 qwen3.5-bf16-mi355x-sglang(+mtp) 的 SGLang ROCm 镜像 升级至 v0.5.14-rocm720-mi35x
full-sweep-fail-fast
#2066
opened Jul 4, 2026 by
functionstackx
Collaborator
1 task
[Klaud Cold] [AMD] gpt-oss-fp4-mi355x (vllm): W4A8 moe optimizations and vllm image bump / gpt-oss-fp4-mi355x(vLLM):W4A8 MoE 优化与 vLLM 镜像升级
full-sweep-fail-fast
#2051
opened Jul 4, 2026 by
xiaohuguo2023
Collaborator
[NV] llm-d-vllm: Add llm-d to the InferenceX benchmarking framework
full-sweep-enabled
#2050
opened Jul 4, 2026 by
ezrasilvera
Collaborator
3 tasks
Update DSV4 GB300 Dynamo vLLM Recipes
full-sweep-fail-fast
#2010
opened Jul 3, 2026 by
hjjq
Collaborator
[WIP] Update Minimax M3 FP4 B200 Eagle
full-sweep-enabled
#2007
opened Jul 3, 2026 by
wzhao18
Collaborator
Update Minimax M3 FP4 B300 Eagle
full-sweep-enabled
#2006
opened Jul 3, 2026 by
wzhao18
Collaborator
[AMD] MiniMax-M3 MXFP8 MI355X vLLM: nightly + AITER-on TP4 + emulation linear / MiniMax-M3 MXFP8 MI355X vLLM:升级 nightly + 启用 AITER TP4 + emulation linear
full-sweep-enabled
#2003
opened Jul 3, 2026 by
hongxiayang
Collaborator
[AMD] MiniMax-M3 FP4/FP8 MI355X ATOMESH (disagg): refactor config & add MTP recipes / 重构配置并新增 MTP 配方 / 설정 리팩토링 및 MTP 레시피 추가
AMD
evals-only
Suppress throughput and run only eval jobs; combine with all-evals to expand selection
full-sweep-enabled
#2000
opened Jul 3, 2026 by
seungrokj
Collaborator
8 tasks
[WIP] Test Kimi 2.5 B300 Agg
full-sweep-enabled
#1998
opened Jul 3, 2026 by
wzhao18
Collaborator
[DNM][AMD] agentX benchmark (v1.0) / agentX 基准测试 (v1.0) / agentX 벤치마크 (v1.0)
#1996
opened Jul 3, 2026 by
seungrokj
Collaborator
Update Minimax M3 B300 FP4 vllm
full-sweep-enabled
#1994
opened Jul 2, 2026 by
wzhao18
Collaborator
[NV] perf: update MiniMax-M3 FP4 B300 vLLM MTP
full-sweep-fail-fast
#1991
opened Jul 2, 2026 by
anish-shanbhag
Collaborator
[WIP] [do not merge] Add MiniMax-M3 FP4 B200 Dynamo-vLLM disagg config
full-sweep-fail-fast-no-canary
Full sweep, no canary gate; first failure in a matrix cancels that matrix
#1982
opened Jul 2, 2026 by
jasonlizhengjian
Collaborator
test the GB300 cluster after the node patch
full-sweep-enabled
#1961
opened Jun 30, 2026 by
richardhuo-nv
Collaborator
Update Qwen3.5 FP4 MI355X MTP recipe with tuned env/flags / 使用调优的环境变量和参数更新 Qwen3.5 FP4 MI355X MTP 配方
#1957
opened Jun 29, 2026 by
amd-fuyuajin
Collaborator
[AMD] Tune MiniMax-M3 MXFP8 MI300X vLLM: async scheduling + big-prefill, fix conc256 EP8→EP1
full-sweep-enabled
#1951
opened Jun 29, 2026 by
ZhengGong-amd
Collaborator
7 of 8 tasks