AMD has announced Day-0 support for Alibaba's Qwen 3.5, the latest open-weight large language model from the Qwen team, across AMD Instinct MI300X, MI325X, and MI35X GPUs. The enablement arrives in close collaboration with the Qwen team and ships fully optimized through the ROCm software stack, allowing developers to deploy the model immediately at production scale.

Designed for Long-Context AI and Enterprise-Scale Workloads
Qwen 3.5 targets long-context reasoning and multimodal workflows, supporting context windows up to 256K tokens. To avoid the quadratic scaling limits of traditional Transformers, the model introduces a Hybrid Attention design that alternates full multi-head attention with linear attention layers. This approach preserves recall while reducing compute overhead as sequences grow.
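To see why alternating layer types helps, consider a simple cost model: full attention does work proportional to the square of the sequence length, while linear attention scales linearly. The sketch below is illustrative only; the layer pattern, 3:1 linear-to-full ratio, and layer count are assumptions, not Qwen 3.5's actual configuration.

```python
# Hypothetical cost model for a hybrid attention stack that alternates
# full (quadratic) and linear attention layers. Constants are illustrative.

def attention_cost(seq_len: int, layer_types: list[str]) -> int:
    """Total token-pair work across the stack (arbitrary units)."""
    cost = 0
    for kind in layer_types:
        if kind == "full":       # multi-head attention: O(n^2)
            cost += seq_len ** 2
        elif kind == "linear":   # linear attention: O(n)
            cost += seq_len
    return cost

# Assumed pattern: three linear layers per full-attention layer, 48 layers total.
stack = ["linear", "linear", "linear", "full"] * 12
dense_stack = ["full"] * len(stack)

n = 256_000  # a 256K-token context
print(attention_cost(n, stack) / attention_cost(n, dense_stack))  # ~0.25 of dense cost
```

At long contexts the linear layers contribute almost nothing to total cost, so attention compute falls roughly in proportion to the fraction of layers that remain full attention.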
At the core of linear scaling are Gated Delta Networks, which keep complexity proportional to sequence length. Inference throughput improves notably beyond 32K tokens, a range where many dense models slow down.
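The intuition behind a gated delta-rule recurrence is that a fixed-size state matrix is updated once per token, so per-token cost stays constant and total cost grows linearly with sequence length. The toy below sketches that update; dimensions, gate values, and the pure-Python matrix math are illustrative assumptions, not Qwen 3.5's implementation.

```python
# Toy gated delta-rule update: S <- alpha * S (I - beta * k k^T) + beta * v k^T
# S is a fixed d x d state; one such update runs per token, giving O(n) total cost.

def gated_delta_step(S, k, v, alpha, beta):
    """Decay the state with gate alpha, erase along key k, write value v."""
    d = len(k)
    Sk = [sum(S[i][m] * k[m] for m in range(d)) for i in range(d)]  # S @ k
    return [
        [alpha * (S[i][j] - beta * Sk[i] * k[j]) + beta * v[i] * k[j]
         for j in range(d)]
        for i in range(d)
    ]

def read(S, q):
    """Query the state: output = S @ q."""
    return [sum(S[i][j] * q[j] for j in range(len(q))) for i in range(len(q))]

# Store v under a unit-norm key k, then retrieve it with the same key.
d = 4
S = [[0.0] * d for _ in range(d)]
k = [1.0, 0.0, 0.0, 0.0]
v = [0.5, -0.25, 0.0, 1.0]
S = gated_delta_step(S, k, v, alpha=1.0, beta=1.0)
print(read(S, k))  # recovers v
```

Because the state never grows with sequence length, throughput stays flat where a dense attention cache would balloon, which is consistent with the gains the model shows beyond 32K tokens.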
Ultra-Sparse MoE Design Reduces Compute Overhead
Qwen 3.5 advances Mixture-of-Experts with a Shared Expert path that processes every token for stability, alongside Top-K routed experts that activate only a subset of specialists during inference. This ultra-sparse design delivers dense-model-level quality while using far less compute—an efficient match for Instinct GPUs in cost-sensitive enterprise deployments.
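The routing described above can be sketched in a few lines: every token passes through the shared expert, while only its top-k routed experts (out of a much larger pool) actually run. The expert functions, router scores, and k value below are toy stand-ins, not Qwen 3.5's real architecture.

```python
# Hypothetical sketch of ultra-sparse MoE dispatch with a shared expert path.

def moe_forward(token, router_scores, experts, shared_expert, k=2):
    """Combine the always-on shared expert with the top-k routed experts."""
    # Pick the k highest-scoring experts for this token.
    top = sorted(range(len(router_scores)),
                 key=lambda i: router_scores[i], reverse=True)[:k]
    total = sum(router_scores[i] for i in top)
    out = shared_expert(token)                    # processes every token
    for i in top:                                 # sparse: only k of many run
        out += (router_scores[i] / total) * experts[i](token)
    return out

# Toy example: 8 experts as simple scalings; only 2 activate per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
shared = lambda x: 0.1 * x
scores = [0.0, 0.0, 0.3, 0.0, 0.0, 0.7, 0.0, 0.0]
print(moe_forward(1.0, scores, experts, shared, k=2))
```

The compute saving follows directly: with k routed experts active out of N, the routed FLOPs per token scale with k/N rather than with the full parameter count.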
Native Multimodal AI Capabilities for Visual Workflows
The model is multimodal by design, integrating a DeepStack Vision Transformer and 3D convolutions that model time as a third dimension when processing video. By merging features from multiple visual encoder layers, Qwen 3.5 captures both fine detail and high-level context. These capabilities enable “visual agent” use cases such as object identification in complex environments.
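The multi-layer merging idea can be illustrated simply: instead of forwarding only the vision encoder's final layer, per-patch features are tapped from several depths and combined, so early-layer detail and late-layer semantics both reach the language model. The tap-layer choice and merge-by-concatenation below are assumptions for the sketch, not DeepStack's exact mechanism.

```python
# Illustrative DeepStack-style feature merging across encoder depths.

def merge_deepstack_features(layer_outputs, tap_layers):
    """Concatenate per-patch features taken from selected encoder layers.

    layer_outputs: list of layers, each a list of per-patch feature vectors.
    tap_layers: indices of the layers whose features are merged.
    """
    num_patches = len(layer_outputs[0])
    merged = []
    for p in range(num_patches):
        feat = []
        for layer in tap_layers:
            feat.extend(layer_outputs[layer][p])  # stack features per patch
        merged.append(feat)
    return merged

# Toy encoder: 4 layers, 2 patches, 3-dim features per patch.
layers = [[[float(l)] * 3 for _ in range(2)] for l in range(4)]
merged = merge_deepstack_features(layers, tap_layers=[0, 2, 3])
print(len(merged), len(merged[0]))  # 2 patches, 9-dim merged features
```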
Optimized Out of the Box with SGLang and vLLM
AMD delivers Day-0 performance through SGLang and vLLM. Linear attention runs via Triton-based kernels on ROCm, Shared Expert paths leverage optimized hipBLASLt GEMMs, and vision components rely on standard MIOpen and PyTorch kernels. Large HBM capacity on MI300X/MI325X/MI35X lets teams host full-scale models and massive contexts on a single GPU or node, simplifying deployment.
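As a rough deployment sketch, vLLM's standard serving entry point applies on ROCm; the model identifier below is a placeholder (check the Qwen organization on Hugging Face for the actual Qwen 3.5 repository name), and the parallelism and context-length flags are illustrative, not recommended settings.

```shell
# Hypothetical single-node launch on an Instinct GPU with ROCm-enabled vLLM.
# Replace <qwen3.5-model-id> with the actual Hugging Face model repository.
vllm serve <qwen3.5-model-id> \
  --tensor-parallel-size 1 \
  --max-model-len 262144
```

On MI300X-class parts, the large HBM pool is what makes a single-GPU or single-node `--tensor-parallel-size` viable at such context lengths.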
With Day-0 support, AMD positions Qwen 3.5 as a production-ready, open-weight alternative optimized for its data-center accelerators. The move strengthens AMD’s AI stack for developers building long-context reasoning systems, multimodal agents, and enterprise platforms—without forcing trade-offs between scale, speed, and cost.
