Qwen (Alibaba Cloud) analysis
Thesis
Qwen (Alibaba Cloud) runs one of the most prolific open-weight release cadences in the field. It ships a full ladder of dense and Mixture-of-Experts (MoE) models, currently the Qwen3.5 and Qwen3.6 generations, across text, vision, speech, and image, alongside a parallel agentic coding stack (qwen-code, 25k stars). Adoption is heavy: the most-downloaded checkpoints each pull between 5 and 10 million Hugging Face downloads in a 30-day window, and the largest flagship still clears 1 million. The lab pairs frontier-scale MoE models (up to Qwen3.5-397B-A17B) with a dense small-model line tuned for production and mobile, and backs both with a steady stream of first-party research writing.
Shipping
Across modalities, the most-downloaded checkpoints in the captured data are the small dense Qwen3.5 instruct models: Qwen/Qwen3.5-4B at 9,934,423 30-day downloads (614 likes) and Qwen/Qwen3.5-9B at 9,277,612 (1,536 likes). The new Qwen3.6 generation is already pulling heavy traffic: Qwen/Qwen3.6-35B-A3B (MoE) at 5,852,936 downloads / 2,038 likes and Qwen/Qwen3.6-27B at 5,541,236 / 1,638 likes.
The MoE line spans sizes: the flagship Qwen3.5-397B-A17B (403B params, 17B active; 1,077,681 downloads, 1,504 likes), Qwen3.5-122B-A10B (815,955 downloads), and Qwen3.5-35B-A3B (2,754,795 downloads). A dense ladder fills out production and edge use: Qwen3.5-27B (2,857,230), Qwen3.5-2B (1,841,841), and Qwen3.5-0.8B (2,657,382). Matching -Base variants ship for most sizes (e.g. Qwen3.5-4B-Base, 205,712 downloads), consistent with the standard base-plus-instruct release pattern.
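To make the A3B/A10B/A17B suffixes concrete, the following is a minimal sketch that reads the total and active parameter counts off a checkpoint name and computes the implied active share. It assumes the naming pattern "<family>-<total>B-A<active>B" seen in the names above; the function name active_fraction and the regular expression are illustrative, not part of any Qwen tooling, and the figures come from the names rather than from measured parameter counts.

```python
import re
from typing import Optional

# Assumed naming convention (inferred from the checkpoint names above):
#   "<family>-<total>B-A<active>B", e.g. "Qwen3.5-397B-A17B".
MOE_NAME = re.compile(r"-(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B$")


def active_fraction(model_name: str) -> Optional[float]:
    """Return active/total parameters implied by the name, or None if the
    name carries no "-A<n>B" suffix (treated here as a dense model)."""
    match = MOE_NAME.search(model_name)
    if match is None:
        return None
    total_b, active_b = (float(group) for group in match.groups())
    return active_b / total_b


if __name__ == "__main__":
    names = [
        "Qwen3.5-397B-A17B",
        "Qwen3.5-122B-A10B",
        "Qwen3.5-35B-A3B",
        "Qwen3.5-27B",
    ]
    for name in names:
        fraction = active_fraction(name)
        label = "dense" if fraction is None else f"{fraction:.1%} of parameters active"
        print(f"{name}: {label}")
```

By this name-based arithmetic the flagship activates roughly 4% of its parameters, while the two smaller MoE checkpoints land between 8% and 9%. The exact ratio for the flagship depends on whether the 397B in the name or the 403B listed parameter count is used as the denominator.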
On GitHub, the lab's top repos are QwenLM/Qwen3 (27,290 stars), QwenLM/qwen-code (25,009), QwenLM/Qwen (21,255), and the multimodal/coding lines QwenLM/Qwen3-VL (19,329), QwenLM/Qwen3-Coder (16,601), and QwenLM/Qwen-Agent (16,491). Speech and image repos are active too: Qwen3-TTS (11,800), Qwen-Image (7,977), and Qwen3-Omni (3,819). Release activity is concentrated in the qwen-code agentic CLI, which is on a near-daily nightly cadence: the latest tagged builds run from v0.17.1 through nightlies dated 20260604 to 20260608. Supporting repos qwen-code-examples (v0.1) and qwen-code-action (v0.1.1) have tagged releases as well.
Research themes
Qwen's first-party writing traces a consistent arc from early unified multimodal pretraining to today's reasoning and agentic systems:
- Generalist / unified multimodal models — the lab's roots: OFA: Towards Building a One-For-All Model, OFASys: Enabling Multitask Learning with One Line of Code, and Chinese CLIP.
- Foundation-model generations — Introducing Qwen, Introducing Qwen1.5, Hello Qwen2, and Qwen2.5: A Party of Foundation Models.
- Mixture-of-Experts efficiency — Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters, the through-line behind today's A3B/A10B/A17B releases.
- Long context — Generalizing an LLM from 8k to 1M Context using Qwen-Agent and Extending the Context Length to 1M Tokens!.
- Math & reasoning — Introducing Qwen2-Math, Qwen2.5-Math, Towards Effective Process Supervision in Mathematical Reasoning, and the reasoning-first QwQ: Reflect Deeply on the Boundaries of the Unknown plus its visual counterpart QVQ: To See the World with Wisdom.
- Code — Code with CodeQwen1.5 and Qwen2.5-Coder: Code More, Learn More!, where the Coder family post positions Qwen2.5-Coder-32B-Instruct as a SOTA open code model "matching the coding capabilities of GPT-4o."
- Audio & vision modalities — Qwen-VL, Qwen2-VL: To See the World More Clearly, and Qwen2-Audio: Chat with Your Voice!.
Hiring & scaling
The captured roles all sit in the Tongyi large-model division (通义大模型事业部), based in Hangzhou (杭州): algorithm engineer (算法工程师), R&D engineer (研发工程师), and their senior counterparts (高级算法工程师, 高级研发工程师). The split between algorithm and engineering tracks, each at both standard and senior levels, suggests continued investment in both core model research and the production/infra stack behind it. Within this sample, hiring is concentrated in a single Hangzhou hub rather than spread across distributed teams.
Traction highlights
- Hacker News: the standout thread is QwenLM/Qwen3-Omni at 571 points / 142 comments, far ahead of the next two, QwenLM/Qwen (36 points, 51 comments) and QwenLM/Qwen3-VL-Embedding (11 points). Multimodal/omni work draws the strongest external attention.
- Most-starred repos: QwenLM/Qwen3 (27,290) and QwenLM/qwen-code (25,009) lead, with Qwen3-VL (19,329) and Qwen3-Coder (16,601) close behind.
- Most-downloaded models: Qwen3.5-4B (9.93M) and Qwen3.5-9B (9.28M) lead 30-day downloads. Small dense models account for the largest share of download volume, a rough proxy for real-world usage, while the 397B-A17B flagship still clears 1M.