This page is an observation snapshot taken on 2026-06-01. For the model's current information, see → /m/w-ahmad__qwen35-9b-gguf-moq-mtp/

Qwen3.5-9B (Qwen)

MoQ-quantized build of Qwen3.5-9B with MTP speculative decoding, suited to fast local generation

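Why it was selected: A MoQ-quantized model based on Qwen3.5-9B, with ready-made GGUF files that run directly under llama.cpp; third-party evaluation shows it outperforming comparable quantizations.
Positioning: An alternative to uniform quantizations such as Unsloth Dynamic, with higher quality at the same file size
Good for: 9B model inference under tight local memory / speeding up text generation with MTP speculative decoding
Not for: tasks that demand maximum precision or the official full-precision model
Size: 9B · 32k
License: MIT · commercial use permitted
Frameworks: llama.cpp / llama-cpp-python
Lineage: quantized from Qwen3.5-9B
Credibility: 6k downloads; based on Qwen3.5-9B; MoQ quantization beats Unsloth Dynamic on WikiText by roughly 10%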

Community testing

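Community opinion is sharply split. For agentic coding and small local tasks, users report better-than-expected speed and value, and the benchmark results are strong. There are also many reported failures in following tool-call formats, in keeping logic consistent across languages, and in factual accuracy, so real-world experience falls short of the benchmark rankings.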

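Works well for agentic coding when paired with Kilo Code, Roo Code, or Continue
Code completion quality is better than the smaller Qwen2.5 Coder models
In some users' tests it produced code that could be shipped as is, beating Qwen 2.5 Coder 32B overall
Runs locally on modest hardware, such as a 2019 laptop with 8GB of memory
Suited to embedded uses such as local tool calling and information extraction
Suited to small practical tasks such as email classification
Outperforms models with several times as many parameters on benchmarks such as GPQA Diamond, MMLU-Pro, and MMMLU
Beats Gemma-4-12B on 5 of 8 shared benchmarks despite having fewer parameters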

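Follows custom tool-call formats (for example <call>...</call>) poorly
Certain prompt strings break the logic of the output
Shows logic problems in multilingual exchanges, and occasionally in English as well
Hallucinates noticeably on factual tasks
High benchmark scores do not translate into the same level of reliability in real use
Cannot replace large models such as Opus for professional day-to-day work
Gemma-4-12B may be slightly better at pure coding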
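
The tool-call format complaint above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical convention in which each tool call is a JSON object wrapped in <call>...</call>. The source names only the tag, so the JSON payload, the `name` field, and the function name `extract_calls` are illustrative assumptions, not something the model or its tooling defines.

```python
import json
import re

# Hypothetical convention: a tool call is a JSON object between <call> and </call>.
CALL_RE = re.compile(r"<call>(.*?)</call>", re.DOTALL)


def extract_calls(text):
    """Return (valid_calls, errors) found in one model response."""
    valid, errors = [], []
    for raw in CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"invalid JSON: {exc.msg}")
            continue
        # Assumed schema: a JSON object with a string "name" field.
        if not isinstance(payload, dict) or not isinstance(payload.get("name"), str):
            errors.append("missing string 'name' field")
            continue
        valid.append(payload)
    # An opening tag without a matching close suggests a truncated or malformed call.
    if text.count("<call>") != text.count("</call>"):
        errors.append("unbalanced <call> tags")
    return valid, errors


if __name__ == "__main__":
    ok, problems = extract_calls('<call>{"name": "search", "args": {"q": "x"}}</call>')
    print(ok, problems)
```

Running a batch of responses through a checker like this and counting how many return a non-empty error list gives a rough adherence rate for whichever format a given setup actually uses.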

Sources

Qwen3.5-9B is actually quite good for agentic coding - Reddit
Qwen3.5-9B Surprised Me - Faster and More Reliable Than ... - Reddit
Qwen 3.5 9B Low Quality Performance : r/ollama - Reddit
Is Qwen3.5-9B enough for Agentic Coding? : r/LocalLLaMA - Reddit
Just want to echo the recommendation for qwen3.5:9b. This is a ...
I have spent a HUGE amount of time the last two years ...
Qwen 2.5 -> 3 -> 3.5, smallest models. Incredible improvement over ...
Qwen3.5-9B tops every AI benchmark right now, but that's not how ...
gemma-4-12b-it vs Qwen3.5-9B on shared benchmarks - Reddit
qwen 3.5-9b beats bigger models in reasoning - Facebook
A 9B Model Just Beat a 120B One. Here's What Nobody's Telling You.

As of 2026-06-21

Quick start

ollama run hf.co/w-ahmad/Qwen3.5-9B-GGUF-MoQ-MTP

Score breakdown

Q1: Can it be plugged in and used today? 5 / 5
Q2: Is there credible evidence? 3 / 5
Q3: Is it something new? 3 / 5
Total: 11
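
The total appears to be the plain sum of the three question scores. A minimal sketch of that arithmetic, assuming equal weighting and a maximum of 5 per question (the page does not state the formula, and the variable names are illustrative):

```python
# Scores as listed above; equal weighting is an assumption, not a documented rule.
scores = {"Q1": 5, "Q2": 3, "Q3": 3}
max_per_question = 5

total = sum(scores.values())
max_total = max_per_question * len(scores)

print(f"Total: {total} / {max_total}")
```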

HuggingFace raw data (fetched 2026-06-01)

Author: w-ahmad
Task type: text-generation
Inference library: gguf
Downloads: 6,044
Likes: 0
License: MIT
Tags: gguf, MoQ, mixture-of-quants, GGUF, QWEN, quantization, text-generation, en, base_model:Qwen/Qwen3.5-9B, base_model:quantized:Qwen/Qwen3.5-9B, license:mit, endpoints_compatible, region:us, conversational
