TurboQuant model weight compression support added to Llamacpp
📰 2026-04-04 18:30 update 🔥 10 points
Original: Adds CUDA dequantization for TQ4_1S (5.0 bpv) and TQ3_1S (4.0 bpv), two WHT-rotated weight compression types. These achieve a 27-37% model size reduction at a +1.0-1.9% perplexity (PPL) increase on the Qwen and Phi model families. …
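To make "WHT-rotated weight compression" concrete, here is a minimal Python sketch of the general idea: rotate a block of weights with an orthonormal Walsh-Hadamard transform, quantize the rotated values uniformly, and undo both steps on dequantization. It is an illustration only, not the TQ4_1S/TQ3_1S format or the CUDA kernel from the change. The function names, the 32-value block size, the scale-plus-minimum encoding, and the 32 bits of per-block metadata are all assumptions; the item above does not describe the actual layout.

```python
import math


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform (length must be a power of two)."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly pass: combine elements that are h apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform orthonormal
    return [v * scale for v in out]


def quantize_block(weights, bits=4):
    """Rotate one block, then quantize uniformly to `bits` bits (hypothetical layout)."""
    rotated = fwht(weights)
    lo, hi = min(rotated), max(rotated)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in rotated]
    return codes, scale, lo


def dequantize_block(codes, scale, lo):
    """Rebuild the rotated values, then rotate back."""
    rotated = [c * scale + lo for c in codes]
    # The orthonormal transform above is its own inverse.
    return fwht(rotated)


def bits_per_value(bits, block_size=32, overhead_bits=32):
    """Average storage cost per value, counting assumed per-block metadata."""
    return bits + overhead_bits / block_size
```

Under these assumptions, `bits_per_value(4)` and `bits_per_value(3)` come out to 5.0 and 4.0, which lines up with the quoted bpv figures, though that match does not confirm the real block layout. The rotation is useful because it spreads a single outlier weight across the whole block, so the uniform quantization grid is not stretched by one large value.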