📰 2026-03-20 04:00 更新
🔸 NanoGPT Slowrun: 10x Data Efficiency with Infinite Compute / NanoGPT Slowrun:无限算力下实现10倍数据效率
🔗 NanoGPT Slowrun: 10x Data Efficiency with Infinite Compute
🔥 27 points
原文:
We’ve achieved 10x data efficiency with NanoGPT Slowrun within a few weeks. An ensemble of 1.8B parameter models (18B total params) trained on 100M tokens matches what would normally require 1B tokens with a standard LM baseline. Data efficiency matters because compute grows much faster than data. Since our current scaling laws require proportional increases in both, intelligence will eventually be bottlenecked by data, not compute. This data efficiency result allows us to improve model pe…
译文:
我们在几周内使用NanoGPT Slowrun实现了10倍的数据效率。一个由1.8B参数模型组成的集成(总参数18B)在100M tokens上训练,即可达到标准LM基线通常需要1B tokens才能达到的水平。数据效率之所以重要,是因为算力的增长速度远快于数据。由于当前的缩放定律要求两者按比例同步增加,智能最终将受限于数据而非算力。这一数据效率结果使我们能够提升模型pe…
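原文提到"1.8B参数模型的集成(总参数18B)",即约10个独立训练的成员模型。下面是一个最小示意(假设做法:在概率空间对各成员的下一词分布取平均;函数名与细节均为示例假设,并非原帖实现):

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_next_token_probs(logits_per_model):
    # logits_per_model: shape (K, vocab) — one logit vector per ensemble member.
    # Average in probability space (one common ensembling choice; the post
    # does not specify its exact combination rule).
    probs = softmax(np.asarray(logits_per_model))
    return probs.mean(axis=0)

rng = np.random.default_rng(0)
K, vocab = 10, 8  # 10 members, matching 18B total / 1.8B per model
logits = rng.normal(size=(K, vocab))
p = ensemble_next_token_probs(logits)  # a valid distribution over the vocab
```

按此理解,"10倍数据效率"指该集成在100M tokens上的表现与单个基线模型在1B tokens上的表现相当(1B / 100M = 10)。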
自动更新 · 正文抓取 · 双语翻译