GPU Efficiencyarchive | Efficient Coder

How DeepSeek’s Engram Makes LLMs Cheaper & Smarter: The N-gram Lookup Table Breakthrough

3 months ago 高效码农

Offload Memorization to a Lookup Table, Let the GPU Reason: How DeepSeek’s Engram Makes LLMs Both Cheaper and Smarter ❝ 「Bottom line up front」 Transformers burn layers reconstructing static facts that could be retrieved in one hop. Engram adds an O(1) N-gram lookup table beside the MoE experts, keeps the same parameter and FLOP budget, and immediately gains 3–5 pts on knowledge, reasoning, code and long-context benchmarks. ❞ What this article will answer What exactly is Engram and is it a friend or foe to MoE? Why does a simple lookup table boost MMLU, BBH, HumanEval and even 32 k-needle …