Spectral Entropy Collapse as an Empirical Signature of Delayed Generalisation in Grokking

📰 ArXiv cs.AI

arXiv:2604.13123v1
Announce Type: cross

Abstract: Grokking -- delayed generalisation long after memorisation -- lacks a predictive mechanistic explanation. We identify the normalised spectral entropy $\tilde{H}(t)$ of the representation covariance as a scalar order parameter for this transition, validated on 1-layer Transformers on group-theoretic tasks. Five contributions: (i) Grokking follows a two-phase pattern: norm expansion then entropy collapse. (ii) $\tilde{H}$ crosses a stable threshold
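The abstract names the normalised spectral entropy $\tilde{H}(t)$ of the representation covariance but the excerpt here does not give its formula. The following is a minimal sketch under a common convention, which may differ from the paper's: take the eigenvalues of the covariance matrix, normalise them to sum to one, compute their Shannon entropy, and divide by $\log d$ so the result lies in $[0, 1]$. The function name `normalised_spectral_entropy`, the `eps` cutoff, and the choice of sample covariance are all assumptions made for illustration.

```python
import numpy as np


def normalised_spectral_entropy(reps, eps=1e-12):
    """Entropy of the covariance eigenvalue spectrum, scaled to [0, 1].

    reps: array of shape (n_samples, d), one representation per row.
    Assumed definition (not confirmed by the abstract):
        H_tilde = -sum_i p_i log p_i / log d,  p_i = lambda_i / sum_j lambda_j
    """
    reps = np.asarray(reps, dtype=np.float64)
    n, d = reps.shape
    if d < 2:
        return 0.0  # log(d) would be zero; a 1-D spectrum carries no entropy

    # Sample covariance of the centred representations.
    centred = reps - reps.mean(axis=0, keepdims=True)
    cov = centred.T @ centred / max(n - 1, 1)

    # eigvalsh is appropriate because cov is symmetric; clip tiny negatives
    # that arise from floating-point error.
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    total = eigvals.sum()
    if total <= eps:
        return 0.0  # degenerate case: all representations identical

    p = eigvals / total
    p = p[p > eps]  # treat 0 * log(0) as 0
    entropy = -np.sum(p * np.log(p))
    return float(entropy / np.log(d))


# Illustrative inputs (synthetic, not from the paper).
rng = np.random.default_rng(0)
isotropic = rng.normal(size=(5000, 16))                          # variance spread evenly
rank_one = rng.normal(size=(5000, 1)) @ rng.normal(size=(1, 16))  # variance on one axis

h_isotropic = normalised_spectral_entropy(isotropic)
h_rank_one = normalised_spectral_entropy(rank_one)
```

Under this convention, representations whose variance is spread evenly over all directions give a value near 1, while representations concentrated on a few directions give a value near 0. An "entropy collapse" in the sense of the abstract would then appear as this scalar dropping over training steps when it is evaluated on a fixed batch at each checkpoint.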

Published 16 Apr 2026