Grokking as Dimensional Phase Transition in Neural Networks
📰 ArXiv cs.AI
Grokking in neural networks is a dimensional phase transition: the dynamics of the network's effective dimensionality cross from a sub-diffusive to a diffusive regime at the onset of generalization
Action Steps
- Analyze gradient avalanche dynamics across multiple model scales to locate the phase transition
- Apply finite-size scaling analysis to characterize the dimensional phase transition
- Investigate how the effective dimensionality relates to the onset of generalization
- Develop training methods that account for the dimensional phase transition
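The steps above hinge on two measurements that the summary leaves implicit: an effective-dimensionality estimate and a diffusion exponent that distinguishes sub-diffusive from diffusive dynamics. Below is a minimal sketch of one common way to compute both, using the participation ratio of a covariance spectrum and a log-log fit of the mean squared displacement of a parameter trajectory. These are standard proxies chosen here for illustration, not necessarily the paper's own estimators.

```python
import numpy as np

def effective_dimensionality(X):
    """Participation ratio of the covariance spectrum of X (samples x features).

    A common proxy for effective dimensionality:
    (sum of eigenvalues)^2 / (sum of squared eigenvalues).
    Equals d for isotropic data, approaches 1 for rank-1 data.
    """
    X = X - X.mean(axis=0, keepdims=True)
    cov = X.T @ X / (len(X) - 1)
    eig = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    return eig.sum() ** 2 / (eig ** 2).sum()

def diffusion_exponent(traj):
    """Fit alpha in MSD(t) ~ t^alpha from a trajectory (T x d) of parameters.

    alpha < 1 indicates sub-diffusive motion; alpha near 1 is diffusive.
    The fit is a slope in log-log coordinates.
    """
    disp = traj - traj[0]
    msd = (disp ** 2).sum(axis=1)[1:]   # skip t=0, where MSD is zero
    t = np.arange(1, len(traj))
    alpha, _ = np.polyfit(np.log(t), np.log(msd), 1)
    return alpha
```

In a grokking experiment, one would log checkpointed weights (or per-step gradients) during training and track both quantities over time; the hypothesized transition would show up as the diffusion exponent crossing toward 1 around the generalization onset.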
Who Needs to Know This
ML researchers and AI engineers: understanding grokking as a dimensional phase transition can inform training and generalization in neural networks and guide the design of more efficient learning algorithms
Key Insight
💡 Grokking is a dimensional phase transition in which the dynamics of effective dimensionality cross from sub-diffusive to diffusive at the onset of generalization
DeepCamp AI