Paper Highlights: Grokking Structure with Transformers

Uploaded: 2024-02-10T17:39:49+00:00
Channel: Pister Labs

Reading Structural Grokking in Vanilla Transformers by Hoogland et al. This paper challenges the idea that validation accuracy should determine when to stop training transformer models. https://arxiv.org/abs/2305.18741
