Training Transformers in Cosine Coefficient Space

📰 ArXiv cs.AI

Parameterizing transformer weight matrices in the 2D discrete cosine transform (DCT) domain, so that training happens in cosine coefficient space, improves performance while reducing the number of trainable parameters

Advanced · Published 7 Apr 2026
Action Steps
  1. Parameterize weight matrices of a transformer in the 2D discrete cosine transform (DCT) domain
  2. Retain only the lowest-frequency coefficients to reduce dimensionality
  3. Reconstruct the full weight matrix via the inverse DCT at each forward pass
  4. Update the spectral coefficients directly through backpropagation
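The four steps above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation: the function names, the choice of a square k×k low-frequency band, and the initialization scale are assumptions, and in practice the coefficients would be registered as trainable parameters so the framework's autodiff handles step 4.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix; row j holds the j-th cosine frequency."""
    i = np.arange(n)
    j = np.arange(n)[:, None]
    C = np.cos(np.pi * (2 * i + 1) * j / (2 * n)) * np.sqrt(2.0 / n)
    C[0] /= np.sqrt(2.0)  # DC row scaling makes the basis orthonormal
    return C

def reconstruct_weight(coeffs: np.ndarray, out_features: int, in_features: int) -> np.ndarray:
    """Step 3: inverse 2D DCT restricted to the retained k x k low-frequency
    band, W = Bo^T A Bi, where Bo and Bi are truncated DCT bases."""
    k = coeffs.shape[0]
    Bo = dct_matrix(out_features)[:k]  # (k, out) lowest out-dim frequencies
    Bi = dct_matrix(in_features)[:k]   # (k, in) lowest in-dim frequencies
    return Bo.T @ coeffs @ Bi          # (out, in) dense weight matrix

# Steps 1-2: a k*k block of low-frequency coefficients stands in for
# a full out x in weight matrix (64 numbers instead of 4096 here).
k, d_out, d_in = 8, 64, 64
rng = np.random.default_rng(0)
coeffs = rng.standard_normal((k, k)) * 0.02  # the actual trainable parameters
W = reconstruct_weight(coeffs, d_out, d_in)  # rebuilt at every forward pass
print(W.shape, coeffs.size, d_out * d_in)
```

Because the truncated inverse DCT is a fixed linear map, gradients with respect to `W` flow straight back to `coeffs`, which is what lets step 4 update the spectral coefficients directly.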
Who Needs to Know This

ML researchers and engineers working on transformer models: the approach can improve performance and parameter efficiency, and it applies across a range of NLP tasks

Key Insight

💡 Parameterizing weight matrices in the DCT domain can improve transformer model performance

Share This
💡 Train transformers in cosine coefficient space for improved performance