How I Got 2nd-Order LLM Convergence for Free: The SCAO Optimizer
📰 Medium · Deep Learning
If you are fine-tuning Large Language Models (LLMs) today, you are probably using AdamW. It is the undisputed king of optimizers: stable…