Cosine Annealing
A learning rate schedule strategy that gently reduces the learning rate from a maximum value to near zero following a cosine curve.
Cosine annealing lowers the learning rate along a cosine curve. It is a standard schedule for LLM and vision-model training, and it is gentler than step decay.
Explanation
Cosine annealing reduces the learning rate more gently than step decay and keeps a very small rate available late in training, which enables fine-grained convergence. Warm restarts periodically reset the learning rate back to its maximum and start a new cosine cycle.
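The schedule above can be sketched in a few lines of pure Python. This is a minimal illustration, not a reference implementation; the function names and the `t_mult` default (each cycle twice as long as the last, as in common SGDR configurations) are assumptions for the example.

```python
import math

def cosine_annealing(step, total_steps, lr_max, lr_min=0.0):
    """Cosine annealing: the LR traces half a cosine wave from lr_max
    at step 0 down to lr_min at total_steps."""
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

def cosine_with_warm_restarts(step, cycle_len, lr_max, lr_min=0.0, t_mult=2):
    """SGDR-style schedule: run cosine annealing within a cycle, then
    restart at lr_max; each new cycle is t_mult times longer."""
    while step >= cycle_len:
        step -= cycle_len      # move into the next cycle
        cycle_len *= t_mult    # cycles grow geometrically
    return cosine_annealing(step, cycle_len, lr_max, lr_min)
```

For example, with `total_steps=100` and `lr_max=0.1`, the plain schedule gives 0.1 at step 0, 0.05 at the midpoint, and 0.0 at step 100; the warm-restart variant jumps back to 0.1 at step 100 and begins a longer cycle.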
Marketing Relevance
Cosine annealing is the de facto standard schedule for LLM pre-training and vision-model training; most modern training recipes use it.
Common Pitfalls
The total number of training steps must be known in advance. Warm restarts introduce a cycle-length hyperparameter that needs tuning. And cosine annealing is not always better than simple linear decay.
Origin & History
Loshchilov & Hutter (2017) introduced SGDR (SGD with Warm Restarts), combining cosine annealing with periodic restarts. The Chinchilla paper (2022) used cosine decay for optimal LLM training. Standard since then.
Comparisons & Differences
Cosine Annealing vs. Step Decay
Step decay reduces LR abruptly at fixed intervals; cosine annealing lowers it smoothly and continuously.
Cosine Annealing vs. Linear Decay
Linear decay lowers the learning rate at a constant rate; cosine annealing decreases more slowly at the start and end and faster in the middle, so it keeps the learning rate higher for longer in early training.
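The two comparisons above can be made concrete by evaluating all three schedules at a few checkpoints. This is a toy sketch; the horizon of 1000 steps, the base rate of 0.1, and the step-decay settings (drop by 10x every 300 steps) are illustrative assumptions, not values from the source.

```python
import math

def cosine(step, total, lr0):
    # Smooth, continuous decay from lr0 to 0 along a half cosine wave.
    return 0.5 * lr0 * (1 + math.cos(math.pi * step / total))

def linear(step, total, lr0):
    # Uniform decay from lr0 to 0.
    return lr0 * (1 - step / total)

def step_decay(step, lr0, drop=0.1, every=300):
    # Abrupt 10x drops at fixed intervals.
    return lr0 * (drop ** (step // every))

total, lr0 = 1000, 0.1
for s in (0, 250, 500, 750, 1000):
    print(f"step {s:4d}: cosine={cosine(s, total, lr0):.4f}  "
          f"linear={linear(s, total, lr0):.4f}  "
          f"step={step_decay(s, lr0):.6f}")
```

At step 250 the cosine schedule (~0.0854) sits above linear (0.0750); the two meet exactly at the midpoint (0.05); by step 750 cosine (~0.0146) is below linear (0.0250), while step decay has already jumped discontinuously at steps 300 and 600.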