
    Step Decay (Learning Rate)

    Also known as:
    Step LR
    Staircase Schedule
    MultiStep LR
    Updated: 2/12/2026

    The simplest learning rate schedule strategy: it reduces the LR by a constant factor after fixed intervals (epochs or steps).

    Quick Summary

    Step decay reduces the LR abruptly at fixed intervals. It is the simplest schedule strategy, but it has now largely been replaced by cosine annealing.

    Explanation

    A typical configuration reduces the LR by a factor of 0.1 every 30 epochs, as in the sketch below. Step decay is simple to implement and understand, but less smooth than cosine annealing.
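    A minimal Python sketch of the staircase rule (the function name and the 30-epoch / 0.1 defaults are illustrative, mirroring the typical configuration above):

        def step_decay_lr(base_lr: float, epoch: int, step_size: int = 30, gamma: float = 0.1) -> float:
            """Staircase schedule: multiply the base LR by gamma once per step_size epochs."""
            return base_lr * gamma ** (epoch // step_size)

        # With base_lr = 0.1: epochs 0-29 -> 0.1, epochs 30-59 -> 0.01, epochs 60+ -> 0.001
        for epoch in (0, 29, 30, 59, 60):
            print(epoch, round(step_decay_lr(0.1, epoch), 6))

    In PyTorch, the built-in equivalents are torch.optim.lr_scheduler.StepLR (fixed interval) and MultiStepLR (explicit milestones), which is where the "Step LR" and "MultiStep LR" aliases come from.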

    Marketing Relevance

    Step decay was the standard schedule in computer vision for years (e.g., the ResNet paper). It has now been mostly replaced by cosine annealing or one-cycle schedules.

    Common Pitfalls

    Abrupt LR drops can destabilize training. The drop timing and factor must be tuned manually (see the sketch below), and the resulting schedule is often less efficient than smooth alternatives.
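    The hand-tuned knobs are the drop points and the drop factor. A sketch of the milestone variant (function name, milestones, and gamma are illustrative assumptions) shows how many decisions this leaves to the practitioner:

        def multistep_decay_lr(base_lr: float, epoch: int, milestones=(30, 60, 90), gamma: float = 0.1) -> float:
            # Every milestone already passed multiplies the LR by gamma;
            # both the milestone list and gamma must be chosen by hand.
            drops = sum(1 for m in milestones if epoch >= m)
            return base_lr * gamma ** drops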

    Origin & History

    Step decay was standard in ImageNet training recipes (AlexNet 2012, VGG 2014, ResNet 2015). Cosine annealing (2017) and one-cycle (2018) showed consistently better results and replaced step decay as the standard.

    Comparisons & Differences

    Step Decay (Learning Rate) vs. Cosine Annealing

    Step decay is a staircase (abrupt jumps), while cosine annealing is smooth and continuous; the gentler transition usually leads to better results.

    Step Decay (Learning Rate) vs. Exponential Decay

    Step decay lowers the LR discretely at fixed points, while exponential decay lowers it continuously by a constant multiplicative factor per step. Exponential decay is smoother but can be harder to tune. The sketch below makes both contrasts concrete.
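    A small Python sketch comparing the three rules over an illustrative 90-epoch run (the horizon T and all hyperparameters are assumptions; cosine follows the standard eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T)) form):

        import math

        def step(t, base=0.1, step_size=30, gamma=0.1):
            return base * gamma ** (t // step_size)   # abrupt drops at epochs 30, 60, ...

        def exponential(t, base=0.1, gamma=0.95):
            return base * gamma ** t                  # continuous decay every epoch

        def cosine(t, base=0.1, eta_min=0.0, T=90):
            return eta_min + 0.5 * (base - eta_min) * (1 + math.cos(math.pi * t / T))

        for t in (0, 15, 30, 60, 90):
            print(t, round(step(t), 5), round(exponential(t), 5), round(cosine(t), 5))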

