
    Weight Normalization

    Also known as:
    Weight Norm
    WN
    Updated: 2/12/2026

Weight Normalization reparameterizes each weight vector into a direction and a magnitude, providing an alternative to batch normalization with no dependence on batch statistics.

    Quick Summary

Weight Normalization separates a layer's weights into direction and magnitude. It is simpler than BatchNorm and needs no batch statistics.

    Explanation

Each weight vector is reparameterized as w = g · (v / ||v||), where the scalar g controls the magnitude and the vector v controls the direction, so the two can be optimized independently. It is simpler than BatchNorm (no running statistics) and is applied directly to the weights rather than to the activations.
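
As a concrete illustration, here is a minimal sketch of a weight-normalized linear layer in PyTorch; the class name WeightNormLinear and the initialization scale are illustrative choices, not the authors' reference implementation:

```python
import torch
import torch.nn.functional as F

class WeightNormLinear(torch.nn.Module):
    """Linear layer with weight normalization: w = g * v / ||v||."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # v holds the direction; g holds one magnitude per output unit.
        self.v = torch.nn.Parameter(0.05 * torch.randn(out_features, in_features))
        # Initialize g = ||v|| so that w == v at the start of training.
        self.g = torch.nn.Parameter(self.v.detach().norm(dim=1))
        self.bias = torch.nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Rebuild w from g and v on every forward pass; gradients then
        # flow into magnitude and direction separately.
        w = self.g.unsqueeze(1) * self.v / self.v.norm(dim=1, keepdim=True)
        return F.linear(x, w, self.bias)

layer = WeightNormLinear(16, 8)
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 8])
```

In practice, PyTorch ships this reparameterization directly, as torch.nn.utils.weight_norm and, in recent releases, torch.nn.utils.parametrizations.weight_norm.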

    Marketing Relevance

Useful where BatchNorm is impractical (e.g., RNNs, generative models, reinforcement learning), since it requires no batch statistics.

    Origin & History

    Salimans & Kingma (OpenAI, 2016) introduced Weight Normalization. It found use in WaveNet (2016) and some RL systems. Less common than BatchNorm/LayerNorm, but conceptually influential.

    Comparisons & Differences

    Weight Normalization vs. Batch Normalization

BatchNorm normalizes activations using statistics computed over each mini-batch, so its behavior depends on batch composition and differs between training and inference. WeightNorm normalizes the weight vectors themselves, so each sample's output is independent of the rest of the batch.
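
The batch dependency is easy to check empirically. Below is a small sketch using PyTorch's built-in BatchNorm1d and the weight_norm parametrization (available in recent PyTorch releases); the tensor shapes are illustrative:

```python
import torch

torch.manual_seed(0)
x = torch.randn(8, 16)

bn = torch.nn.BatchNorm1d(16).train()
wn = torch.nn.utils.parametrizations.weight_norm(torch.nn.Linear(16, 16))

# BatchNorm: the first sample's output changes when the rest of the
# batch changes, because normalization uses mini-batch statistics.
print(torch.allclose(bn(x)[0], bn(x[:4])[0]))  # False

# WeightNorm: each sample's output is independent of the batch,
# because only the weights are normalized.
print(torch.allclose(wn(x)[0], wn(x[:4])[0]))  # True
```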
