
    Sinusoidal Positional Encoding

    Also known as:
    Sin/Cos Encoding
    Fourier Positional Encoding
    Fixed Positional Encoding
    Updated: 2/10/2026

    The original positional encoding from the Transformer paper, using sine and cosine functions of different frequencies.

    Quick Summary

    Sinusoidal encoding uses sine and cosine waves of different frequencies as a position signal; it was the first solution to the position problem, introduced in the Transformer paper (2017).

    Explanation

    PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)). Each pair of dimensions corresponds to a sinusoid, with wavelengths forming a geometric progression from 2π to 10000·2π. Advantage: because the encoding is deterministic, it can in theory generalize to arbitrary sequence lengths.
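
    A minimal NumPy sketch of these formulas (the function name and the demo shapes are illustrative, not from the paper); note that d_model must be even:

        import numpy as np

        def sinusoidal_positional_encoding(max_len: int, d_model: int) -> np.ndarray:
            """Build a (max_len, d_model) table of sinusoidal position encodings."""
            positions = np.arange(max_len)[:, np.newaxis]   # (max_len, 1)
            dims = np.arange(0, d_model, 2)[np.newaxis, :]  # (1, d_model/2)
            angles = positions / np.power(10000.0, dims / d_model)
            pe = np.zeros((max_len, d_model))
            pe[:, 0::2] = np.sin(angles)  # even dimensions: sine
            pe[:, 1::2] = np.cos(angles)  # odd dimensions: cosine
            return pe

        pe = sinusoidal_positional_encoding(max_len=128, d_model=64)
        print(pe.shape)  # (128, 64); this table is added to the token embeddings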

    Marketing Relevance

    Historically important as the first mechanism for injecting position information into Transformers; today it has mostly been replaced by RoPE or learned embeddings.

    Common Pitfalls

    In practice, it does not generalize well to sequence lengths unseen during training. It encodes absolute rather than relative positions. Modern LLMs use RoPE instead of sinusoidal encoding.

    Origin & History

    Vaswani et al. (2017) chose sinusoidal encoding because, for any fixed offset k, PE(pos + k) is a linear function of PE(pos), which they hypothesized would let the model attend by relative position (verified in the sketch below). BERT (2018) replaced it with learned positional embeddings. RoPE (2021) and ALiBi (2022) superseded both.
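
    A small check of this relative-position property (the names here are illustrative): for a single frequency omega, the (sin, cos) pair at position pos + k is a fixed rotation of the pair at pos, and the rotation depends only on the offset k:

        import numpy as np

        def pe_pair(pos: float, omega: float) -> np.ndarray:
            """The (sin, cos) pair for a single frequency omega at position pos."""
            return np.array([np.sin(pos * omega), np.cos(pos * omega)])

        pos, k, omega = 7.0, 3.0, 0.05  # arbitrary position, offset, and frequency
        # Rotation matrix by angle k*omega; it does not depend on pos.
        R = np.array([[ np.cos(k * omega), np.sin(k * omega)],
                      [-np.sin(k * omega), np.cos(k * omega)]])
        print(np.allclose(R @ pe_pair(pos, omega), pe_pair(pos + k, omega)))  # True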

    Comparisons & Differences

    Sinusoidal Positional Encoding vs. Learned Positional Embeddings

    Sinusoidal encoding is deterministic and adds no parameters; learned embeddings are trained, which makes them more flexible but limits them to positions seen during training.

    Sinusoidal Positional Encoding vs. RoPE

    Sinusoidal encoding adds a position signal to the token embedding; RoPE instead rotates the query/key vectors, which captures relative positions better and extends to longer contexts with techniques like YaRN. A sketch of the idea follows.
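
    A minimal sketch of the RoPE idea for contrast (the function name is an assumption, not a library API): instead of adding a table to the embedding, each query/key vector is rotated pairwise by position-dependent angles:

        import numpy as np

        def rope_rotate(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
            """Rotate consecutive pairs of a query/key vector by position-dependent angles."""
            d = x.shape[-1]                                 # must be even
            theta = pos / base ** (np.arange(0, d, 2) / d)  # one angle per dimension pair
            x1, x2 = x[0::2], x[1::2]
            out = np.empty_like(x)
            out[0::2] = x1 * np.cos(theta) - x2 * np.sin(theta)
            out[1::2] = x1 * np.sin(theta) + x2 * np.cos(theta)
            return out

        q = rope_rotate(np.random.randn(64), pos=10)
        k = rope_rotate(np.random.randn(64), pos=4)
        # The positional effect on the score q @ k depends only on the offset 10 - 4.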
