
    Two-Tower Model

    Also known as:
    Dual Encoder
    Bi-Encoder
    Two-Tower Architecture
    Updated: 2/11/2026

    An architecture with two separate encoders (user tower, item tower) whose embeddings are efficiently matched via similarity search.

    Quick Summary

Two-tower models encode users and items in separate networks and match them via similarity search – the standard retrieval architecture for recommender systems with catalogs of billions of items.

    Explanation

Each tower independently maps its inputs to an embedding in a shared vector space: the user tower encodes user and context features, the item tower encodes item features, and relevance is scored by a dot product (or cosine similarity) between the two embeddings. At inference, all item embeddings are precomputed offline and searched efficiently via ANN (approximate nearest neighbor) retrieval, so only the user tower runs per request. This is what lets the architecture scale to billions of items.
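The flow above can be sketched in a few lines. This is a toy illustration, not a production implementation: each "tower" is collapsed to a single random linear projection (real towers are deep networks trained jointly), the catalog size and dimensions are made up, and exact dot-product search stands in for an ANN index.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: raw feature sizes and a shared embedding size.
USER_DIM, ITEM_DIM, EMB_DIM = 8, 12, 4

# Each "tower" is reduced to one linear projection for illustration.
W_user = rng.normal(size=(USER_DIM, EMB_DIM))
W_item = rng.normal(size=(ITEM_DIM, EMB_DIM))

def user_tower(u):            # encode user/context features
    return u @ W_user

def item_tower(items):        # encode item features
    return items @ W_item

# Offline: precompute all item embeddings once and index them.
catalog = rng.normal(size=(1000, ITEM_DIM))
item_emb = item_tower(catalog)            # shape (1000, EMB_DIM)

# Online: encode one user, then score against the precomputed index.
# Exact dot-product search here; production systems swap in ANN search.
u_emb = user_tower(rng.normal(size=USER_DIM))
scores = item_emb @ u_emb                 # one matmul, no per-item network pass
top_k = np.argsort(-scores)[:5]           # indices of the 5 best candidates
print(top_k)
```

The key property is visible in the last three lines: serving a request costs one user-tower pass plus a similarity search, independent of how the items were encoded.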

    Marketing Relevance

Two-tower is the standard architecture for candidate generation in large-scale recommender systems (YouTube, Google, Meta).

    Example

    Google Search uses two-tower for ad retrieval: user context and ad features are encoded separately, then matched via ANN.

    Common Pitfalls

Because the towers interact only through a final dot product, the model is less expressive than a cross-encoder with full cross-attention between user and item features. Training quality also hinges on the negative sampling strategy: random negatives are often too easy, so in-batch negatives and hard-negative mining are common.
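A standard negative-sampling setup is in-batch negatives: each user's positive item acts as a negative for every other user in the batch, turning training into a softmax over a B×B score matrix. A minimal sketch, with embedding shapes chosen for illustration and the tower outputs replaced by random vectors:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical batch of B (user, positive-item) embedding pairs,
# as they would come out of the two towers.
B, D = 4, 8
user_emb = rng.normal(size=(B, D))
pos_item_emb = rng.normal(size=(B, D))

# In-batch negatives: score every user against every positive item
# in the batch, giving a B x B logit matrix.
logits = user_emb @ pos_item_emb.T        # logits[i, j] = score(user i, item j)

# Softmax cross-entropy with the diagonal (each user's own item) as the label.
logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(B), np.arange(B)]).mean()
print(round(float(loss), 4))
```

One known caveat of this scheme: popular items appear as negatives more often than rare ones, which is why production systems often apply a sampling-bias correction to the logits.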

    Origin & History

YouTube's deep candidate-generation paper (Covington et al., 2016) popularized the architecture. Google published dual-encoder retrieval work in 2019, and Meta's DLRM and Google's TF-Ranking helped establish two-tower as an industry standard.

    Comparisons & Differences

    Two-Tower Model vs. Cross-Encoder

A cross-encoder feeds user and item features through one joint network, so it can model fine-grained interactions and is typically more accurate – but it needs a full forward pass per candidate, which is too slow for large catalogs. A two-tower model encodes each side separately, trading some accuracy for precomputation and fast similarity search.
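In practice the two are often combined: cheap two-tower retrieval narrows the catalog to K candidates, and the expensive cross-encoder reranks only that shortlist. A toy sketch of this retrieve-then-rerank pattern, where the cross-encoder is stubbed out as a simple scoring function and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical catalog of N precomputed item embeddings.
N, D, K = 10_000, 16, 50
item_emb = rng.normal(size=(N, D))

def cross_encoder_score(user_feats, item_idx):
    # Stand-in for an expensive joint forward pass; a real cross-encoder
    # attends over the concatenated user and item features.
    return float(item_emb[item_idx] @ user_feats)

# Stage 1 (two-tower): cheap dot-product retrieval of K candidates.
user_emb = rng.normal(size=D)
candidates = np.argsort(-(item_emb @ user_emb))[:K]

# Stage 2 (cross-encoder): expensive rerank of only K items, not all N.
reranked = sorted(candidates, key=lambda i: -cross_encoder_score(user_emb, i))
print(reranked[:5])
```

The design point is the asymmetry: the per-candidate cost of the cross-encoder is paid K times instead of N times, so accuracy is spent only where retrieval has already concentrated the likely hits.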
