
    Bandit-Based Recommendation

    Also known as:
    Contextual Bandit RecSys
    Exploration-Exploitation Recommendation
    Updated: 2/11/2026

    Recommendation systems using multi-armed bandits to balance exploration of new items with exploitation of known preferences.

    Quick Summary

    Bandit-based recommendations learn online and balance exploration of new items with exploitation of proven ones – ideal for fast feedback loops.

    Explanation

    Contextual bandits treat the user context as a feature vector and learn online which items perform best for which contexts. The model is updated after every observed interaction, so no batch retraining is needed – learning is continuous.
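The online loop described above can be sketched with a minimal epsilon-greedy contextual bandit. This is an illustrative implementation, not a production library: the class name, the per-arm ridge-regression estimate, and the epsilon value are all assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

class EpsilonGreedyContextualBandit:
    """Toy contextual bandit: one linear reward model per item (arm).

    Each arm keeps ridge-regression statistics (A, b) and estimates its
    weights as theta = A^-1 b. Updates happen online after every observed
    reward -- there is no batch retraining step.
    """

    def __init__(self, n_arms, n_features, epsilon=0.1):
        self.epsilon = epsilon
        self.n_arms = n_arms
        # Per-arm sufficient statistics for ridge regression.
        self.A = [np.eye(n_features) for _ in range(n_arms)]
        self.b = [np.zeros(n_features) for _ in range(n_arms)]

    def select(self, context):
        """Explore with probability epsilon, otherwise exploit the
        arm whose model predicts the highest reward for this context."""
        if rng.random() < self.epsilon:
            return int(rng.integers(self.n_arms))
        estimates = [np.linalg.solve(self.A[a], self.b[a]) @ context
                     for a in range(self.n_arms)]
        return int(np.argmax(estimates))

    def update(self, arm, context, reward):
        """Online update of the shown arm's statistics."""
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context
```

In a marketing setting, `context` might encode device, time of day, or segment, and `reward` a click or conversion; the `select`/`update` pair replaces the train-then-deploy cycle of a batch model.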

    Marketing Relevance

    Ideal for marketing personalization: website banners, email subject lines, product recommendations – anything with fast feedback loops.

    Example

    A news feed uses LinUCB to find the optimal mix of known and new articles for each user context.
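The scoring rule behind that example can be sketched as follows, based on the disjoint LinUCB model of Li et al. (2010): each article gets a predicted reward plus an uncertainty bonus, so a brand-new article can win on its bonus alone. Function names and the `alpha` value are illustrative assumptions.

```python
import numpy as np

def linucb_score(A, b, context, alpha=1.0):
    """UCB score of one arm under the disjoint LinUCB model:
    predicted reward plus an uncertainty bonus that shrinks as
    the arm accumulates observations in similar contexts."""
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b                        # ridge estimate of the arm's weights
    mean = theta @ context                   # predicted reward for this context
    bonus = alpha * np.sqrt(context @ A_inv @ context)  # exploration bonus
    return mean + bonus

def linucb_select(arms, context, alpha=1.0):
    """Show the arm with the highest upper confidence bound."""
    return int(np.argmax([linucb_score(A, b, context, alpha) for A, b in arms]))

def linucb_update(A, b, context, reward):
    """Online update of the arm that was shown (no batch retraining)."""
    A += np.outer(context, context)
    b += reward * context
```

A fresh arm starts with `A = I, b = 0`, so its score is the pure bonus `alpha * ||context||` – this is what pushes new articles into the mix until they have been tried.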

    Common Pitfalls

    Delayed rewards (e.g., conversions that arrive days after the impression) are hard to attribute to the right action. Reward signal design is crucial: a poorly chosen proxy (such as clicks standing in for purchases) will be optimized faithfully, whether or not it reflects the real goal.

    Origin & History

    Li et al. (2010) introduced LinUCB for personalized news recommendation. Yahoo and Microsoft were early adopters of bandits for ad selection. Contextual bandits have been a standard tool for online personalization since around 2020.

    Comparisons & Differences

    Bandit-Based Recommendation vs. A/B Testing

    A/B testing splits traffic statically between a few variants for a fixed period and only acts on the result afterwards; bandits shift traffic toward better-performing options continuously, across many candidates, while the experiment is still running.
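The difference can be made concrete with a small simulation, sketched here with Thompson sampling (a common bandit alternative to LinUCB) against a fixed 50/50 split. The click-through rates, round count, and seed are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
ctr = [0.05, 0.10]   # hypothetical click-through rates of two banner variants
n_rounds = 5000

# A/B test: fixed 50/50 split for the entire experiment.
ab_clicks = sum(rng.random() < ctr[t % 2] for t in range(n_rounds))

# Thompson sampling: keep a Beta posterior per arm, sample from each,
# and show the arm whose sample is highest -- traffic drifts toward the
# better variant as evidence accumulates.
wins = np.ones(2)     # Beta(1, 1) priors
losses = np.ones(2)
ts_clicks = 0
for _ in range(n_rounds):
    arm = int(np.argmax(rng.beta(wins, losses)))
    click = rng.random() < ctr[arm]
    wins[arm] += click
    losses[arm] += 1 - click
    ts_clicks += click
```

In a typical run the bandit sends most traffic to the 10% arm well before round 5000, so it collects more clicks than the static split – the "regret" of showing the weaker banner is paid mainly during the early exploration phase.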

