
    Model Extraction Attack

    Also known as:
    Model Stealing
    Model Theft
    API-Based Model Attack
    Model Cloning
    Updated: 2/11/2026

    An attack in which an adversary creates a functionally equivalent copy of an ML model through systematic API queries.

    Quick Summary

    Model extraction attacks copy ML models through systematic API queries, making them a growing intellectual-property risk for AI-as-a-Service providers.

    Explanation

    The attacker sends systematically crafted inputs to the target's prediction API and uses the returned outputs as labels to train a surrogate model. Attacks are score-based when the API returns confidence scores, which leak more information per query, or decision-based when it returns only the predicted label. Common countermeasures include rate limiting, output perturbation, and watermarking.
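
    To make the query/response loop concrete, here is a minimal sketch in Python with scikit-learn. The victim model is simulated locally so the example runs end to end; in a real attack it would sit behind a remote service, and query_victim_api is a hypothetical stand-in for that endpoint.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.neural_network import MLPClassifier

        rng = np.random.default_rng(0)

        # Stand-in for the proprietary model behind the API (an assumption
        # for this sketch; the attacker never sees it directly).
        X_secret = rng.normal(size=(1000, 10))
        y_secret = (X_secret[:, 0] + X_secret[:, 1] > 0).astype(int)
        victim = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)
        victim.fit(X_secret, y_secret)

        def query_victim_api(x):
            # Hypothetical endpoint; score-based because it returns probabilities.
            return victim.predict_proba(x)

        # 1. Craft query inputs (here: random probes over the input space).
        X_query = rng.normal(size=(5000, 10))

        # 2. Label them with the victim's own responses.
        y_query = query_victim_api(X_query).argmax(axis=1)

        # 3. Train a surrogate on the query/response pairs.
        surrogate = LogisticRegression().fit(X_query, y_query)

        # 4. Check functional agreement with the victim on fresh inputs.
        X_test = rng.normal(size=(2000, 10))
        print((surrogate.predict(X_test) == victim.predict(X_test)).mean())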

    Marketing Relevance

    For API-based AI products such as chatbots and classifiers, model extraction is an intellectual-property risk: competitors can replicate a model's behavior at a fraction of the original development cost.

    Example

    A competitor makes 100,000 API calls to your sentiment classifier and uses the responses to train a local model that agrees with yours on 95% of inputs, without ever collecting training data of their own.
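
    "Agreement" here means the fraction of inputs on which the stolen model's prediction matches the original's. A minimal way to measure it, assuming two hypothetical prediction functions that map a batch of texts to sentiment labels:

        def agreement_rate(victim_predict, surrogate_predict, texts):
            # Fraction of inputs where both models return the same label.
            victim_labels = victim_predict(texts)
            surrogate_labels = surrogate_predict(texts)
            matches = sum(v == s for v, s in zip(victim_labels, surrogate_labels))
            return matches / len(texts)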

    Common Pitfalls

    Complete protection is impossible for any publicly accessible API: every useful response leaks some information about the model. Rate limiting alone isn't enough, because queries can be spread across accounts and over time. Watermarking can be removed by fine-tuning the stolen model.
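
    Of the countermeasures above, output perturbation illustrates the underlying trade-off between utility and protection. A minimal sketch, assuming the service returns a class-probability vector per response; the noise scale is an illustrative choice, and the strictest variant returns only a hard label, which reduces any score-based attack to a decision-based one:

        import numpy as np

        rng = np.random.default_rng()

        def perturb_scores(probs, noise_scale=0.05):
            # Add small noise and renormalize: the top class usually stays
            # correct, but fine-grained score information is destroyed.
            probs = np.asarray(probs, dtype=float)
            noisy = probs + rng.normal(scale=noise_scale, size=probs.shape)
            noisy = np.clip(noisy, 1e-6, None)
            return noisy / noisy.sum()

        def label_only(probs):
            # Hard-label responses: the strongest form of output truncation.
            return int(np.argmax(probs))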

    Origin & History

    Tramèr et al. (2016) first demonstrated model extraction against the BigML and Amazon ML prediction APIs. Orekondy et al. (2019) scaled the idea to image classifiers with Knockoff Nets, and Krishna et al. (2020) showed that BERT-based models could be extracted. The topic has gained renewed attention with the rise of commercial LLM APIs.

    Comparisons & Differences

    Model Extraction Attack vs. Membership Inference

    Membership inference asks whether a specific record was part of a model's training data; model extraction aims to replicate the model's entire functionality.

