Membership Inference Attack
An attack that determines whether a specific data point was included in the training dataset of an ML model.
Because it can demonstrate that particular records were used for training, a successful attack poses a critical privacy risk, especially with regard to GDPR compliance.
Explanation
Models behave differently on data they were trained on: predictions are typically more confident and losses lower. An attacker exploits this by training "shadow models" on similar data, for which membership is known, and then training an attack classifier that distinguishes member outputs from non-member outputs. Applied to the target model's outputs, that classifier predicts membership, as in the sketch below.
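A minimal sketch of the shadow-model approach, assuming synthetic data and scikit-learn models. Shokri et al. use many shadow models and per-class attack models; a single shadow model and one attack classifier are enough to show the idea:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the target's data distribution (an assumption for this sketch).
X, y = make_classification(n_samples=6000, n_features=20, n_informative=10, random_state=0)

# The attacker holds shadow data drawn from a distribution similar to the target's.
X_target, X_shadow, y_target, y_shadow = train_test_split(X, y, test_size=0.5, random_state=0)

def fit_and_features(X_pool, y_pool):
    """Train a model on half the pool; return the model, confidence vectors, and member labels."""
    X_in, X_out, y_in, y_out = train_test_split(X_pool, y_pool, test_size=0.5, random_state=1)
    model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_in, y_in)
    feats = np.vstack([model.predict_proba(X_in), model.predict_proba(X_out)])
    labels = np.concatenate([np.ones(len(X_in)), np.zeros(len(X_out))])
    return model, feats, labels

# Shadow phase: the attacker knows membership for the shadow model,
# so confidence vectors can be labeled member (1) / non-member (0).
_, shadow_feats, shadow_labels = fit_and_features(X_shadow, y_shadow)
attack_clf = LogisticRegression(max_iter=1000).fit(shadow_feats, shadow_labels)

# Attack phase: query the target model and guess membership from its confidences.
target_model, target_feats, target_truth = fit_and_features(X_target, y_target)
print(f"attack accuracy (0.5 = random guessing): {attack_clf.score(target_feats, target_truth):.2f}")
```

The attack succeeds exactly to the extent that the target's confidence distribution differs between members and non-members.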
Marketing Relevance
Privacy risk: if an attacker can prove that a patient's record was in the training set, the model itself leaks personal data, which can amount to a GDPR violation. LLMs are also vulnerable to membership inference.
Example
An attacker queries a health AI model with specific patient profiles. Unusually high confidence scores reveal which patients were in the training set (see the sketch below).
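In code, this example reduces to a thresholded query. `target_model` and `X_target` are carried over from the sketch above, and the 0.95 cut-off is an arbitrary assumption; a real attacker would calibrate it on shadow-model outputs:

```python
def is_likely_member(model, x, threshold=0.95):
    """Flag x as a probable training member if the model's peak confidence exceeds the threshold."""
    # threshold=0.95 is a placeholder; calibrate it on shadow-model outputs in practice.
    return model.predict_proba(x.reshape(1, -1)).max() >= threshold

# Flags mostly training members, plus some non-members the model is simply confident about.
suspects = [x for x in X_target if is_likely_member(target_model, x)]
```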
Common Pitfalls
Membership inference is hard to prevent without losing accuracy. Differential privacy (e.g., training with DP-SGD) bounds the leakage, but at a utility cost. Overfitting increases vulnerability, so the train/test gap is a useful first diagnostic, as sketched below.
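Because overfitting drives the attack, measuring the generalization gap is a cheap first check before reaching for differential privacy. A hedged sketch, assuming a scikit-learn-style classifier; `loss_gap` is a name invented here, not a library function:

```python
from sklearn.metrics import log_loss

def loss_gap(model, X_train, y_train, X_test, y_test):
    """Average non-member loss minus average member loss; a larger gap means easier attacks."""
    train_loss = log_loss(y_train, model.predict_proba(X_train))  # members
    test_loss = log_loss(y_test, model.predict_proba(X_test))     # non-members
    return test_loss - train_loss
```

A gap near zero does not guarantee safety, but a large gap is a strong signal that per-example membership attacks will succeed.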
Origin & History
Shokri et al. (2017) formalized Membership Inference Attacks against ML models. Follow-up work showed vulnerabilities in LLMs, GANs, and diffusion models. Carlini et al. (2021) demonstrated training data extraction from GPT-2.
Comparisons & Differences
Membership Inference Attack vs. Model Extraction
Model Extraction aims to clone the model itself; Membership Inference only aims to learn whether specific data was used in training.
Membership Inference Attack vs. Data Poisoning
Data Poisoning actively manipulates the training data; Membership Inference is a passive attack that only extracts information.