Alignment
The problem of ensuring that AI systems pursue the intended goals and values of their developers and society.
AI Alignment is the challenge of getting AI systems to pursue human goals and values; it is often described as the fundamental problem of AI safety.
Explanation
The alignment problem has several facets: outer alignment (did we specify the right goals?), inner alignment (does the trained model actually pursue those goals?), and robustness to distributional shift (does the model behave as intended in new situations?). Reinforcement Learning from Human Feedback (RLHF) is currently the most widely used practical approach.
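A minimal sketch of the reward-modeling step at the heart of RLHF: a reward model is trained so that responses humans preferred score higher than rejected ones, using a pairwise (Bradley-Terry style) preference loss. All names, dimensions, and tensors below are illustrative placeholders, not any specific library's API.

```python
# Sketch of the pairwise preference loss used to train an RLHF reward model.
# Hypothetical toy setup: responses are represented by fixed-size embeddings.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Toy reward model: maps a response embedding to a scalar reward score."""
    def __init__(self, embedding_dim: int = 16):
        super().__init__()
        self.score = nn.Linear(embedding_dim, 1)

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.score(response_embedding).squeeze(-1)

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: -log sigmoid(r_chosen - r_rejected),
    # i.e. push the human-preferred response to score higher than the rejected one.
    return -torch.nn.functional.logsigmoid(reward_chosen - reward_rejected).mean()

model = RewardModel()
chosen = torch.randn(8, 16)    # embeddings of human-preferred responses (random stand-ins)
rejected = torch.randn(8, 16)  # embeddings of rejected responses (random stand-ins)
loss = preference_loss(model(chosen), model(rejected))
loss.backward()  # gradients nudge the model toward scoring preferred responses higher
```

In full RLHF, this reward model is then used as the optimization target for the language model itself, which is exactly where outer-alignment questions (is this reward the right goal?) re-enter.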
Marketing Relevance
Alignment also matters in marketing: does an AI assistant actually pursue the brand's goals? Does it optimize for customer value, or only for short-term metrics?
Example
A recommendation system "aligned" on engagement maximizes clicks and watch time, but ends up surfacing polarizing content. A better target is alignment on customer lifetime value and long-term satisfaction.
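A hedged sketch of that example: the same candidate items ranked by a pure engagement proxy versus a blended objective that also weights predicted long-term satisfaction. The fields, weights, and numbers are invented for illustration.

```python
# Hypothetical candidates: engagement and satisfaction scores are made up.
from dataclasses import dataclass

@dataclass
class Candidate:
    item_id: str
    predicted_engagement: float    # e.g. click/watch probability
    predicted_satisfaction: float  # e.g. retention- or survey-based long-term score

candidates = [
    Candidate("polarizing_clip", predicted_engagement=0.92, predicted_satisfaction=0.30),
    Candidate("helpful_tutorial", predicted_engagement=0.55, predicted_satisfaction=0.85),
    Candidate("product_update", predicted_engagement=0.40, predicted_satisfaction=0.70),
]

# "Aligned on engagement": the polarizing clip wins the ranking.
by_engagement = sorted(candidates, key=lambda c: c.predicted_engagement, reverse=True)

# Blended objective: engagement still counts, but long-term value carries more weight.
def blended_score(c: Candidate, satisfaction_weight: float = 0.7) -> float:
    return (1 - satisfaction_weight) * c.predicted_engagement + satisfaction_weight * c.predicted_satisfaction

by_value = sorted(candidates, key=blended_score, reverse=True)

print([c.item_id for c in by_engagement])  # ['polarizing_clip', 'helpful_tutorial', 'product_update']
print([c.item_id for c in by_value])       # ['helpful_tutorial', 'product_update', 'polarizing_clip']
```

The choice of objective, not the ranking algorithm, is what determines which content the system pushes.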
Common Pitfalls
Goodhart's Law: when a measure becomes a target, it ceases to be a good measure. Aligning a system on proxy metrics instead of real values invites gaming of those proxies.
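A toy illustration of Goodhart's Law with made-up numbers: once clicks become the optimization target, the tactic that inflates the proxy wins, even though it creates the least real customer value.

```python
# Hypothetical tactics: (clicks generated, true customer value created).
tactics = {
    "honest_recommendation": (100, 100),
    "mild_clickbait":        (160,  60),
    "aggressive_clickbait":  (220,  10),
}

best_by_proxy = max(tactics, key=lambda t: tactics[t][0])
best_by_value = max(tactics, key=lambda t: tactics[t][1])

print(best_by_proxy)  # 'aggressive_clickbait' -- the proxy metric is gamed
print(best_by_value)  # 'honest_recommendation'
```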
Origin & History
Alignment research was popularized by Stuart Russell's work and Nick Bostrom's "Superintelligence" (2014). OpenAI's founding mission emphasizes alignment. RLHF (from 2017 onward) became the first widely adopted practical approach.
Comparisons & Differences
Alignment vs. AI Safety
AI Safety is the overall field; Alignment is the specific problem of getting AI to do what we want.
Alignment vs. AI Ethics
AI Ethics asks "what should we want?"; Alignment asks "how do we get AI to do it?".