AI Glossary
Clear explanations of the most important terms in AI, marketing, and technology.
A
A* Search
A* (pronounced "A-star") is a classical search algorithm that finds the shortest path between a start and a goal node in a graph by always expanding the node with the lowest estimated total cost f(n) = g(n) + h(n), where g(n) is the actual path cost so far and h(n) is a heuristic estimate of the remaining cost.
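A minimal sketch of the f(n) = g(n) + h(n) idea on a small grid. The grid layout, the 4-connected unit-cost moves, and the Manhattan-distance heuristic are assumptions chosen for illustration, not part of the definition.

```python
import heapq

def a_star(grid, start, goal):
    """Return the length of a shortest 4-connected path, or None if unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: never overestimates for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_g = {start: 0}
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    while frontier:
        f, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best_g.get(cell, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float("inf")):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
    return None

# 0 = free cell, 1 = wall (hypothetical example map)
GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
```

Calling `a_star(GRID, (0, 0), (2, 0))` has to route around the wall in the middle row, so the path is longer than the straight-line distance suggests.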
A/B Testing
An experiment comparing two variants (A and B) to determine which performs better.
A2A (Agent-to-Agent Protocol)
A2A (Agent-to-Agent) is an open standard initiated by Google for direct communication between autonomous AI agents — regardless of which framework (LangChain, OpenAI, Claude, AutoGen) they were built with.
A2A Commerce
Commerce model in which AI agents conduct purchases, comparisons, and negotiations with other agents on behalf of users or businesses.
A2A Protocol (Agent-to-Agent)
Google's open standard for communication between AI agents from different providers – enables interoperability in multi-agent systems.
Abductive Logic Programming (ALP)
A framework in logic programming that allows certain premises to be left unspecified and then infers plausible explanations for observations.
Abductive Reasoning
A form of logical inference that starts from an observation and seeks the simplest and most likely explanation for it.
Ablation
In AI research, an ablation refers to the removal or disabling of a component of a system to assess that component's impact on the overall performance.
Abstract Data Type
A conceptual model of a data structure defined by its behavior (operations and properties) rather than a specific implementation.
Abstraction
The process of simplifying complexity by focusing on high-level concepts and hiding lower-level details.
Accelerating Change
The perceived increase in the rate of technological innovation and societal progress over time.
Accountability
The obligation to take responsibility for AI decisions and be able to explain their impacts.
Accuracy
A metric in machine learning that measures the proportion of correct predictions made by a model out of all predictions made.
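As a small illustration, accuracy can be computed by counting matching labels. The function name and the list-based inputs are assumptions for this sketch.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

Note that on imbalanced data a high accuracy can hide poor performance on the rare class.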
Action Language
A formal language used to describe state changes in a system – how actions affect the state of the world over time.
Action Model Learning
A machine learning approach focused on enabling an AI agent to learn the outcomes and requirements of its actions within an environment.
Action Schema
Action Schema is an extension of the schema.org vocabulary (PotentialAction, Schema.Action) that lets websites machine-readably declare which actions (buy, book, reserve, subscribe, contact) a user or agent can perform on the page.
Action Selection
The process by which an intelligent agent decides "what to do next," choosing the next action from a set of possible actions.
Actionable Intelligence
Information that can be directly acted upon to make decisions or improvements, often derived from data analysis or AI insights.
Activation Function
A mathematical function used in artificial neural networks to determine the output of a node (neuron) given an input or set of inputs.
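A minimal sketch of two common activation functions applied to a single neuron. ReLU and sigmoid are picked as examples, and the `neuron` helper is a hypothetical name used only here.

```python
import math

def relu(x):
    # Passes positive values through and clips negatives to zero.
    return max(0.0, x)

def sigmoid(x):
    # Squashes any real input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias, activation):
    # Weighted sum of the inputs plus bias, passed through the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)
```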
Active Learning
ML strategy where the model selects the most informative samples for labeling.
Actor-Critic
RL architecture with two components: an actor (policy) selects actions, a critic (value function) evaluates them – combines strengths of policy gradient and value-based methods.
Ad Exchange
An Ad Exchange is a digital marketplace connecting publishers (supply) and advertisers (demand), trading ad inventory through real-time auctions.
Ad Rank
Google's score determining an ad's position and visibility in search results.
Adafactor
Memory-efficient optimizer that replaces Adam's second moment with a factorized approximation – saves up to 50% optimizer memory.
AdaGrad
Optimizer that adaptively adjusts the learning rate per parameter – frequently updated parameters get smaller rates, rare ones get larger.
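The per-parameter scaling can be sketched in plain Python as follows. The default learning rate and epsilon are illustrative assumptions, and the list-based parameters stand in for real tensors.

```python
import math

def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One AdaGrad update; accum holds the running sum of squared gradients."""
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g  # accumulate squared gradient for this parameter
        new_params.append(p - lr * g / (math.sqrt(a) + eps))
        new_accum.append(a)
    return new_params, new_accum
```

A parameter that has already received large gradients has a larger accumulator, so the same gradient moves it less than a parameter that has rarely been updated.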
Adam Optimizer
Adaptive optimization algorithm with momentum and adaptive learning rates.
AdamW
Corrected variant of the Adam optimizer that decouples weight decay from the gradient update – the de facto standard for LLM and transformer training.
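A scalar sketch of what "decoupled" means here: the weight decay shrinks the parameter directly instead of being added to the gradient. Hyperparameter defaults are illustrative assumptions, and real implementations operate on tensors rather than single floats.

```python
import math

def adamw_step(p, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter p with gradient g at step t >= 1."""
    m = beta1 * m + (1 - beta1) * g        # first moment (momentum)
    v = beta2 * v + (1 - beta2) * g * g    # second moment (squared gradients)
    m_hat = m / (1 - beta1 ** t)           # bias correction
    v_hat = v / (1 - beta2 ** t)
    p = p - lr * weight_decay * p          # decoupled decay, independent of g
    p = p - lr * m_hat / (math.sqrt(v_hat) + eps)  # Adam step
    return p, m, v
```

With a zero gradient the parameter still shrinks slightly, which shows that the decay does not pass through the adaptive gradient scaling.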
Adaptive Algorithm
An algorithm that changes its behavior or parameters in response to the problem instance or environment as it runs, aiming to improve performance on the fly.
Adaptive Learning
An educational methodology (often implemented with AI) that customizes learning content and pace to the individual needs and performance of each learner.
Adaptive Neuro-Fuzzy Inference System
A hybrid system that combines neural networks and fuzzy logic principles to create a model capable of learning from data while employing human-like reasoning.
Admissible Heuristic
A heuristic h(n) is called admissible if it never overestimates the true remaining cost from node n to the goal — i.e. it always provides an optimistic lower bound. This property guarantees that search algorithms like A* find an optimal path.
Adversarial Attacks
Targeted input manipulations that cause AI systems to misclassify or behave incorrectly.
Adversarial Robustness
The ability of an ML model to maintain correct predictions even when inputs are deliberately manipulated.
AEO (Answer Engine Optimization)
Answer Engine Optimization (AEO) is the discipline of structuring content and brands so they get chosen as citation or answer sources by AI-driven answer engines (ChatGPT Search, Perplexity, Google AI Overviews, Claude).
Agent Architecture
The underlying structure and components of an intelligent agent system, describing how the agent is organized internally to sense, think, and act.
Agent Handoff
The process where an AI agent passes a task to another specialized agent or to a human.
Agent Loop
The iterative cycle of an AI agent: Observe → Think → Act → Evaluate result → Repeat until goal is reached.
Agent Memory
Systems for storing information that AI agents can use beyond the context window – from short-term scratchpads to persistent knowledge stores.
Agent Orchestration
Coordination and control of multiple AI agents to execute complex workflows, including task distribution, communication, and error handling.
Agent Swarms
A system of multiple specialized AI agents that work together autonomously, distribute tasks among themselves, and achieve complex goals in a coordinated manner – inspired by swarm behavior in nature.
Agent-to-Agent (A2A)
Direct communication between autonomous AI agents without human mediation – e.g., for negotiation, booking, or data exchange.
Agent-to-Agent Protocol (A2A)
An open protocol developed by Google that standardizes communication and collaboration between different AI agents.
AgentBench
A benchmark for evaluating LLM agents in 8 different interactive environments like websites, databases, games, and operating systems.
Agentic AI
AI systems that can autonomously pursue goals, make decisions, use tools, and execute multi-step tasks without continuous human guidance.
Agentic Coding
A paradigm where AI agents autonomously write, test, debug, and iterate on code – with minimal human intervention.
Agentic Commerce
Agentic commerce describes a new form of commerce in which autonomous AI agents act on behalf of consumers or businesses to anticipate needs, compare options, negotiate, and execute transactions — without a human approving every single step.
Agentic Engine Optimization (AEO)
Optimization of brands, products, and APIs for selection by autonomous AI agents in agentic workflows.
Agentic Marketing
Agentic marketing is the practice of letting autonomous AI agents plan, execute and optimize marketing campaigns based on goals — instead of executing predefined workflows or templates.
Agentic RAG
Agentic RAG is an evolution of retrieval-augmented generation in which an AI agent dynamically decides when, which and how many sources to query — instead of following a rigid retrieval pipeline with fixed top-k vector search.
Agentic Security
Multi-agent systems that autonomously detect, triage, and neutralize threats – beyond classic SOC automation.
AI Abundance Economy
Economic model in which AI drives the production cost of knowledge, software, and content toward zero, with scarcity primarily in energy, compute, and attention.
AI Accelerator
Specialized hardware designed specifically to speed up artificial intelligence tasks, particularly the heavy mathematical computations in machine learning.
AI Act Compliance
Operational implementation of EU AI Act requirements in organizations – from risk classification to logging obligations.
AI Agent
An autonomous software system that uses AI to independently plan tasks, use tools, and execute multiple steps toward a goal without human intervention.
AI Agents
Autonomous AI systems that independently pursue goals, create plans, use tools, and interact with their environment – beyond simple prompt-response.
AI Agents for Search
Autonomous AI systems that conduct complex research – searching multiple sources, synthesizing, drawing conclusions.
AI Agents Frameworks
Software frameworks and libraries that simplify the development of autonomous AI agents by providing pre-built components for planning, tool use, memory, and orchestration.
AI Alignment
The research field and practice of developing AI systems that understand and reliably pursue human values, intentions, and goals.
AI Art
Visual art created wholly or partially by AI systems – from prompt-based image generation to interactive installations.
AI Audit
The independent examination of AI systems for fairness, bias, security, compliance, and performance by external or internal auditors.
AI Avatars
Computer-generated, photorealistic digital humans animated by AI that can present any content.
AI Code Review
AI-powered automatic review of code changes for bugs, security vulnerabilities, best practices, and style.
AI Coding
Use of AI systems to support, accelerate, and automate software development – from code completion to full-stack generation.
AI Coding Assistants
AI-powered tools that assist developers with programming – from autocomplete to code generation to complete feature implementations.
AI Copyright
The legal question of who owns copyrights to AI-generated content and how training data usage should be legally classified.
AI Debugging
The use of AI to automatically identify, analyze, and fix software bugs.
AI Developer Tools
The ecosystem of AI-powered tools that support and accelerate software development at all levels.
AI Discovery
AI systems that proactively recommend relevant content, products, or information without an explicit search query.
AI Ethics
The interdisciplinary field examining moral principles, values, and guidelines for the development and deployment of AI systems and their impact on society and individuals.
AI Gateway
Middleware layer between applications and AI model APIs for routing, monitoring, rate limiting, and caching.
AI Governance
The framework of policies, processes, and responsibilities for the responsible development, deployment, and use of AI systems in organizations.
AI Governance Board
Cross-functional corporate body steering AI strategy, risk decisions, use case approvals, and compliance.
AI Influencer
Fully AI-generated personality with consistent appearance and dedicated social media presence for brand collaborations.
AI Liability
The legal and organizational responsibility for damages caused by AI systems or autonomous agents, including the question of who is liable: developer, operator, or user.
AI Music Generation
AI music generation creates musical pieces from text prompts, melodies, or style specifications – from background music to complete songs.
AI Observability
The practice of real-time monitoring, evaluation, and debugging of AI systems in production – from classical ML models to LLM applications and autonomous agents.
AI Orchestration
The coordinated control and integration of multiple AI models, agents, and tools to execute complex, multi-step tasks in an automated workflow.
AI Overviews (Google)
AI Overviews are AI-generated answer blocks that Google has been displaying at the top of search results since 2024 — powered by Gemini models that summarize multiple web sources and link to citations.
AI Pair Programming
Programming approach where an AI acts as a "partner" that continuously thinks along, suggests, and reviews code.
AI Personalization
Using AI to adapt marketing content, products, and experiences to individual users in real-time.
AI Red Teaming
Systematic testing of AI systems by an attacker team to identify weaknesses, bias, and misuse potential.
AI Regulation
The entirety of legal regulations and guidelines governing the development, deployment, and impact of AI systems.
AI Risk Management
The systematic identification, assessment, and management of risks that can arise from AI systems.
AI Safety
The research field focused on making AI systems safe, controllable, and aligned with human values.
AI Search
Search engines that use LLMs to understand queries and deliver direct answers instead of link lists.
AI Search Optimization (AIO)
Strategy for maximizing brand visibility across all AI search surfaces – from answer engines to agentic browsers.
AI Shopping Agent
An AI Shopping Agent is an autonomous AI system that researches, compares, negotiates and purchases products on behalf of a consumer — from simple recommendations (Perplexity Shopping) to fully automated procurement with AP2 mandates (ChatGPT Operator, Claude Computer Use).
AI Slop
Pejorative term for low-quality, mass-produced AI-generated content flooding the internet that provides no real value.
AI Targeting
Using AI to identify and reach the right audience for advertising – based on behavioral and prediction models.
AI Transparency
The disclosure of how AI systems work, were trained, and make decisions, as well as labeling AI-generated content.
AI Watermarking
Techniques for embedding invisible markers in AI-generated content to prove its origin and enable detection of deepfakes.
AI-Complete
A problem is termed AI-complete if solving it by machine would essentially require general human-level intelligence.
AI-Developed Zero-Day
Previously unknown software vulnerability that an AI system independently identified and/or weaponized.
AI-Powered CDP
Customer Data Platforms with integrated AI/ML capabilities for automated segmentation, predictions, and activation.
Aider Polyglot Benchmark
Coding benchmark testing LLMs on real-world multi-file edits across multiple programming languages.
Algorithmic Discrimination
Algorithmic discrimination refers to the systematic disadvantaging of certain groups by algorithmic decision systems, often as a result of biased training data or flawed design choices in the model.
Algorithmic Efficiency
Algorithmic efficiency measures how economically an algorithm uses computation time, memory, and energy – typically expressed in Big-O notation for scaling behavior.
Algorithmic Impact Assessment
Systematic evaluation of the potential impacts of an algorithmic system on individuals, groups, and society before and during deployment.
Algorithmic Probability
A theoretical measure that assigns a probability to an observation by considering all possible algorithms that could produce it, weighted by their simplicity.
ALiBi (Attention with Linear Biases)
A method for position encoding that adds linear biases directly to attention scores instead of learning position embeddings.
Alignment
The problem of ensuring that AI systems pursue the intended goals and values of their developers and society.
Alignment Tax
The performance loss caused by alignment and safety training – a model becomes safer but potentially less capable.
Alpha-Beta Pruning
An optimization technique for the minimax algorithm that prunes parts of the game tree without affecting the result.
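A compact sketch of minimax with alpha-beta pruning. Representing the game tree as nested lists with numeric leaves is an assumption made to keep the example self-contained.

```python
def alphabeta(node, maximizing=True, alpha=float("-inf"), beta=float("inf")):
    """node is either a numeric leaf value or a list of child nodes."""
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing parent will never choose this branch
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return value

# Hypothetical two-level tree: the root maximizes, its children minimize.
TREE = [[3, 5], [2, 9], [0, 1]]
```

In this tree the leaves 9 and 1 are never evaluated, yet the result is the same as plain minimax would return.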
Amazon Rufus
Amazon's AI shopping assistant that answers product questions, makes comparisons, and provides recommendations directly in the Amazon app.
Amazon SageMaker Pipelines
AWS managed service for CI/CD-capable ML pipelines with integrated experiment tracking, model registry, and deployment automation.
Analytics
The systematic analysis of data to gain insights and support decision-making.
Anchor Box
Predefined bounding boxes of various sizes and aspect ratios that serve as starting points for object detection.
Anomaly Detection
Identification of unusual patterns or outliers in data.
Answer Engine Optimization (AEO)
Optimizing content to be cited in AI-generated answers – the evolution of SEO for AI search engines.
Ant Colony Optimization
A probabilistic optimization technique inspired by the behavior of ants foraging for food and their use of pheromone trails.
Anthropic
An AI safety company founded by former OpenAI researchers, known for Claude – one of the most advanced LLMs focused on safety and honesty.
Anytime Algorithm
An anytime algorithm is an algorithm that can return a valid — though not yet optimal — solution at any intermediate stage and monotonically improves solution quality with additional compute time.
AP2 (Agent Payments Protocol)
The Agent Payments Protocol (AP2) is an open standard initiated in 2025 by Google together with 60+ partners (Mastercard, PayPal, American Express, Coinbase and others) that lets AI agents securely and verifiably trigger payments on behalf of users or businesses.
Apache Airflow
Open-source platform for orchestrating complex data and ML workflows as DAGs (Directed Acyclic Graphs).
API (Application Programming Interface)
An interface that allows software applications to communicate with each other and exchange data.
API Rate Limiting
Mechanisms that limit the number of API requests per time unit – critical for AI API costs and system stability.
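One common way to implement such a limit is a token bucket. The sketch below is an illustration under assumptions: the class name is hypothetical, and the current time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allows bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0

    def allow(self, now):
        # Refill according to the time elapsed since the previous call.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token for this request
            return True
        return False
```

In a real service, `now` would typically come from a monotonic clock, and one bucket would be kept per API key or client.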
Approximation Error
The difference between an exact, true value and an approximate value that is used or obtained by an algorithm or model.
ARC (AI2 Reasoning Challenge)
A multiple-choice benchmark with natural science questions at elementary and middle-school level in Easy and Challenge sets.
ARC-AGI-2
Benchmark by the ARC Prize Foundation that measures general reasoning ability of AI systems via abstract pattern tasks.
ARIMA (AutoRegressive Integrated Moving Average)
A classic statistical model for time series forecasting that combines autoregression, differencing, and moving averages.
Arize AI
An AI observability platform that runs over 50 million evaluations per month and serves over 1 trillion inferences. Arize helps monitor, evaluate, and optimize ML models and generative AI applications.
ARPU (Average Revenue Per User)
The average revenue per user over a specific time period.
Array
An array is a contiguous data structure storing elements of the same type (in many languages) accessed by index.
Artificial General Intelligence (AGI)
A hypothetical form of AI that possesses human-like cognitive abilities across all domains and can learn and adapt autonomously.
Artificial Neural Network (ANN)
An Artificial Neural Network (ANN) is a computational model inspired by the biological brain, consisting of layers of connected neurons that can learn to extract complex patterns from data by adjusting weights.
Assessment
Assessment is the measurement of knowledge, skill, or performance—used to diagnose current ability, provide feedback, and certify learning outcomes.
Attention Mechanism
A neural network mechanism that allows models to dynamically "focus" on relevant parts of the input – the key innovation behind modern LLMs.
Attention Pooling
Attention pooling aggregates a sequence of vectors into a single representation vector using learned attention weights that give more importance to the most relevant elements.
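A minimal sketch of the weighting step, assuming dot-product scoring against a single query vector. In practice the query would be a learned parameter; here it is simply passed in.

```python
import math

def attention_pool(vectors, query):
    """Return (pooled_vector, weights) for a list of equal-length vectors."""
    # Score each vector by its dot product with the query.
    scores = [sum(q * x for q, x in zip(query, v)) for v in vectors]
    # Softmax over the scores, shifted by the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the vectors, dimension by dimension.
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights
```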
Attention Sink
A phenomenon in LLMs where the first token (BOS) receives disproportionately high attention, even when semantically irrelevant.
Attribution
Assigning credit to marketing touchpoints that contributed to a conversion—determining which channels or campaigns are effective.
Attributional Calculus
A logical framework combining predicate logic with multi-valued (fuzzy) logic to represent attributes of entities in an intuitive, human-readable way.
AUC (Area Under the Curve)
The area under the ROC curve: a single number between 0 and 1 summarizing the overall ranking quality of a binary classifier, where 0.5 corresponds to random guessing and 1 to perfect separation.
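AUC can also be read as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one. The sketch below computes it that way by comparing all pairs, which is fine for small inputs but quadratic in cost.

```python
def auc(labels, scores):
    """Pairwise AUC for binary labels (1 = positive, 0 = negative)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```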
Audience
The group of people a company wants to reach with its marketing messages.
Audio Deepfake
AI-generated audio recordings that convincingly imitate a real person and can be used for fraud, misinformation, or manipulation.
Audio Generation
The creation of audio content through AI models – from music to sound effects to speech and ambient sounds.
Audio Language Models
AI models that can directly understand and generate audio – from speech recognition to music analysis to natural speech generation with emotions and intonation.
Audit Logging
Audit logging records security-relevant events (access, policy decisions, admin changes, tool actions) in an immutable or tamper-evident way.
Authorization
Authorization determines what an authenticated identity is allowed to do (permissions), such as reading specific data or executing specific actions.
Auto-Complete
Auto-complete is a feature that, during text entry, automatically offers matching completion suggestions — based on dictionaries, search history, statistical language models, or, since 2023, generative LLMs.
Autoencoder
A type of neural network designed to learn a compressed representation (encoding) of input data and then reconstruct the original data from this encoding.
AutoGPT
An experimental open-source project that lets GPT-4 autonomously pursue goals – pioneer of the agentic AI movement.
Automata Theory
The branch of computer science and mathematics that deals with abstract machines (automata) and the computational problems they can solve.
Automated Machine Learning
The automation of the end-to-end workflow of applying machine learning to real-world problems, including data preprocessing, model selection, and hyperparameter tuning.
Automated Planning
Automated planning is the AI subfield concerned with algorithms that, given an initial state, a goal state, and a set of possible actions, automatically find a sequence of actions (a plan) that achieves the goal.
AutoML (Automated Machine Learning)
AutoML automates parts of the machine learning lifecycle such as model selection, feature preprocessing, hyperparameter tuning, and validation.
Autonomous Agent
An AI agent that pursues goals, makes decisions, and executes actions without human intervention – the highest autonomy level.
Autonomous Driving
The use of AI systems for full or partial control of vehicles without human intervention, classified into SAE Levels 0 to 5.
Autoregressive Model
An autoregressive model generates sequences token by token, where each new token depends on all previous ones. This is the generation approach behind GPT, LLaMA, and most modern LLMs.
Awareness
The first phase in the marketing funnel where potential customers become aware of a brand or product.
B
B2B Marketing
Marketing of products or services to other businesses rather than end consumers.
Backpropagation
An algorithm for computing gradients in neural networks that propagates errors backwards through the network to adjust weights.
Backtesting
Validation of a forecasting model on historical data to estimate out-of-sample performance.
Backtracking
An algorithmic technique that builds candidate solutions step by step and returns to the last decision point as soon as a partial solution hits a dead end.
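The N-Queens puzzle is a classic illustration. It is chosen here as an example and is not part of the definition: place one queen per row, and undo a placement as soon as it cannot lead to a solution.

```python
def count_n_queens(n):
    """Count the ways to place n non-attacking queens on an n x n board."""
    cols, diag1, diag2 = set(), set(), set()

    def place(row):
        if row == n:
            return 1  # all rows filled: one complete solution
        total = 0
        for c in range(n):
            if c in cols or (row - c) in diag1 or (row + c) in diag2:
                continue  # square is attacked: skip this choice
            cols.add(c); diag1.add(row - c); diag2.add(row + c)
            total += place(row + 1)
            # Backtrack: undo the choice before trying the next column.
            cols.remove(c); diag1.remove(row - c); diag2.remove(row + c)
        return total

    return place(0)
```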
Backward Chaining
An inference strategy that starts from the goal and works backward to find the facts and rules that would prove the goal.
Bag of Words (BoW)
A simple text representation that treats a text as an unordered collection of words with their frequencies, ignoring grammar and word order.
Bagging
An ensemble learning method that trains multiple models on bootstrap samples and aggregates their predictions.
Bandit-Based Recommendation
Recommendation systems using multi-armed bandits to balance exploration of new items with exploitation of known preferences.
Batch Normalization
A normalization technique that normalizes activations in neural networks across mini-batches – stabilizing training and enabling higher learning rates.
Batch Processing
Processing large amounts of data in collected blocks rather than in real time.
Batch Size
Number of training examples per gradient update.
Bayesian Optimization
Bayesian optimization is an approach to optimizing expensive black-box functions (e.g., model hyperparameters) using a probabilistic surrogate model and an acquisition function.
Beam Search
Beam search is a heuristic search algorithm that, at every search step, keeps only the k best partial solutions ("beam width") — a compromise between exhaustive breadth-first search (high quality, high cost) and greedy search (low quality, low cost).
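A small sketch of the idea, using a hypothetical lookup table of next-token probabilities in place of a real model. The names `TABLE` and `step_fn` are assumptions for illustration.

```python
import math

def beam_search(step_fn, start, beam_width, steps):
    """Keep the beam_width highest-scoring partial sequences at every step."""
    beams = [(0.0, [start])]  # (cumulative log-probability, sequence)
    for _ in range(steps):
        candidates = []
        for score, seq in beams:
            for token, prob in step_fn(seq).items():
                candidates.append((score + math.log(prob), seq + [token]))
        # Prune: only the best beam_width candidates survive.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    return beams[0][1]

# Toy "model": the next-token distribution depends only on the last token.
TABLE = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"x": 0.9, "y": 0.1},
}

def step_fn(seq):
    return TABLE[seq[-1]]
```

With `beam_width=1` the search is greedy and commits to "a" early, while `beam_width=2` keeps "b" alive long enough to find the more probable sequence overall.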
Behavioral AI
AI systems that analyze user behavior, recognize patterns, and predict future actions.
Bellman-Ford Algorithm
The Bellman–Ford algorithm computes shortest paths from a single source in a weighted graph and can handle negative edge weights (and detect negative cycles).
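A minimal sketch, assuming nodes are numbered from 0 and edges are given as (source, target, weight) tuples:

```python
def bellman_ford(num_nodes, edges, source):
    """Return shortest distances from source; raise on a reachable negative cycle."""
    dist = [float("inf")] * num_nodes
    dist[source] = 0
    # Relax every edge up to num_nodes - 1 times.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # distances have converged early
    # If any edge can still be relaxed, a negative cycle is reachable.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist
```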
Benchmark
A reference point or standard against which performance is measured and compared.
BentoML
Open-source framework for packaging, deploying, and scaling ML models as production-ready APIs.
BERT
BERT (Bidirectional Encoder Representations from Transformers) is a language model developed by Google that processes text bidirectionally, enabling deep contextual understanding.
BERT (Google)
Google's Transformer model for bidirectional language understanding.
BERTScore
A semantic evaluation metric that uses BERT embeddings to measure similarity between generated and reference text.
BGE Embedding
BGE (BAAI General Embedding) is a family of open-source embedding models from Beijing Academy of AI that achieve top results on MTEB.
Bi-Encoder
An encoder architecture that transforms query and document independently into embeddings – enabling fast similarity search over pre-computed vectors.
Bias (AI)
Systematic distortions in AI systems leading to unfair or discriminatory outcomes for certain groups of people, often caused by imbalanced training data or flawed assumptions.
Bias-Variance Tradeoff
Fundamental tradeoff: simple models have high bias (underfitting), complex ones high variance (overfitting).
Bid Management
The optimization of bids in real-time auctions for digital advertising.
Bidding
Bidding is setting offers for ad inventory in auction-based advertising systems to influence delivery, cost, and outcomes.
BIG-Bench
A collaborative benchmark with 200+ tasks created by 400+ researchers to test LLM capabilities beyond existing benchmarks.
Big-O Notation
Big-O notation describes how an algorithm's time or space requirements grow with input size, expressing an upper bound on asymptotic behavior (e.g., O(log n), O(n), O(n²)).
Binary Search
Binary search finds a target value in a sorted list by repeatedly halving the search range.
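A short iterative sketch. Returning -1 for a missing target is a convention assumed here.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1
```

Because the range halves on every iteration, the number of comparisons grows logarithmically with the list length.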
Bing Copilot
Microsoft's AI-powered search engine combining GPT-4 with Bing search – integrated into Windows, Edge, Office.
BLEU Score
Metric for automatic evaluation of translation quality.
BM25 Ranking
BM25 is a classic lexical ranking function used in information retrieval that scores documents based on term frequency, inverse document frequency, and length normalization.
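A simplified sketch of how the three ingredients combine. Whitespace tokenization, the defaults k1=1.5 and b=0.75, and the non-negative IDF variant used here are assumptions, and production search engines differ in their details.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in a non-empty list against the query."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical mini corpus for illustration.
DOCS = ["the cat sat on the mat", "dogs chase the ball", "the cat chased the cat"]
```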
Boosting
An ensemble learning method that sequentially combines weak learners to create a strong classifier.
Bootstrapping
Statistical resampling method that repeatedly draws samples with replacement from the dataset.
Bounce Rate
The percentage of visitors who leave a website without visiting another page.
BPE (Byte Pair Encoding)
Subword tokenization algorithm that iteratively merges the most frequent adjacent symbol pairs to build a compact vocabulary of subword units.
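One merge step can be sketched as follows. The word-frequency table is a made-up toy corpus, and a full tokenizer would repeat this step until the target vocabulary size is reached.

```python
from collections import Counter

def most_frequent_pair(words):
    """words maps a tuple of symbols to its corpus frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Toy corpus: each word is split into characters, with its frequency.
WORDS = {
    ("h", "u", "g"): 10,
    ("p", "u", "g"): 5,
    ("p", "u", "n"): 12,
    ("b", "u", "n"): 4,
    ("h", "u", "g", "s"): 5,
}
```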
Brand Awareness
The extent to which consumers can recognize and recall a brand.
Brand Lift
The measurable improvement in brand metrics (awareness, consideration, preference) from advertising.
Breadth-First Search (BFS)
A graph traversal algorithm that explores a graph level by level, visiting all neighbor nodes at the current depth before moving to the next depth level.
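A minimal sketch using a queue, assuming the graph is given as a dictionary that maps each node to a list of its neighbors:

```python
from collections import deque

def bfs_order(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()  # FIFO order gives level-by-level traversal
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order
```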
Brier Score
A metric measuring the quality of probabilistic predictions: the mean squared error between predicted probabilities and actual outcomes (0 = perfect, lower is better).
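For binary outcomes coded as 0 or 1, the computation can be sketched in a few lines. The function name is an assumption.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)
```

A forecaster who always predicts 0.5 scores 0.25, which is a useful baseline when reading Brier scores.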
Browser Agent
Autonomous AI agent that operates a web browser to perform tasks like research, bookings, or purchases on behalf of the user.
Business Continuity
Business continuity is the capability to keep critical business functions running during and after disruptions (technical failures, security incidents, disasters).
Business Intelligence
Business Intelligence (BI) is the practice and tooling for transforming data into dashboards, reports, and analyses that support business decision-making.
Buyer Persona
A semi-fictional representation of the ideal customer based on market research and real customer data.
C
C2PA Content Credentials
Open standard for marking the provenance and editing history of digital media, developed by the Coalition for Content Provenance and Authenticity.
CAC (Customer Acquisition Cost)
The average cost to acquire a new customer, including marketing and sales expenses.
Calibration
The process of adjusting a model's predicted probabilities so they reflect actual event probabilities.
Canary Deployment
Deployment strategy where a new version is gradually rolled out to a small percentage of traffic before full deployment.
Canonical URL
A canonical URL is the preferred "official" URL for a piece of content when multiple URLs could show similar or identical content.
Canonicalization
Canonicalization is choosing a single "canonical" representation among multiple equivalent or similar variants (data records or URLs).
Capacity Planning
Capacity planning ensures systems have sufficient resources (compute, storage, network, quotas) to meet demand while maintaining SLOs and controlling cost.
Causal Inference
Causal inference is the discipline of estimating cause-and-effect relationships (what would happen if we changed X), not just correlations.
Causal Masking
Causal masking prevents tokens from attending to future positions – the technique enabling autoregressive generation in decoders like GPT.
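A plain-Python sketch of the masking step applied to a square matrix of attention scores. Real implementations work on batched tensors, so the nested lists here are an assumption made for readability.

```python
import math

def causal_softmax(scores):
    """Row-wise softmax where position i may only attend to positions j <= i."""
    n = len(scores)
    out = []
    for i in range(n):
        # Future positions get -inf, so their softmax weight becomes exactly 0.
        row = [scores[i][j] if j <= i else float("-inf") for j in range(n)]
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out
```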
CDP (Customer Data Platform)
A platform that unifies customer data from various sources to create comprehensive customer profiles.
CER (Character Error Rate)
Metric for speech recognition and OCR at character level.
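CER is commonly computed as the character-level edit distance between the recognized text and the reference, divided by the reference length. The following is a minimal sketch under that assumption:

```python
def cer(reference, hypothesis):
    """Levenshtein distance over characters, divided by the reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, 1):
        cur = [i]
        for j, h in enumerate(hypothesis, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = cur
    return prev[-1] / len(reference)
```

Because insertions are counted, CER can exceed 1.0 when the hypothesis is much longer than the reference.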
Certificate Authority
A Certificate Authority (CA) issues and signs digital certificates, binding public keys to identities within a PKI.
Certified Defense
Defense methods against adversarial attacks that provide mathematically provable robustness guarantees.
Chain of Agents
Architecture pattern where multiple specialized AI agents collaborate sequentially or hierarchically to solve complex tasks.
Chain of Custody
Chain of custody is the documented trail of how an artifact (data, evidence, content) was collected, handled, stored, and accessed—ensuring integrity and accountability.
Chain of Thought
Prompting technique and model capability where the model explicitly articulates its thinking process in intermediate steps before arriving at the final answer.
Chain of Trust
A chain of trust is the ordered set of certificates from a leaf certificate through intermediates up to a trusted root CA.
Chain-of-Thought Prompting
A prompting technique that gets LLMs to lay out their thoughts step by step before giving a final answer – leading to significantly better results on complex tasks.
Changepoint Detection
Detection of time points at which the statistical properties of a time series significantly change.
Chatbot
A software program that simulates conversations with humans, typically through text or voice interfaces.
Chatbot Arena
A public Elo-based leaderboard where users compare the responses of two anonymous LLMs and vote for the better one – a widely cited benchmark for ranking LLMs.
ChatGPT
A conversational AI system built on large language models that generates human-like responses to user prompts.
ChatGPT Agent
Autonomous mode of ChatGPT that independently executes multi-step tasks in browsers, apps, and files.
ChatGPT Checkout
Feature in ChatGPT that completes purchases directly in the chat interface – without redirecting to merchant websites.
Chief Agent Officer (CAO)
C-level role responsible for strategy, governance, and performance of autonomous AI agents in the enterprise – the evolution of the CMO in the agentic era.
Chinchilla Optimal
The finding that for compute-optimal LLM training, the number of training tokens should scale in proportion to parameter count (roughly 20 tokens per parameter in the original Chinchilla study).
Chunking
The process of dividing large documents into smaller, semantically coherent text segments for efficient embedding and retrieval in RAG systems.
Churn Prediction
The use of statistical or machine learning models to estimate the likelihood that a customer will stop using a product.
CI/CD for ML
Continuous integration and continuous delivery adapted for machine learning workflows with data, code, and model validation.
CIDEr
A metric for image captioning that measures TF-IDF-weighted n-gram similarity.
Citation Optimization
Strategies to increase the likelihood that AI systems cite your content as a source.
Class Imbalance
Situation where one class in the training dataset occurs significantly more frequently than others.
Classification
A supervised ML task in which a model assigns data to predefined categories or classes.
Classifier-Free Guidance (CFG)
Classifier-Free Guidance controls how strongly a diffusion model follows the text prompt – higher values produce more prompt-faithful but potentially over-saturated images.
Claude
Anthropic's family of LLMs, known for long context windows, nuanced responses, and a focus on safety and honesty.
Claude Code
Anthropic's official CLI tool for agentic software engineering: Claude Sonnet 4.6 runs directly in the terminal and edits code repositories autonomously.
Claude Computer Use
Claude's capability to operate a desktop computer: mouse, keyboard, screenshots, and applications like a human user.
Claude Cowork
Collaborative multi-user mode of Claude for joint project work with shared context and role distribution.
Claude Design
Visual design mode of Claude for UI mockups, brand asset generation, and layout iteration via natural language.
Claude Haiku
Anthropic's fastest and most cost-effective AI model, optimized for speed and volume in tasks like classification, chatbots, and real-time processing.
Claude Opus
Anthropic's most powerful and expensive AI model, designed for complex analysis, strategic planning, and tasks requiring the highest cognitive depth.
Claude Opus 4.6
Anthropic's 2026 flagship LLM with extended reasoning, 1M-token context, and native computer-use capabilities.
Claude Skills
Modular system by Anthropic that bundles reusable capabilities (prompt + tools + data) for Claude.
Claude Sonnet
Anthropic's mid-tier AI model, balancing quality, speed, and cost – the all-rounder of the Claude family.
ClearML
Open-source MLOps platform for experiment tracking, pipeline orchestration, data management, and model serving.
Click-Through-Rate (CTR)
Ratio of clicks to impressions, expressed as a percentage.
Clickstream Data
A time-ordered record of user interactions (clicks, page views, events) across digital properties such as websites and apps.
CLIP (Contrastive Language–Image Pretraining)
A multimodal model approach that learns aligned representations of images and text by training them to match corresponding image–caption pairs.
Clustering
An unsupervised learning technique that groups data points into clusters such that items in the same cluster are more similar to each other than to items in other clusters.
Code Generation
The automatic creation of program code by AI models based on natural language descriptions, examples, or partial code snippets.
Codex 5.3
OpenAI's specialized 2026 coding model for agentic software development and long-running tasks in repositories.
Cohen's Kappa
A statistic for measuring inter-rater reliability for categorical ratings, corrected for chance agreement.
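For two raters, kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance from each rater's label frequencies. A minimal sketch, assuming two equal-length lists of categorical labels (the function name is illustrative):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability that both raters pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    # Undefined (division by zero) if both raters use one identical label throughout.
    return (observed - expected) / (1 - expected)

kappa = cohens_kappa(["y", "y", "n", "n"], ["y", "n", "n", "n"])
```

Here the raters agree on 3 of 4 items (p_o = 0.75) while chance agreement is 0.5, giving a kappa of 0.5.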
Cohere
An enterprise-focused AI company specializing in RAG, embeddings, and multilingual LLMs.
Cohere Embed
Cohere's commercial embedding API with special optimization for retrieval and distinction between query and document embeddings.
Cohort Analysis
Cohort analysis groups users or entities by a shared starting event/time (e.g., signup week) and tracks behavior over time.
ColBERT
ColBERT is a late-interaction retrieval architecture that creates token-level embeddings for query and document, aggregating them via MaxSim during search.
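The MaxSim aggregation can be sketched as follows, assuming the token embeddings are already computed and given as plain lists of floats (real systems use normalized tensors and batched operations; `maxsim_score` is an illustrative name):

```python
def maxsim_score(query_vecs, doc_vecs):
    """Sum, over query tokens, of the best dot product with any document token."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query token embeddings
doc = [[1.0, 0.0], [0.6, 0.8]]     # two document token embeddings
score = maxsim_score(query, doc)   # 1.0 (first token) + 0.8 (second token)
```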
Cold Start Problem
The problem when a system has insufficient data about a new user, item, or context to make accurate predictions or recommendations.
Collaborative Filtering
A recommendation approach that predicts a user's preferences based on the behavior of similar users or similarities between items.
Column Store
A column store database stores data column-by-column, optimizing for analytical workloads (OLAP) and scanning specific fields across many rows.
Comet ML
ML platform for experiment tracking, model production monitoring, and LLM evaluation (Opik).
ComfyUI
ComfyUI is a visual, node-based workflow editor for Stable Diffusion and other diffusion models – the professional standard for complex image generation pipelines.
Command R
Cohere's RAG-optimized language model, specifically developed for enterprise retrieval, multilingual applications, and tool use.
Competitive Advantage
An attribute or capability that enables a company to outperform its competitors and create sustainable economic value.
Computer Use
The ability of AI models to operate computers like humans – interpret screenshots, control mouse and keyboard, navigate through interfaces.
Computer Vision
The AI subfield that enables computers to understand and interpret visual information.
Computer-Use Sandboxing
Secure, isolated execution environment for AI agents that control mouse, keyboard, browser, or desktop – with clearly defined permissions and audit trail.
Conditional Generation
Conditional generation produces outputs based on conditions like text, class, image, or other control signals.
Confidential Computing
An approach where data is protected during processing through hardware-based Trusted Execution Environments (TEEs) – protection not just at-rest and in-transit, but also in-use.
Conformal Prediction
A model-agnostic method that turns predictions into prediction sets or intervals with guaranteed coverage, assuming only that the data are exchangeable rather than following a particular distribution.
Confounding
A confounder is a variable that influences both the independent and dependent variable, creating a spurious association.
Confusion Matrix
A table that summarizes classification performance by counting true positives, false positives, true negatives, and false negatives.
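For the binary case, the four cells can be counted directly. This is a small sketch assuming labels are given as two parallel lists; the `positive` parameter is an illustrative choice for marking the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

counts = confusion_counts([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
```

Metrics such as precision and recall are then simple ratios of these counts.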
Consent
Consent is the explicit, informed agreement of a person to the processing of their personal data, as required by GDPR and ePrivacy.
Consistency Model
Consistency models generate images in one or few steps by learning to jump from any point on the diffusion trajectory directly to the result.
Constitutional AI
An approach developed by Anthropic where AI systems are trained according to a set of ethical principles ("constitution") to self-correct and avoid harmful outputs.
Constitutional Classifiers
Upstream classifier models that, based on an explicit "constitution", secure an LLM's inputs and outputs against jailbreaks and policy violations.
Content Creation
Content creation is the planning, production, and publishing of materials (text, images, video, audio) intended to inform, persuade, or engage an audience.
Content Delivery Network (CDN)
Distributed network of servers for fast delivery of web content.
Content Filter
Systems that check and block AI inputs and outputs for unwanted content.
Content Fingerprinting
Content fingerprinting creates a compact signature (fingerprint) of content to enable identification, deduplication, similarity detection, or provenance tracking.
Content Marketing
A marketing strategy focused on creating and distributing valuable content to attract customers.
Content Personalization
Dynamic adaptation of content based on user profile and behavior.
Content Policy
A content policy defines what content is allowed, restricted, or disallowed in a system—covering both inputs and outputs.
Content-Based Filtering
Recommendations based on properties of items a user liked.
Context Caching
An optimization technique that caches computed attention states (key-value pairs) for repeated contexts – saves compute and reduces latency for similar queries.
Context Engineering
The practice of designing, selecting, and structuring the information an LLM receives so it produces more reliable and relevant outputs.
Context Window
The maximum amount of text (measured in tokens) that an AI language model can process and "remember" at once – the larger it is, the more context can be considered.
Contextual AI Targeting
AI-powered ad placement based on page content instead of user tracking – the cookie-less alternative.
Contextual Bandit
A decision-making algorithm that chooses among actions using current context features, while learning from feedback to balance exploration and exploitation.
Continual Learning
The ability of an ML model to continuously learn from new data without forgetting previously learned knowledge – the "lifelong learning" problem of AI.
Continuous Batching
A serving technique that inserts new requests into running batches as soon as other requests complete, instead of waiting for batch completion.
Contrastive Learning
A representation learning approach that trains models to pull similar pairs closer and push dissimilar pairs apart in embedding space.
ControlNet
ControlNet is a neural network architecture that adds additional conditions (edges, pose, depth) to diffusion models, enabling precise control over image generation.
Convergence
The point where a model stops improving significantly – the loss stabilizes and further epochs bring no progress.
Conversational AI
Conversational AI refers to AI systems that can conduct natural, human-like conversations via text or voice – from chatbots to voice agents.
Conversational Search
Conversational Search enables information retrieval through natural dialogs instead of rigid keywords – the future of search engines and enterprise search.
Conversion Rate Optimization (CRO)
The systematic process of increasing the percentage of users who complete a desired action through experimentation and UX improvements.
Convolutional Neural Network (CNN)
A neural network architecture that uses convolution operations to learn hierarchical feature representations from grid-like data such as images.
Copilot Agent
Customizable AI agent within Microsoft's Copilot platform that accesses enterprise data and embeds into Teams, Outlook, and Microsoft 365.
Copywriting
Writing advertising copy and marketing content to persuade and convert.
Coreference Resolution
Identifying all mentions in text that refer to the same entity (e.g., "Angela Merkel" → "she" → "the chancellor").
Cosine Annealing
A learning rate schedule that smoothly reduces the learning rate from a maximum value to near zero following a cosine curve.
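A commonly used form of the schedule is lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T)). The sketch below assumes a single decay cycle without warmup or restarts; the parameter names are illustrative.

```python
import math

def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step` for one cosine decay cycle of `total_steps`."""
    progress = step / total_steps  # runs from 0.0 to 1.0
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

schedule = [cosine_lr(s, 100, lr_max=0.1) for s in (0, 50, 100)]
```

The rate starts at `lr_max`, passes the midpoint halfway through, and ends at `lr_min`.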
Cosine Similarity
A measure of similarity between two vectors that calculates the cosine of the angle between them, independent of their magnitude.
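In formula terms, the similarity is the dot product divided by the product of the vector norms. A minimal sketch for plain Python lists (the function name is illustrative):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)

same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # scaled copy
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Scaling a vector leaves the result unchanged, which is the magnitude independence the definition refers to.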
Cost Control
Systematic processes for monitoring, managing, and optimizing expenditures to achieve financial goals and deploy resources efficiently.
Cost per Acquisition (CPA)
Average cost for a desired action like purchase or signup.
Counterfactual Explanation
Explanation method that shows what minimal input change would have led to a different model outcome.
CPC (Cost Per Click)
The pricing model where advertisers pay for each click on their ad.
CPM (Cost Per Mille)
The cost per 1,000 impressions of an ad.
Creativity
The ability to generate original and valuable ideas, concepts, or solutions that go beyond conventional thinking.
CrewAI
A Python framework for multi-agent systems where agents work together as a "crew" with defined roles.
Crisis Communication
Crisis communication is the strategy and execution of messaging during incidents that threaten reputation, trust, or operations.
Cross-Attention
Cross-attention computes attention between two different sequences – e.g., between text conditioning and image generation in diffusion models.
Cross-Encoder
An encoder architecture that processes query and document together and outputs a relevance score – more precise than bi-encoders but slower.
Cross-Entropy Loss
Loss function for classification tasks, rooted in information theory, that penalizes the model according to the negative log-probability it assigns to the true class.
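For a single example with a known class index, the loss can be sketched as the negative log of the softmax probability of the true class. The code assumes raw logits as input and uses the log-sum-exp trick for numerical stability; the function name is illustrative.

```python
import math

def cross_entropy(logits, target):
    """Negative log-probability of class `target` under softmax(logits)."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]

uniform_loss = cross_entropy([0.0, 0.0], target=0)     # model is undecided
confident_loss = cross_entropy([10.0, 0.0], target=0)  # model favours the true class
```

An undecided two-class model yields a loss of ln 2, and the loss shrinks toward zero as the model grows more confident in the correct class.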
Cross-Validation
A technique for evaluating model performance by training and testing on different data subsets.
Cryptography
The science of secure communication through mathematical methods that encrypt data, ensure integrity, and prove authenticity.
CSS
CSS (Cascading Style Sheets) is the styling language of the web that defines the visual appearance of HTML elements – colors, layouts, animations, and responsive design.
CTA (Call to Action)
A prompt for the user to take a specific action, such as "Buy Now" or "Learn More".
CTC (Connectionist Temporal Classification)
CTC is a loss function for sequence problems where the alignment between input and output is unknown and the two differ in length – widely used in speech recognition (ASR).
CTR (Click-Through Rate)
The percentage of users who click on a link or ad relative to the total number of impressions.
Curriculum Learning
Training strategy where samples are presented in a meaningful order – from easy to hard, similar to a curriculum.
Cursor
An AI-native code editor (VS Code fork) that offers deep AI integration for code generation, refactoring, and natural language programming.
Custom GPT
GPT tailored to a specific use case with its own prompt, knowledge base, and tool set, hosted by OpenAI.
Customer Data Platform (CDP)
Central system for unifying customer data from all sources.
Customer Journey
The entire experience of a customer with a brand, from initial awareness to long-term loyalty.
Customer Lifetime Value
Projected total value of a customer over the entire business relationship.
Customer Lifetime Value (LTV)
Total expected revenue from a customer over the entire relationship.
CutMix
Data augmentation technique that cuts out a rectangular region from one image and replaces it with a region from another image.
Cyclical Learning Rate (CLR)
Learning rate schedule that cyclically varies the LR between a minimum and maximum – prevents stagnation and helps overcome saddle points.
D
DAG (Directed Acyclic Graph)
A directed graph with no cycles, meaning you cannot start at a node and follow directed edges to return to the same node.
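One practical consequence is that a DAG always admits a topological order. The sketch below uses Kahn's algorithm to produce such an order and returns `None` when the graph has a cycle; it assumes the graph is a dict mapping each node to a list of successors.

```python
from collections import deque

def topological_order(graph):
    """Return a topological order of `graph`, or None if it contains a cycle."""
    indegree = {node: 0 for node in graph}
    for targets in graph.values():
        for target in targets:
            indegree[target] = indegree.get(target, 0) + 1
    queue = deque(node for node, degree in indegree.items() if degree == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for target in graph.get(node, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    # Nodes on a cycle never reach indegree 0, so they are missing from `order`.
    return order if len(order) == len(indegree) else None

order = topological_order({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []})
cyclic = topological_order({"x": ["y"], "y": ["x"]})
```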
Dagster
Open-source orchestration platform with software-defined assets approach for data and ML pipelines.
DALL-E 3
OpenAI's text-to-image model, integrated into ChatGPT, known for precise prompt following and text rendering.
Dashboard
A visual interface that presents key metrics, trends, and alerts to support decision-making.
Data Augmentation
Techniques for artificially expanding training data through transformations.
Data Catalog
A searchable inventory of an organization's data assets including metadata, ownership, and documentation.
Data Clean Room
A secure environment where multiple parties can combine their data for joint analyses without sharing raw data.
Data Dictionary
Documentation that defines the meaning, format, allowed values, and usage of data fields.
Data Drift
The change in statistical properties of input data over time, which can degrade model performance.
Data Enrichment
Adding additional attributes to existing data—via internal joins or external sources (firmographic providers, geo data).
Data Governance
The framework for policies, processes, and responsibilities to manage data assets in an organization.
Data Labeling
Process of annotating data with ground truth for supervised learning.
Data Lake
Central storage for large amounts of unstructured and structured data.
Data Layout
The physical or logical arrangement of data in memory or on storage media, which influences access speed, cache efficiency, and processing performance.
Data Leakage
Situation where information from the test set or the future leaks into training, producing unrealistically good results.
Data Lineage
Data lineage describes where data comes from, how it moves through systems, and how it is transformed into downstream datasets and outputs.
Data Mesh
Decentralized approach to data architecture with domain-oriented data products.
Data Mining
The process of discovering patterns, anomalies, and relationships in large datasets using statistical and machine learning methods.
Data Parallelism
The simplest form of distributed training: Each GPU holds a complete model copy and processes different data batches – gradients are synchronized.
Data Pipeline
A sequence of processes that moves and transforms data from sources to destinations (lake, warehouse, feature store, vector index).
Data Poisoning
An attack where manipulated data is injected into the training process to deliberately influence model behavior.
Data Preprocessing
Transforming raw data into a form suitable for modeling or analysis (cleaning, normalization, encoding).
Data Processing Agreement (DPA)
A legally binding contract between data controller and data processor that governs the terms for processing personal data according to GDPR.
Data Structure
An organized method for storing and managing data that enables efficient operations like searching, inserting, and deleting.
Data Validation (ML)
Automated checking of data quality, schema conformity, and statistical properties in ML pipelines.
Data Visualization
The graphical representation of data to communicate insights and patterns.
Data Warehouse
A system optimized for structured analytics queries over curated, cleaned data—often with strong governance and performance.
Databricks
Databricks is a unified analytics platform that combines data engineering, data science, and machine learning on Apache Spark.
Datasheets for Datasets
Standardized documentation for ML datasets describing provenance, composition, collection methods, recommended use, and known limitations.
DBSCAN
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that finds clusters based on density of data points and automatically identifies outliers.
DDIM (Denoising Diffusion Implicit Model)
DDIM is an accelerated sampling algorithm for diffusion models enabling deterministic generation with significantly fewer steps.
DDPM (Denoising Diffusion Probabilistic Model)
DDPM is the foundational framework for diffusion models that generates images by progressively denoising from pure noise.
Decision Making
Decision making is the process of selecting an action (or non-action) among alternatives based on goals, evidence, constraints, and uncertainty.
Decision Support System (DSS)
A Decision Support System (DSS) helps people make better decisions by combining data, models, and user interfaces.
Decision Theory
Decision theory studies how agents should make choices under uncertainty, often by maximizing expected utility subject to constraints.
Decision Threshold
The cutoff used to convert a model score/probability into an action (e.g., approve/deny, route/escalate).
Decision Tree
An ML model that represents decisions as a tree structure with branches based on feature values.
Decoder
The part of a model that transforms a compressed representation back to the original format.
Decoding
The process of converting encoded data or signals back to their original or usable form, in ML specifically the token-by-token generation of outputs.
Decoding Strategy
A decoding strategy is the method used to convert a model's token probability distribution into an actual output sequence.
Deductive Reasoning
A form of logical inference where specific conclusions are drawn from general premises—if the premises are true, the conclusion is guaranteed to be true.
Deduplication
Deduplication is identifying and removing duplicate (or near-duplicate) items to reduce redundancy and improve quality.
Deep Compression
A three-stage compression pipeline (Pruning → Quantization → Huffman Coding) that can compress neural networks by 35-49x – the foundational work of model compression.
Deep Learning
A subfield of machine learning that uses deep neural networks with many layers to learn complex patterns from data.
Deep Reinforcement Learning
Reinforcement learning that uses deep neural networks to learn policies that choose actions to maximize long-term reward.
Deepfake
Deepfakes are AI-generated or -manipulated media (video, audio, images) showing people doing or saying things that never happened.
Deepfake Detection
Technologies and methods for identifying AI-generated or manipulated media content such as videos, audios, and images.
DeepSeek
Chinese AI startup developing powerful open-source language models, competing with Western providers at significantly lower costs.
DeepSeek R1
An open-source reasoning model from DeepSeek that competes with GPT-4 and Claude on complex thinking and coding tasks.
DeepSeek V4
Open-weight flagship by DeepSeek that reaches comparable benchmarks at 1/10 the training cost of Western models.
DeepWalk
A graph embedding algorithm that combines random walks on graphs with Word2Vec to learn node representations.
Default Reasoning
Default reasoning draws conclusions using 'defaults' that hold in typical cases, while allowing exceptions when new information arrives.
Demand Forecasting
Prediction of future demand based on historical data and factors.
Demographic Parity
Fairness criterion: A model satisfies demographic parity when prediction rates (e.g., approval rate) are equal across all protected groups.
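A simple check, assuming binary predictions and one group label per example, is to compare positive-prediction rates across groups. The function names below are illustrative.

```python
from collections import defaultdict

def positive_rates(groups, predictions):
    """Share of positive (== 1) predictions for each group."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}

def parity_gap(groups, predictions):
    """Largest difference in positive rate between any two groups."""
    rates = positive_rates(groups, predictions).values()
    return max(rates) - min(rates)

rates = positive_rates(["a", "a", "b", "b"], [1, 0, 1, 1])
gap = parity_gap(["a", "a", "b", "b"], [1, 0, 1, 1])
```

A gap of zero means the criterion holds exactly; in practice a small tolerance is usually chosen.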
Denoising
Denoising is the process of removing noise from a signal; in diffusion models, it's the iterative transformation from noisy latents to a clean sample.
Dense Passage Retrieval
A retrieval approach using bi-encoder embeddings for query and passages – the foundation of modern semantic search.
Dense Retrieval
Retrieval method that uses dense vector representations (embeddings) to find semantically similar documents.
Dependency Parsing
Analyzing the grammatical structure of a sentence by identifying dependency relationships between words.
Depth Estimation
Predicting depth values (distances) for every pixel of a 2D image to generate a 3D depth map.
Depth-First Search (DFS)
A graph traversal algorithm that goes as far as possible along a path before backtracking and exploring alternative paths.
Depth-First Search (DFS)
Depth-First Search (DFS) traverses a graph by going as deep as possible along one path before backtracking.
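An iterative version with an explicit stack can be sketched as follows, assuming the graph is a dict mapping each node to a list of neighbors:

```python
def dfs(graph, start):
    """Return nodes in depth-first visiting order, starting from `start`."""
    visited, order, stack = set(), [], [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push in reverse so neighbors are explored in their listed order.
        stack.extend(reversed(graph.get(node, [])))
    return order

visit_order = dfs({"a": ["b", "c"], "b": ["d"], "c": [], "d": []}, "a")
```

The traversal follows a to b to d before backtracking to c.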
Depthwise Separable Convolution
An efficient convolution variant that decomposes a standard convolution into two steps – depthwise (per channel) and pointwise (1x1 convolution) – for roughly 8-9x fewer computations with 3x3 kernels.
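The savings can be checked with a short calculation. The sketch assumes the usual multiply-accumulate counts for a k x k kernel with stride 1 and unchanged spatial size; the function name and parameters are illustrative.

```python
def separable_speedup(kernel, c_in, c_out, height, width):
    """Ratio of standard to depthwise-separable multiply-accumulate counts."""
    positions = height * width
    standard = kernel * kernel * c_in * c_out * positions
    depthwise = kernel * kernel * c_in * positions   # one k x k filter per input channel
    pointwise = c_in * c_out * positions             # 1x1 convolution mixes channels
    return standard / (depthwise + pointwise)

speedup = separable_speedup(kernel=3, c_in=128, c_out=256, height=56, width=56)
```

The ratio simplifies to 1 / (1/c_out + 1/k^2), so with 3x3 kernels it approaches 9 as the number of output channels grows.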
Design Pattern
A design pattern is a reusable solution template for common software design problems (structure, behavior, collaboration).
Detokenization
The process of converting tokens back into readable text – the reverse of tokenization.
DETR (Detection Transformer)
A transformer-based model for object detection that predicts bounding boxes as set prediction without anchor boxes.
Devin
An autonomous coding agent from Cognition Labs, marketed as the first "AI software engineer", that can work on complex programming tasks over extended periods.
Dialogflow
Dialogflow is Google's cloud platform for building Conversational AI – with visual flow editors, NLU, and multi-channel deployment.
Dialogue Management
Component of a conversational AI system that controls the conversation flow.
Difference-in-Differences (DiD)
Quasi-experimental method that estimates causal effects by comparing changes over time between treatment and control groups.
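In the simplest two-group, two-period case, the estimate is the change in the treatment group minus the change in the control group. The sketch assumes group means as inputs and relies on the parallel-trends assumption; the numbers are made up for illustration.

```python
def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences estimate from four group means."""
    return (treat_post - treat_pre) - (control_post - control_pre)

# Hypothetical means: treatment rises 10 -> 15, control rises 8 -> 10.
effect = did_estimate(treat_pre=10.0, treat_post=15.0,
                      control_pre=8.0, control_post=10.0)
```

The control group's change of 2 stands in for what would have happened to the treatment group without the intervention, leaving an estimated effect of 3.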
Differential Privacy
A mathematically rigorous definition of privacy that bounds how much any single individual's data can change the output of an analysis – a guarantee that holds even against attackers with arbitrary background knowledge.
Differentiation
Differentiation is creating perceived and real uniqueness that makes customers prefer your offering over alternatives.
Diffusion LLM
Language model that generates text not autoregressively token-by-token but in parallel via a denoising process – analogous to image diffusion models.
Diffusion Model
Diffusion models are generative AI models that learn to gradually remove noise from data to produce high-quality samples (images, audio, video).
Digital Farming
Digital Farming is a strategic framework that treats data as soil, technology as tools, and content as the harvest – an iterative, measurable, and sustainable approach to data-driven marketing.
Digital Transformation
The fundamental change of business processes, culture, and customer experiences through the integration of digital technologies in all areas of a company.
Digital Twin
A real-time virtual representation of a physical system, process, or product that is continuously updated through sensor data.
Dijkstra's Algorithm
Dijkstra's algorithm computes the shortest path distances from a single source node to all other nodes in a weighted graph with non-negative edge weights.
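A compact version using a binary heap can be sketched as follows, assuming the graph is a dict mapping each node to a list of (neighbor, weight) pairs with non-negative weights:

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from `source` to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist

distances = dijkstra({"a": [("b", 1), ("c", 4)],
                      "b": [("c", 2), ("d", 5)],
                      "c": [("d", 1)],
                      "d": []}, "a")
```

Unreachable nodes are simply absent from the returned dict.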
Dilated Convolution (Atrous Convolution)
Dilated Convolution expands the receptive field of a filter by inserting gaps between filter values – larger context without more parameters.
Dimensionality Reduction
Techniques for reducing the number of features while preserving important information.
Disaster Recovery
Strategies and processes for restoring critical systems and data after catastrophic events like hardware failures, cyberattacks, or natural disasters.
Disclosure UX
Disclosure UX is the set of interface patterns that transparently communicate important system facts to users (e.g., AI involvement, limitations, data use, confidence, and provenance).
Disparate Impact
A legal concept: a seemingly neutral rule or practice that has a disproportionately negative effect on a protected group.
Display Advertising
Online advertising with visual elements such as banners, videos, or interactive formats.
Disruption
Disruption is a market shift where new technologies or business models reshape customer expectations and cost structures, often displacing incumbents.
Distributed Training
Distributed training distributes ML training across multiple GPUs or machines – necessary for models that don't fit on a single GPU.
Distribution Shift
A change in statistical distribution between training and production data that degrades model performance.
Diversity in Recommendations
Strategies for increasing variety in recommendation lists to avoid filter bubbles and improve user satisfaction.
DMP (Data Management Platform)
A platform for collecting, organizing, and activating audience data for marketing.
Document AI
AI systems for intelligent processing and analysis of documents.
Double Machine Learning (DML)
Causal inference method that uses ML models to flexibly control for confounding while enabling valid statistical inference.
DP-SGD (Differentially Private SGD)
A training algorithm integrating Differential Privacy into Stochastic Gradient Descent – through gradient clipping and calibrated noise.
DPO (Direct Preference Optimization)
A simplified alternative to RLHF that optimizes models directly on preference data, without separate reward model or RL training.
DPO (Direct Preference Optimization)
A simplified alternative to RLHF that directly embeds human preferences into model weights without training a separate reward model – simpler, more stable, and cheaper.
DreamBooth
A fine-tuning method that personalizes diffusion models with just a few images (3-5) of a subject to generate it in arbitrary contexts.
DROP (Discrete Reasoning Over Paragraphs)
A reading comprehension benchmark that requires numerical reasoning over text passages (counting, sorting, arithmetic).
Dropout
A regularization technique that randomly deactivates neurons during training.
DSP (Demand-Side Platform)
A platform through which advertisers programmatically buy ad inventory.
DVC (Data Version Control)
Open-source tool for data and model versioning that extends Git workflows to ML artifacts.
Dynamic Batching
Grouping multiple inference requests together at runtime to improve throughput and reduce cost per request.
Dynamic Creative
Automatic adaptation of ad creatives based on audience, context, or performance data.
Dynamic Creative Optimization (DCO)
AI technology that assembles ad creatives in real-time from modular components and optimizes for each user.
Dynamic Pricing
Algorithm-based price adjustment in real-time based on demand and other factors.
E
E5 Embedding
E5 is a family of embedding models from Microsoft Research created through text-to-text contrastive training.
Early Stopping
Regularization technique that stops training once validation loss stops improving, typically after a set number of epochs without progress (the patience).
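In practice the stopping rule usually includes a patience window so that a single noisy epoch does not end training. A minimal sketch over a list of per-epoch validation losses (the function name and the `patience` default are illustrative):

```python
def early_stopping_epoch(val_losses, patience=2):
    """Index of the epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered; ran to the end

stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.9], patience=2)
```

Here the best loss occurs at epoch 2, and training stops at epoch 4 after two epochs without improvement. Real training loops typically also restore the weights saved at the best epoch.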
Economics of AGI
Research and discourse field on macroeconomic effects of artificial general intelligence on labor, productivity, and value creation.
Edge AI
AI processing that happens on local devices (edge) rather than in the cloud, for low latency and privacy.
Edge Computing
Data processing close to the data source instead of in central clouds.
Edge MLOps
MLOps practices specifically for deploying, monitoring, and updating ML models on edge devices and embedded systems.
Effect Size
Quantifies the strength of a difference or relationship – independent of sample size, unlike the p-value.
ELBO (Evidence Lower Bound)
ELBO is the lower bound on the log-likelihood in variational inference – the central objective function for VAEs and diffusion models.
Elo Rating
A rating system for measuring relative abilities, originally from chess – now standard for LLM leaderboards.
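The classic update rule can be sketched as follows. The expected score uses the standard logistic formula with a scale of 400; the K-factor of 32 is an assumed example value, and LLM leaderboards may use different fitting procedures.

```python
def elo_update(rating_a, rating_b, score_a, k=32):
    """Return updated ratings after one game; score_a is 1, 0.5, or 0."""
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

new_a, new_b = elo_update(1500, 1500, score_a=1)  # A beats an equally rated B
```

With equal ratings the expected score is 0.5, so the winner gains half the K-factor and the loser gives up the same amount.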
ELT
ELT (Extract, Load, Transform) is a data integration paradigm where raw data is first loaded into a data warehouse and then transformed there.
ELU (Exponential Linear Unit)
An activation function that maps negative inputs smoothly toward a negative saturation value via an exponential – smoother than ReLU, with mean activations closer to zero.
Email Marketing
Direct marketing via email for customer communication and sales promotion.
Embedding
An embedding is a dense vector representation of discrete data (words, images, users, products) where semantically similar objects lie close together in vector space.
Embedding Model
Specialized AI model that converts text, images, or audio into numerical vectors that make semantic similarity measurable.
Embedding Models
Specialized models that convert text, images, or other data into dense vectors that capture semantic meaning and enable similarity search.
Embeddings
Vector representations of data (words, sentences, images) in a lower-dimensional space that capture semantic similarity.
Embodied Reasoning (ER)
A multimodal model's ability to reason about the physical world – geometry, affordances, causality – instead of merely classifying pixels.
Emergent Abilities
Capabilities that suddenly appear in LLMs only above a certain model size, without being observable in smaller models.
Emotion Recognition
Emotion Recognition detects emotional states (joy, anger, sadness) from speech, facial expressions, or text – with focus on audio-based analysis.
Encapsulation
A programming concept that bundles data and the methods that access it into a single unit (class/module) and restricts direct access from outside.
Encoder
The part of a model that transforms input data into a compressed representation.
Encoder-Decoder
Architecture that encodes input into a representation and decodes output from it.
Encryption
Encryption transforms plaintext into ciphertext using a key, so only authorized parties can recover the original information.
Encryption at Rest
Encryption at rest protects stored data (databases, disks, backups, object storage) by encrypting it when not actively being transmitted or processed.
Encryption in Transit
Encryption in transit protects data while it moves across networks, commonly implemented using TLS (e.g., HTTPS).
Endpoint
A URL where an API service is accessible and receives requests.
Energy-Based Model (EBM)
Energy-based models assign energy values to data points – low energy for likely data, high for unlikely – and generate by energy minimization.
Engagement Rate
The percentage of users who interact with content relative to total reach.
Ensemble Learning
Combining multiple models to achieve better predictions than any single model alone.
Entity Extraction
The automatic identification and classification of named entities in text.
Entity Linking
Entity Linking is the process of mapping text mentions of entities to unique entries in a knowledge base (e.g., Wikidata).
Entity Resolution
Entity resolution is the process of identifying, matching, and merging multiple records from different sources that refer to the same real-world entity (person, company, product) — even when spellings, IDs, or fields are not identical.
Envelope Encryption
Envelope encryption encrypts data with a short-lived data key, then encrypts that data key with a longer-lived master key (often in KMS/HSM).
Episodic Memory (Agent Memory Layer)
Persistent, searchable memory layer where an agent stores events, decisions, and user preferences across sessions – beyond the context window.
Epistemic vs. Aleatoric Uncertainty
Epistemic uncertainty arises from lack of knowledge (reducible with more data); aleatoric uncertainty is inherent noise in data (irreducible).
Epoch (Machine Learning)
In machine learning, an epoch is one complete pass of a learning algorithm through the entire training dataset, i.e., the point at which every training example has been used once to update the model weights.
Equalized Odds
Fairness criterion: A model satisfies equalized odds when True Positive Rate and False Positive Rate are equal across all protected groups.
Error Analysis
Systematic examination of model errors to identify patterns and improvement opportunities.
Error Rate
Error rate is the proportion of outcomes that are incorrect relative to a defined ground truth or acceptance criteria.
ETL (Extract, Transform, Load)
Extract, Transform, Load – the process of extracting data, transforming it, and loading it into target systems.
EU AI Act
The world's first comprehensive legal regulation for Artificial Intelligence, adopted by the EU Parliament in 2024, establishing risk-based requirements for AI systems.
EU AI Act
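EU Regulation 2024/1689 that regulates AI systems by risk class; it entered into force in August 2024 and applies in stages, with the first prohibitions from February 2025 and most obligations from August 2026.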
Euclidean Distance
Geometric distance between two points in vector space.
Eval Framework
Systematic framework for evaluating LLM outputs against defined criteria like correctness, relevance, safety, and style.
Evaluation Harness
A framework for systematically evaluating model performance across various metrics and test cases.
Event Tracking
The capture and analysis of user interactions and actions on digital platforms.
Event-Driven Architecture
Software architecture where components communicate through events.
Exit Rate
The percentage of visitors who leave a website from a specific page.
Expected Calibration Error (ECE)
The standard metric for measuring classifier calibration quality – the weighted average of the absolute difference between confidence and accuracy across confidence bins.
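A minimal sketch of the binned computation, assuming equal-width confidence bins and a list of per-prediction correctness flags. The function name and the default of 10 bins are illustrative choices, not a fixed standard.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over equal-width bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(avg_conf - accuracy)
    return ece
```

A model that predicts with 90% confidence but is right only 75% of the time ends up with an error of about 0.15 for that bin.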
Experiment Tracking
Systematic logging and management of ML experiments.
Explainability
The ability to make an AI model's decisions or predictions understandable to humans.
Explainability UX Patterns
Explainability UX patterns are interface patterns that help users understand why an AI system produced an output, what evidence it used, and what actions it took (or refused).
Explainable AI (XAI)
Explainable AI (XAI) comprises methods and product practices that make AI outputs understandable, traceable, and auditable.
Exploration vs. Exploitation
The fundamental RL dilemma: Should the agent exploit known good actions (exploitation) or explore new options (exploration)?
Exploratory Data Analysis
The process of visually and statistically examining data before model building.
Exponential Backoff
Exponential backoff increases the wait time between retries exponentially after each failure (e.g., 100ms → 200ms → 400ms → 800ms…).
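The delay schedule from the example can be sketched as follows. The helper names, the cap, and the injected `sleep` parameter are assumptions made for illustration and testability.

```python
import time


def backoff_delays(base_ms=100, factor=2, max_retries=4, cap_ms=10_000):
    """Delay before each retry in milliseconds, doubling each time up to a cap."""
    return [min(cap_ms, base_ms * factor ** attempt) for attempt in range(max_retries)]


def retry(operation, max_retries=4, base_ms=100, sleep=time.sleep):
    """Call operation, waiting exponentially longer after each failure."""
    for delay_ms in backoff_delays(base_ms=base_ms, max_retries=max_retries):
        try:
            return operation()
        except Exception:
            sleep(delay_ms / 1000)  # wait before the next attempt
    return operation()  # last attempt: any exception propagates to the caller
```

With the defaults, `backoff_delays()` yields 100, 200, 400, and 800 ms. A cap keeps the wait bounded when many retries are allowed.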
Exponential Growth
A growth pattern where a quantity's rate of growth is proportional to its current value, leading to doubling in constant time intervals.
Exponential Moving Average (EMA)
Technique that maintains an exponentially weighted average of model weights over training – the EMA model often generalizes better than the final model.
Exponential Smoothing
A family of statistical time series methods that weight recent observations more heavily than older ones, with weights decaying exponentially into the past.
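The simplest member of this family, simple exponential smoothing, can be sketched as below. Initializing with the first observation and the default `alpha` are common choices assumed here, not the only options.

```python
def simple_exponential_smoothing(series, alpha=0.5):
    """s_t = alpha * x_t + (1 - alpha) * s_{t-1}, starting from the first observation."""
    if not series:
        return []
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed
```

A larger `alpha` reacts faster to new observations; a smaller one produces a smoother curve.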
F
F1 Score
The harmonic mean of precision and recall, a single metric that balances both aspects of classification performance.
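A small sketch of the computation from raw counts. Returning 0.0 when precision or recall is undefined is a convention assumed here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is dominated by the smaller value, a model with precision 0.8 and recall 0.5 scores about 0.62 rather than the arithmetic mean of 0.65.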
Facebook Ads
Meta's advertising platform for paid ads on Facebook and Instagram.
Fairness
The goal that AI systems treat all groups equitably and don't cause systematic discrimination.
FAISS
An open-source library from Meta for efficient similarity search and clustering of dense vectors – the standard for local vector indices.
Faithfulness
How accurately an LLM output corresponds to the provided sources and instructions.
FastText
Facebook's open-source library for efficient text classification and word embeddings with sub-word information.
Fault Tolerance
Fault tolerance is a system's ability to continue operating correctly (or degrade safely) when components fail.
Feature Engineering
The process of selecting, transforming, and creating input variables (features) for machine learning models to improve their predictive power.
Feature Extraction
The process of automatically deriving relevant features from raw data.
Feature Importance
Feature importance quantifies how much each input feature contributes to a model's predictions (globally or for a specific prediction).
Feature Store
A central infrastructure for managing, storing, and serving ML features across training and serving.
Federated Learning
A decentralized training approach where models are trained locally on many devices, and only model updates (not raw data) are sent to a central server – training without data centralization.
Feed-Forward Network (FFN)
In the Transformer context: a two-layer MLP applied independently to each position after the attention layer.
Feedback Loop
A system where outputs are fed back to influence future inputs or decisions.
Few-Shot Learning
A technique where the model is given a few examples in the prompt to demonstrate the desired output format or task.
Fiddler AI
An enterprise platform for model performance management that helps companies launch and update AI models faster by automatically detecting issues and improving efficiency.
Fine-Tuning
Adapting a pre-trained model to a specific task by further training it on task-specific data.
Finite State Machine (FSM)
A mathematical model of computation that is in exactly one of a finite number of states and transitions between these states based on inputs.
FinOps
A discipline for managing cloud costs that brings together engineering, finance, and business to make data-driven decisions about cloud spending.
FinOps for AI
FinOps for AI applies financial operations practices (cost visibility, optimization, budgeting, accountability) to AI workloads and AI product usage.
Fireworks AI
High-performance inference platform for generative AI with focus on fast, cost-effective model deployment.
First-Party Data
Data collected directly from own customers and users.
First-Party Data AI
Strategic approach of using proprietary customer data as a differentiation layer on top of generic foundation models.
Flash Attention
An optimized implementation of the attention mechanism that reduces memory access and maximizes GPU efficiency through tiling and kernel fusion.
Flow Matching
Flow matching is a generative modeling technique that learns straight transport paths between noise and data distributions – faster and more stable than classical diffusion.
Flux
A family of image generation models from Black Forest Labs (founded by former Stability AI researchers), partly released with open weights, that competes in quality with Midjourney.
Focal Loss
Modified cross-entropy loss that down-weights easy, well-classified examples so that training focuses on hard-to-classify ones.
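A minimal sketch of the per-example loss, assuming the common form FL(p) = -(1 - p)^γ · log(p), where p is the predicted probability of the true class. The class-balancing weight α is omitted for brevity, and γ = 2 is an assumed default.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Cross-entropy scaled by (1 - p_true)^gamma; gamma = 0 recovers plain cross-entropy."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)
```

With γ = 2, an easy example predicted at 0.9 keeps only about 1% of its cross-entropy loss, while a hard example predicted at 0.1 keeps about 81%.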
Focus Group
A qualitative research method with a small group for in-depth discussions.
Forward Chaining
An inference strategy that starts from known facts and applies rules to derive new facts until the goal is reached.
Forward Pass
Computing the model output by forward propagating through all layers.
Foundation Model
A large model pre-trained on broad data that can be adapted for many downstream tasks.
Fraud Detection
AI-powered detection of fraudulent activities and transactions.
Frequency Capping
Limiting the number of times an ad is shown to a user.
FSDP (Fully Sharded Data Parallel)
PyTorch's native implementation of parameter sharding – distributes model parameters, gradients, and optimizer states across GPUs for memory-efficient training.
Full-Stack
Development that encompasses both frontend and backend of an application.
Function Calling
The ability of LLMs to call external functions in a structured way – the model decides which function with which parameters, execution happens externally.
Function Calling (LLM)
Function Calling enables LLMs to generate structured function calls – the bridge between natural language and APIs, databases, or external tools.
Funnel Analysis
Analysis of conversion rates through the various stages of a customer journey.
Fuzzy Inference System
A fuzzy inference system uses fuzzy logic rules to map inputs to outputs when concepts are imprecise (e.g., "high risk," "medium demand").
Fuzzy Matching
Techniques for finding approximate rather than exact matches in data.
G
G-Eval
An LLM evaluation framework that uses chain-of-thought reasoning and weighted probabilities for more nuanced scoring.
Gaussian Distribution
A symmetric probability distribution, also known as normal distribution.
Gaussian Mixture Model (GMM)
A probabilistic model representing data as a mixture of Gaussian distributions.
GDPR
The EU General Data Protection Regulation (since 2018), establishing uniform rules for processing personal data by companies and granting individuals comprehensive rights.
GDPR AI
The application of GDPR principles to AI systems, especially in automated decision-making and profiling.
GELU (Gaussian Error Linear Unit)
A smooth activation function that weights each input by the standard normal cumulative distribution function evaluated at that input, GELU(x) = x · Φ(x) – standard in BERT, GPT-2, and many Transformers.
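The exact form can be sketched with the error function from the standard library. Many frameworks also offer a faster tanh-based approximation, which is not shown here.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF at x."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
```

Large positive inputs pass through almost unchanged, large negative inputs are pushed toward zero, and values near zero are scaled smoothly rather than cut off as in ReLU.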
Gemini
Google's multimodal AI model – natively built for text, image, audio, video, and code, not retrofitted together.
Gemini 3.1 Pro
Google's 2026 flagship LLM with natively multimodal architecture and 2M-token context.
Gemma 4
Open-weight model family by Google for on-device and edge inference, ranging from 2B to 27B parameters.
Generalization
A model's ability to perform well on new, unseen data.
Generative Adversarial Network (GAN)
Architecture with two competing networks for generating realistic data.
Generative AI
AI models that create new content – text, images, audio, code, or structured data.
Generative Engine Optimization (GEO)
Optimization of content for visibility in generative AI search engines like ChatGPT, Perplexity, and Google AI Mode.
GEO (Generative Engine Optimization)
Generative Engine Optimization (GEO) is the strategic optimization of content, brand, and data structure for generative AI search engines like ChatGPT, Perplexity, Google AI Overviews, and Claude – with the goal of being both cited and actively used as an answer source.
Geo-Targeting
Delivering content or ads based on user location.
GGUF (GPT-Generated Unified Format)
A file format for quantized LLM weights developed by the llama.cpp project that enables efficient inference on CPUs and consumer GPUs.
GitHub Copilot
An AI coding assistant from GitHub/Microsoft that provides real-time code suggestions directly in the editor based on OpenAI models.
GloVe
GloVe (Global Vectors for Word Representation) is a word embedding method that uses global co-occurrence statistics of a text corpus to generate semantic word vectors.
Google Ads
Google Ads is Google's advertising platform for search, display, video, and app campaigns, using auction mechanisms to deliver ads.
Google AI Overviews
Google's AI-generated summaries at the top of search results – synthesized from multiple sources.
Google Analytics
A web analytics service by Google for measuring and analyzing website traffic.
Google Colab
Google Colab (Colaboratory) is a free, cloud-based Jupyter notebook environment with GPU/TPU access for machine learning and data analysis.
Google DeepMind
Google's merged AI research division, formed from DeepMind and Google Brain, responsible for Gemini and groundbreaking AI research.
Google Flow
Google's AI-powered creative platform for image generation and editing, using Nano Banana 2 as its default model.
Google Search Console
Free Google tool for monitoring and optimizing search presence.
Google Vertex AI
Google's unified ML platform on Google Cloud for training, deploying, and managing ML models with AutoML and custom training.
Governance
Governance is the set of roles, rules, processes, and controls that ensure a system is used responsibly and predictably—aligned with risk, compliance, and business objectives.
GPQA (Graduate-Level Google-Proof Q&A)
A benchmark with 448 expert-written multiple-choice questions from physics, biology, and chemistry, so difficult that skilled non-experts reach only about 34% accuracy even with unrestricted web access.
GPQA Diamond
High-difficulty science benchmark with PhD-level questions in biology, physics, and chemistry.
GPT (Generative Pre-trained Transformer)
A family of large language models by OpenAI based on the Transformer architecture.
GPT Orchestration
Architectural approach connecting multiple specialized GPTs/LLMs with routing logic into complex workflows.
GPT-4
OpenAI's multimodal language model, released in 2023, that can process text, images, and code and long served as the benchmark for LLM performance.
GPT-4V (Vision)
OpenAI's GPT-4 extension with image understanding – the breakthrough that taught ChatGPT to "see".
GPT-5
OpenAI's language model generation released in 2025, combining multimodal processing, enhanced reasoning, and native tool use in one model.
GPT-5.4
OpenAI's 2026 flagship LLM with thinking mode, multimodal processing, and agent-native architecture.
GPU (Graphics Processing Unit)
Specialized processor for parallel computations, ideal for AI training.
GQA (Grouped-Query Attention)
An attention variant where each group of query heads shares a single key-value head, reducing KV-cache size and memory consumption.
Grad-CAM (Gradient-weighted Class Activation Mapping)
XAI method that generates heatmaps showing which image regions a CNN considers most important for its decision.
Gradient Accumulation
Gradient accumulation sums gradients over multiple mini-batches before an optimization step – simulates larger batch sizes without more GPU memory.
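To illustrate why this works, the sketch below uses a toy one-parameter model y = w · x with mean squared error. It checks that averaging the gradients of equal-sized micro-batches reproduces the full-batch gradient. The model, data, and function names are illustrative assumptions; real training loops do the same with framework tensors.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Accumulate scaled micro-batch gradients before a single optimizer step."""
    micro_batches = [data[i:i + micro_batch_size]
                     for i in range(0, len(data), micro_batch_size)]
    total = 0.0
    for micro_batch in micro_batches:
        # Scale each contribution so the sum equals the mean over all micro-batches.
        total += grad_mse(w, micro_batch) / len(micro_batches)
    return total
```

The equality with the full-batch gradient holds when all micro-batches have the same size; with a ragged last batch, each contribution would need to be weighted by its share of the examples.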
Gradient Centralization (GC)
Simple technique that subtracts the mean from each weight's gradient vector so the gradient has zero mean before it is applied – reported to improve generalization at negligible cost.
Gradient Checkpointing
Gradient checkpointing saves GPU memory by discarding intermediate activations and recomputing them during the backward pass – trades compute for memory.
Gradient Clipping
Gradient clipping limits the norm or value of gradients during training to prevent exploding gradients.
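Clipping by global L2 norm can be sketched on a flat list of gradient values as follows. The function name and the flat-list representation are assumptions; deep learning frameworks provide equivalent utilities that operate on parameter tensors.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)  # already within the limit
    scale = max_norm / norm
    return [g * scale for g in grads]  # direction preserved, length shortened
```

Norm clipping keeps the gradient's direction and only shortens it, whereas value clipping clamps each component independently and can change the direction.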
Gradient Descent
An optimization algorithm that iteratively adjusts parameters in the direction of steepest descent of the loss function.
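A minimal sketch of the update rule x ← x − lr · ∇f(x) for a single scalar parameter. The learning rate, step count, and example function are illustrative assumptions.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


# Example: minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

For this quadratic, each step shrinks the distance to the minimum at x = 3 by a constant factor, so `minimum` ends up very close to 3. A learning rate that is too large would make the iterates diverge instead.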
Gradient Noise
The natural noise in gradient estimates from mini-batch sampling – acts as implicit regularization and helps find better minima.
Graph Attention Network (GAT)
Graph Attention Networks use attention mechanisms during message passing to automatically learn which neighbor nodes are more important.
Graph Classification
The task of assigning an entire graph to a class based on its structure and node properties.
Graph Convolutional Network
A GNN variant that generalizes convolution operations to graphs to learn node representations.
Graph Database
A graph database stores data as nodes (entities) and edges (relationships), optimized for queries over connected structures.
Graph Isomorphism Network
A GNN with maximum discriminative power among message-passing architectures, theoretically grounded by the Weisfeiler-Leman test.
Graph Neural Network
A class of neural networks that operate directly on graph structures, learning node, edge, and graph-level properties.
Graph Search
Graph search is the process of exploring a graph to find a target node, a path, or an optimal solution under a defined objective (e.g., shortest path, lowest cost).
Graph Transformer
Graph Transformers combine Transformer architectures with graph structures, applying self-attention directly on graph nodes.
Graph Traversal
Graph traversal is systematically visiting nodes and edges in a graph (e.g., using BFS or DFS) to explore structure or find targets.
GraphSAGE
An inductive GNN framework that learns scalable node representations by sampling and aggregating neighborhoods.
Great Expectations
Open-source framework for data validation, documentation, and profiling with a declarative expectation system.
Greedy Algorithm
An algorithm that makes the locally optimal choice at each step.
Greedy Best-First Search
Greedy Best-First Search expands the node that appears closest to the goal using only a heuristic score h(n), ignoring the cost accumulated so far.
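A minimal sketch on an adjacency-list graph, assuming a precomputed heuristic table `h`. The graph representation and names are illustrative. Note that the priority queue is ordered by h(n) alone, which is what distinguishes this from A*.

```python
import heapq


def greedy_best_first(graph, h, start, goal):
    """Return a path from start to goal, always expanding the lowest h(n) first."""
    frontier = [(h[start], start)]
    came_from = {start: None}
    while frontier:
        _, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in came_from:
                came_from[neighbor] = node
                heapq.heappush(frontier, (h[neighbor], neighbor))
    return None  # goal not reachable
```

Because accumulated path cost is ignored, the returned path is not guaranteed to be the shortest one.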
Greedy Decoding
A decoding strategy that always selects the token with the highest probability – deterministic, but often repetitive.
Grid Search
Hyperparameter tuning method that systematically tries all combinations of a predefined parameter space.
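The exhaustive enumeration can be sketched with `itertools.product`. The `score_fn` callback stands in for training and validating a model and is a hypothetical placeholder.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best (params, score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # e.g. validation accuracy; higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The number of evaluations is the product of the list lengths, so the cost grows quickly as more hyperparameters are added.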
Griffin (Google)
Google's hybrid architecture combining linear recurrences (gated RNN) with local attention, productionized in RecurrentGemma.
Grok
xAI's LLM with real-time access to X (Twitter), known for humorous, uncensored style and current information.
Groq
AI inference platform with proprietary LPU hardware (Language Processing Unit) enabling extremely fast token generation.
Ground Truth
The actual, correct data or labels used as reference for model training and evaluation.
Grounding
Techniques for anchoring LLM outputs in verifiable sources – the model explicitly references documents, data, or facts rather than generating freely.
Group Normalization
Group Normalization divides channels into groups and normalizes within each group – works batch-independently and is ideal for small batch sizes.
Growth Hacking
Experimental marketing strategies focused on rapid, cost-effective growth.
GRPO (Group Relative Policy Optimization)
GRPO is an RL alignment method that works without a separate reward model – instead, groups of responses are evaluated relative to each other.
GRU (Gated Recurrent Unit)
A simplified RNN architecture with gates to control information flow.
GRU (Gated Recurrent Unit)
GRU is a simplified RNN architecture with update and reset gates – fewer parameters than LSTM with comparable performance.
GSM8K
A benchmark with 8,500 grade-school math problems that require multi-step reasoning.
Guardrails
Mechanisms and systems that monitor, filter, and correct AI outputs to ensure they stay within defined boundaries for safety, ethics, and brand guidelines.
Guardrails (AI)
Mechanisms for constraining and validating AI outputs – prevents toxic, incorrect, or off-brand content and uncontrolled agent actions.
Guidance Scale
Guidance scale is a parameter (commonly in classifier-free guidance) that controls how strongly a diffusion model follows the text prompt versus generating more diverse outputs.
H
Hallucination (AI)
The phenomenon where AI models generate plausible-sounding but factually incorrect or fabricated information that is not supported by their training data or the provided sources.
Hallucination Detection
Methods and tools for detecting "hallucinations" – false or fabricated information that LLMs present as facts with high confidence.
Hallucination Rate
The percentage of AI-generated outputs containing information not supported by facts or sources.
Hardware Security Module (HSM)
An HSM is a tamper-resistant hardware device that securely generates, stores, and uses cryptographic keys.
Hash Function
A function that maps input data of arbitrary size to a fixed-size output value; a good hash function makes collisions rare.
Hash Table
A hash table maps keys to values using a hash function, enabling average-case O(1) lookups, inserts, and deletes.
Headless CMS
A content management system that delivers content via APIs without a fixed frontend.
Heatmap
A visual representation of data where values are encoded by color intensity.
HellaSwag
A benchmark for common-sense reasoning where LLMs must choose the most plausible continuation of a scenario.
HELM (Holistic Evaluation of Language Models)
A comprehensive evaluation framework from Stanford that assesses LLMs on dozens of dimensions like accuracy, fairness, robustness, and efficiency simultaneously.
Hero Section
The prominent visual area at the top of a webpage featuring the main message and call-to-action.
Heterogeneous Graph
A graph with different types of nodes and/or edges, modeling various entity types and relationships.
Heuristic
A heuristic is a practical scoring rule or estimate that guides search or decision-making toward promising options without guaranteeing optimality.
Heuristic Search
Heuristic search is a family of search algorithms that use a heuristic (a guiding estimate) to explore a problem space more efficiently than uninformed search.
High Availability
A system design approach that ensures continuous operation and minimal downtime, typically through redundancy and automatic failover.
High-Level Representation
A high‑level representation abstracts raw data into more meaningful structures (symbols, concepts, latent variables, or summaries).
Hit Rate
Measures the proportion of queries for which at least one relevant result appears in the top-k results – often reported as Hit@k.
HNSW
Hierarchical Navigable Small World – a graph-based algorithm for efficient approximate nearest neighbor search.
HNSW Index
HNSW (Hierarchical Navigable Small World) is an approximate nearest neighbor (ANN) indexing method that uses layered graph structures to enable fast similarity search in high-dimensional vector spaces.
Hold-Out Validation
Simplest evaluation method: the dataset is split once into a training set and a test set (e.g., 80/20).
Homomorphic Encryption
A cryptographic method that enables computations directly on encrypted data without decrypting it first.
Horizontal Scaling
Increasing capacity by adding more machines rather than upgrading individual systems.
HTTPS (Hypertext Transfer Protocol Secure)
HTTPS is HTTP over TLS, providing encrypted transport, integrity, and server authentication for web communication.
HuBERT
HuBERT (Hidden-Unit BERT) is a self-supervised speech model from Meta that learns high-quality speech representations by predicting discretized audio clusters.
Hugging Face
The leading open-source platform for machine learning, functioning as the "GitHub for AI" and hosting over 500,000 models.
Hugging Face Tokenizers
High-performance Rust-based tokenizer library by Hugging Face with BPE, WordPiece, and Unigram support.
Human Evaluation
The evaluation of AI outputs by human annotators – the gold standard for quality measurement, but expensive and slow.
Human-in-the-Loop
Design principle where humans are involved at critical points in automated AI workflows to validate, correct, or approve decisions.
HumanEval
A benchmark for code generation with 164 Python programming tasks, evaluated by Pass@k (code must pass tests).
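Pass@k is commonly computed with an unbiased estimator: generate n samples per task, count the c samples that pass the tests, and estimate the probability that at least one of k randomly drawn samples passes. A sketch of that estimator, with an assumed function name:

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k as 1 - C(n - c, k) / C(n, k) for n samples with c correct."""
    if n - c < k:
        return 1.0  # every draw of k samples must include a correct one
    return 1.0 - comb(n - c, k) / comb(n, k)
```

The benchmark score is the average of this value over all tasks.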
Hybrid AI System
A hybrid AI system combines multiple AI paradigms—typically symbolic/rule-based methods with statistical/ML models (including LLMs).
Hybrid Recommender System
A recommendation system combining multiple approaches (collaborative filtering, content-based, knowledge-based) for better recommendation quality.
Hybrid Search
A search method that combines lexical search (BM25/keyword) with semantic search (embeddings) to leverage the strengths of both approaches.
Hyena
A subquadratic attention replacement based on long convolutions and data-controlled gates, scaling O(N log N) instead of O(N²).
Hyper-Personalization
The next level of personalization: AI uses real-time data and context for ultra-individual experiences at every moment.
Hyperparameter
Configuration settings chosen before training that influence how a model learns.
Hyperparameter Optimization
The systematic process of finding the best hyperparameter settings for an ML model.
Hypothesis Generation
Hypothesis generation is producing candidate explanations (or candidate solutions) that could plausibly account for observed evidence.
Hypothesis Testing
Hypothesis testing is a class of statistical procedures used to evaluate whether a claim about a population (alternative hypothesis), based on sample data, is statistically defensible compared with a default assumption (null hypothesis).
I
Ideal Customer Profile
A detailed description of the ideal customer for a product or service.
Identity
Identity is the representation of a principal (user, service, device) that can be authenticated and authorized in a system.
Identity and Access Management (IAM)
IAM is the set of processes and systems that manage identities and control their access to resources (authentication + authorization + governance).
Identity-Preference Optimization
An alignment method that extends DPO for more stable training.
Ideogram
A text-to-image model known for accurate text rendering in generated images.
IFEval (Instruction Following Evaluation)
A benchmark that tests how well LLMs follow explicit format instructions (e.g., "Answer in exactly 3 paragraphs", "Start each sentence with a capital letter").
Image Captioning
Automatic generation of text descriptions for images.
Image Classification
Assigning an entire image to one or more predefined categories using a machine learning model.
Image Generation
Image generation is the automatic creation of images by AI models based on text prompts, other images, or other inputs.
Image Segmentation
Dividing an image into meaningful regions or objects at the pixel level.
Image Understanding
AI's ability to not just recognize objects but understand the semantic context and meaning of images.
Image-to-Image
Models that transform an input image into a modified or transformed output image.
Image-to-Image (img2img)
Image-to-image transforms an input image based on a text prompt and a denoise strength parameter – from subtle changes to complete redesign.
Image-to-Text
AI generation of natural language descriptions for images – from simple captions to detailed analyses.
Image-to-Video
AI technology that transforms static images into moving videos by adding realistic animation, camera movement, and scene development.
ImageBind
Meta's multimodal embedding model that unifies six modalities (images/video, text, audio, depth, thermal, and IMU motion data) in a shared vector space.
Imitation Learning
An ML approach where an agent learns by observing and imitating expert behavior.
Implicit Feedback
User signals derived from behavior (clicks, dwell time, purchases) rather than explicit ratings.
Impression
Single display of an ad or piece of content.
In-Context Learning
The ability of LLMs to learn from the context of the prompt without changing model weights – the foundation of modern prompting techniques.
Incident Response
Structured processes and procedures for detecting, analyzing, containing, and remediating security incidents or system outages.
Incrementality
The causal, additional effect of a marketing action beyond what would have happened anyway.
Indexing (SEO)
The process by which search engines discover, crawl, and add web pages to their database.
Inductive Reasoning
A form of logical inference where general rules or patterns are derived from specific observations—the conclusion is probable but not guaranteed.
Inference
The process of applying a trained AI model to new inputs to generate predictions or outputs.
Inference Engine
The core component of an expert system that applies logical rules to a knowledge base to derive new facts or make decisions.
Inference Optimization
The collection of all techniques for accelerating and improving efficiency of LLM inference, including quantization, batching, caching, and hardware optimization.
Inference-Time Compute
A technique where AI models use additional compute time during response generation (inference) to achieve better results through longer "thinking."
Influencer Marketing
Marketing through collaboration with individuals who have an engaged follower base.
Information Extraction
Automatically extracting structured information (entities, relations, facts) from unstructured text.
Information Hiding
A software design principle that hides internal implementation details of a module from other parts of the system to localize changes.
Information Retrieval
Finding relevant documents or information from a large collection.
Innovation
The introduction of new ideas, methods, products, or processes that create value and improve or replace existing solutions.
Inpainting
Filling in missing or masked regions of an image with plausible content.
Insights
Insights are meaningful interpretations of data that reduce uncertainty and enable better decisions (descriptive, diagnostic, predictive, or prescriptive).
Instance Normalization
Instance Normalization normalizes each feature map (channel) of each sample individually – standard in style transfer and image generation.
Instruction Tuning
A fine-tuning method where models are trained on (instruction, response) pairs to follow natural language instructions – the step that turns base models into helpful assistants.
Instructor Embedding
An embedding model that uses task-specific instructions in the prompt to optimize embeddings for different tasks.
Instrumental Variable (IV)
A variable that influences the treatment variable but affects the outcome only through the treatment – not directly. Enables causal estimates despite confounding.
Integrated Gradients
XAI method that computes feature attributions by integrating gradients along a path from a baseline to the actual input.
Integration Testing
Tests that verify the interaction between multiple components or systems.
Intelligent Tutoring System
An Intelligent Tutoring System (ITS) is an AI-driven learning system that personalizes instruction, feedback, and practice to a learner's needs.
Intent Classification
Determining the intention or goal behind a user query.
Intent Recognition
AI capability to recognize the intent behind a user utterance.
Inter-Annotator Agreement (IAA)
A metric for measuring the agreement between different annotators when evaluating the same data.
Interface
An Interface defines a contract between system components – the methods, properties, or protocols through which they communicate without exposing internal details.
Interpretability
The degree to which humans can understand how a model arrives at its decisions.
Interpretable Machine Learning
ML models that are inherently understandable – their decision logic can be directly inspected without additional explanation methods.
Inverse Reinforcement Learning (IRL)
IRL learns the reward function from observed expert behavior – instead of specifying a reward function, it is inferred from demonstrations.
iOS 27 Siri
The Siri generation deeply integrated with ChatGPT in iOS 27, acting as a personal on-device agent.
IoU (Intersection over Union)
A metric measuring the overlap between a predicted and ground truth region, calculated as intersection divided by union.
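As a minimal sketch of the calculation, assuming axis-aligned boxes given as (x1, y1, x2, y2) tuples (the function name and box format are illustrative choices, not part of the definition):

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
score = iou((0, 0, 2, 2), (1, 1, 3, 3))
```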
IP-Adapter
IP-Adapter enables image prompts for diffusion models – a reference image controls style, composition, or face identity of the generation.
Iterative Deepening
Iterative deepening is a search strategy that repeatedly runs depth-limited search with increasing depth limits until it finds a solution or exhausts a budget.
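The strategy can be sketched roughly as follows, assuming the graph is a plain dict mapping each node to a list of neighbors and the budget is a maximum depth (both are illustrative assumptions):

```python
def depth_limited_search(graph, node, goal, limit, path):
    """Depth-first search that gives up once `limit` edges have been used."""
    if node == goal:
        return path
    if limit == 0:
        return None
    for neighbor in graph.get(node, []):
        if neighbor in path:  # skip nodes already on the current path (avoids cycles)
            continue
        result = depth_limited_search(graph, neighbor, goal, limit - 1, path + [neighbor])
        if result is not None:
            return result
    return None


def iterative_deepening_search(graph, start, goal, max_depth=10):
    """Re-run depth-limited search with limits 0, 1, 2, ... up to max_depth."""
    for limit in range(max_depth + 1):
        result = depth_limited_search(graph, start, goal, limit, [start])
        if result is not None:
            return result
    return None  # budget exhausted without finding the goal


graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"], "E": ["F"], "F": []}
path = iterative_deepening_search(graph, "A", "F")
```

Because the limit grows one step at a time, the first path found uses the fewest edges possible, at the cost of revisiting shallow nodes on every pass.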
Iterative Prompting
A prompting approach that refines results through multiple successive prompts.
J
Jaccard Similarity
A similarity measure between two sets, defined as the size of the intersection divided by the size of the union.
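A short illustration of the formula; treating two empty sets as fully similar is a convention assumed here, not something the definition prescribes:

```python
def jaccard_similarity(a, b):
    """|A ∩ B| / |A ∪ B| for two iterables treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


# {2, 3} shared out of {1, 2, 3, 4} in total.
similarity = jaccard_similarity({1, 2, 3}, {2, 3, 4})
```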
Jailbreak
Techniques that bypass LLM safety measures to produce unwanted or harmful outputs.
Jailbreaking
Techniques aimed at bypassing safety measures and ethical restrictions of AI models.
Jamba
AI21 Labs' hybrid architecture combining Transformer attention with Mamba SSM layers and MoE for efficient long contexts.
JAX
JAX is Google's high-performance framework for numerical computing and machine learning that combines NumPy syntax with automatic differentiation and GPU/TPU acceleration.
Jevons Paradox
The Jevons Paradox states that technological progress increasing the efficiency of a resource often leads to higher, not lower, overall consumption of that resource – because falling costs disproportionately increase demand.
JIT Compilation
Just-In-Time compilation translates code to machine code at runtime for better performance.
Jitter
Jitter adds randomness to retry delays so many clients don't retry at the same time.
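One common way to do this is exponential backoff with "full jitter", sketched below. The base delay, the cap, and the function name are illustrative assumptions:

```python
import random


def backoff_with_jitter(attempt, base=0.5, cap=30.0, rng=random):
    """Delay in seconds before retry number `attempt` (0-based), using full jitter."""
    # Exponential backoff, capped so delays do not grow without bound.
    exp_delay = min(cap, base * (2 ** attempt))
    # Full jitter: pick a uniformly random delay between 0 and the backoff value.
    return rng.uniform(0.0, exp_delay)


delays = [backoff_with_jitter(attempt) for attempt in range(5)]
```

Since each client draws its own random delay, retries that would otherwise line up at the same instant are spread across the interval.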
Job Scheduling
Planning and executing tasks at specific times or based on events.
Joint Distribution
The probability distribution describing the probability of combinations of multiple random variables.
Journey Mapping
Visualizing all touchpoints and experiences a customer has with a brand.
JSON Mode
A model output mode that constrains responses to syntactically valid JSON; on its own it does not guarantee that the output matches a particular schema.
JSON Schema
A vocabulary for annotating and validating JSON documents.
JSON Web Token
A compact, URL-safe token standard for securely transmitting claims between parties.
JSON-LD
A format for serializing Linked Data using JSON syntax.
Judge LLM
An LLM used to evaluate and rank outputs from other LLMs.
Jupyter Notebook
An interactive computing environment that combines code, visualizations, and text in one document.
K
K-Anonymity
K-anonymity is a privacy property where each record in a dataset is indistinguishable from at least k−1 other records with respect to quasi-identifiers.
K-Armed Bandit
The k-armed bandit problem models choosing among k options to maximize reward while balancing exploration vs exploitation.
K-Fold Cross-Validation
K-fold cross-validation is an evaluation method where data is split into k roughly equal parts (folds); k models are trained, each on k−1 folds and tested on the remaining fold, so every fold serves as the test set exactly once and the k scores are averaged.
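The splitting step of k-fold cross-validation might look like the sketch below. It works on sample indices only, without shuffling or stratification, and the function name is hypothetical:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train_idx, test_idx in k_fold_indices(10, 3):
    pass  # fit a model on train_idx, score it on test_idx, then average the k scores
```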
K-Means Clustering
K-means is an unsupervised algorithm that partitions data into k clusters by minimizing within-cluster distance to cluster centroids.
K-Means++
K-means++ is an initialization method for k-means that chooses starting centroids to improve convergence and cluster quality.
K-Shot Prompting
K-shot prompting provides k examples in the prompt to guide the model's behavior (format, reasoning pattern, tone).
Kafka
Apache Kafka is a distributed event streaming platform used to publish, store, and process event streams at scale.
Kalman Filter
A Kalman filter is an algorithm for estimating the hidden state of a system over time from noisy measurements.
Kaplan-Meier Estimator
The Kaplan–Meier estimator estimates a survival function (probability of "not yet churned" over time), handling censored data.
Kernel (ML)
In ML, a kernel is a function that measures similarity between data points, enabling algorithms to operate in implicit high-dimensional feature spaces.
Kernel Trick
The kernel trick allows algorithms to compute dot products in an implicit higher-dimensional space without explicitly transforming the data.
Key Management
Key management is the lifecycle management of cryptographic keys: generation, storage, access control, rotation, revocation, and auditing.
Key Management Service (KMS)
KMS is a managed service for creating, storing, rotating, and auditing cryptographic keys (often with HSM-backed options).
Key Rotation
Key rotation is the practice of regularly replacing cryptographic keys to reduce exposure if a key is compromised.
Keyword Cannibalization
Keyword cannibalization occurs when multiple pages on a site compete for the same query intent, reducing ranking clarity and performance.
Keyword Difficulty
Keyword difficulty is an estimate of how hard it is to rank for a keyword, typically based on competition and backlink strength.
Keyword Research
Keyword research is identifying and prioritizing the queries people use, then mapping them to content that satisfies intent.
Kling AI
Kuaishou's Chinese text-to-video model that competes with Sora and generates realistic videos up to 2 minutes.
KMS (Key Management Service)
A Key Management Service is a managed system for creating, storing, rotating, and controlling access to cryptographic keys.
KNN (k-Nearest Neighbors)
KNN is a method that predicts outcomes based on the k most similar examples in a dataset.
KNN Search
KNN search retrieves the k closest vectors to a query vector under a distance metric.
Knowledge Base (KB)
A knowledge base is a curated repository of information (articles, FAQs, policies) designed for retrieval and reuse.
Knowledge Cutoff
Knowledge cutoff is the point in time after which a model's training data does not include new information.
Knowledge Distillation
A technique where a smaller, more efficient "student" model is trained to imitate the behavior of a large, complex "teacher" model, so that it achieves similar performance with lower resource consumption.
Knowledge Graph
A structured representation of knowledge as a graph with entities (nodes) and relationships (edges).
Knowledge Graph Embedding
Knowledge Graph Embeddings learn low-dimensional vector representations for entities and relations of a Knowledge Graph.
Knowledge Tracing
Knowledge tracing models a learner's evolving mastery of skills over time using their interactions (answers, attempts, time, hints).
KPI (Key Performance Indicator)
A KPI is a metric selected to measure progress toward a business objective (revenue, pipeline, activation, retention, cost).
KPI Tree
A KPI tree is a structured decomposition of a top-level KPI into contributing drivers and sub-metrics.
KServe
Kubernetes-native model serving framework (formerly KFServing) for standardized, scalable ML inference on Kubernetes.
KTO (Kahneman-Tversky Optimization)
An alignment method that only needs binary feedback (good/bad) instead of pairwise preferences, inspired by Prospect Theory.
Kubeflow
Kubernetes-native open-source platform for deploying, scaling, and managing ML workflows.
Kubernetes (K8s)
Kubernetes is a container orchestration platform for deploying, scaling, and managing containerized applications.
KV Cache (Key-Value Cache)
A caching mechanism that stores the Key and Value tensors of attention layers to avoid redundant computations during autoregressive generation.
L
L1 Regularization (Lasso)
L1 regularization adds a penalty proportional to the absolute value of model weights, encouraging sparsity (many weights become exactly zero).
L2 Regularization (Ridge)
L2 regularization adds a penalty proportional to the square of model weights, encouraging smaller weights without forcing exact zeros.
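The two penalties differ only in how each weight is measured. A minimal sketch, where `lam` is the regularization strength and the function names are illustrative:

```python
def l1_penalty(weights, lam):
    """Lasso penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """Ridge penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


weights = [0.5, -2.0, 0.0]
# Either term is added to the data loss: total_loss = data_loss + penalty.
l1 = l1_penalty(weights, lam=0.1)
l2 = l2_penalty(weights, lam=0.1)
```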
Label Leakage
Label leakage describes the situation in which a machine-learning model's training dataset contains features that carry direct or indirect information about the target variable (the label) — information that simply would not be available at inference time in production.
Label Smoothing
Label smoothing is a training technique that replaces hard labels (0 or 1) with slightly softened targets (e.g., 0.1 and 0.9).
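For multi-class targets, a widely used variant mixes the one-hot vector with a uniform distribution over the K classes. The sketch below assumes that variant; `epsilon` is the smoothing strength and the function name is illustrative:

```python
def smooth_labels(one_hot, epsilon=0.1):
    """Uniform label smoothing: (1 - epsilon) * y + epsilon / K for each class."""
    k = len(one_hot)
    return [(1.0 - epsilon) * y + epsilon / k for y in one_hot]


# The true class keeps most of the mass; the rest is shared across all classes.
targets = smooth_labels([0, 0, 1, 0], epsilon=0.1)
```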
Label Studio
Open-source platform for data annotation and labeling supporting text, images, audio, video, and multi-modal data.
LAMB (Layer-wise Adaptive Moments for Batch Training)
Optimizer for extremely large batch sizes (up to 64K+) that adapts learning rates per layer, enabling stable training with massive parallelization.
Landing Page
Specially designed destination page for marketing campaigns with clear CTA.
Landing Page Optimization (LPO)
Landing page optimization is improving a landing page to increase desired outcomes (signups, demos, purchases).
LangChain
An open-source framework for building LLM applications – provides abstractions for chains, agents, memory, retrieval, and tool integration.
LangGraph
A framework by LangChain for building stateful multi-agent workflows as graphs with nodes (agents) and edges (transitions).
Language Model (LM)
A language model is a model that estimates the probability of sequences of tokens, enabling tasks like prediction, generation, and scoring.
Large Language Model (LLM)
A large neural network trained on vast amounts of text to understand and generate human-like text.
LARS (Layer-wise Adaptive Rate Scaling)
Optimizer that combines SGD with layer-wise learning rate adaptation – enables stable training with large batch sizes for computer vision.
Last-Click Attribution
Last-click attribution assigns 100% of conversion credit to the last touchpoint before conversion.
Late Interaction
A retrieval paradigm where query and document tokens are encoded independently but interact via token-level similarity only at search time.
Latency
The time between request and response in a system.
Latency Budget
A latency budget is an explicit allocation of maximum allowed time for each system component to meet an overall SLA.
Latent Diffusion
Latent diffusion performs the diffusion process in a compressed latent space instead of pixel space – substantially cheaper in compute and memory at comparable quality.
Latent Space
A compressed, lower-dimensional space where a model stores internal representations of data.
Latent Variable
A latent variable is an unobserved variable inferred from observed data, used to explain hidden structure.
Layer
A Layer is an abstract level in a layered system that encapsulates a specific function and communicates with other layers through defined interfaces.
Layer Dropping
A compression technique that removes entire transformer layers from a trained model – the simplest way to make an LLM smaller and faster.
Layer Normalization
Layer normalization is a technique that normalizes activations within a layer to stabilize and speed up training in deep networks.
Lead Generation
Lead generation is the process of identifying and attracting potential customers (leads) who show interest in a company's products or services.
Lead Lifecycle Stages
Lead lifecycle stages are standardized states a lead progresses through with defined entry/exit criteria.
Lead Scoring
Quantifying the likelihood that a lead will become a customer.
Leaky ReLU
A variant of ReLU that lets negative values pass with a small factor (e.g., 0.01) instead of setting them to 0 – mitigates the dying ReLU (dead neuron) problem.
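In scalar form the function is a one-liner. The sketch below uses the slope 0.01 from the example above; the parameter name is an illustrative choice:

```python
def leaky_relu(x, negative_slope=0.01):
    """Pass positive inputs through unchanged; scale negative inputs by a small slope."""
    return x if x > 0 else negative_slope * x


outputs = [leaky_relu(v) for v in (-2.0, 0.0, 3.0)]
```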
Learning Management System
A Learning Management System (LMS) is software for delivering, managing, and tracking training and learning content (courses, assignments, completion, assessments).
Learning Objectives
Learning objectives are clear, measurable statements of what a learner should be able to do after instruction.
Learning Rate
A hyperparameter that determines how much to adjust model weights at each training step.
Learning Rate Range Test
Diagnostic method that exponentially increases the learning rate while observing loss – reveals a suitable LR range in a single short training run.
Learning Rate Schedule
A learning rate schedule changes the learning rate over training (warmup, decay, cosine, step, exponential).
Learning Rate Warmup
Training technique that slowly ramps the learning rate from near zero to the target value in the first steps/epochs.
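A simple way to combine warmup with a decay schedule is linear warmup followed by cosine decay, sketched below. The step counts, the base learning rate, and the function name are illustrative assumptions:

```python
import math


def lr_at_step(step, base_lr=1e-3, warmup_steps=100, total_steps=1000):
    """Linear warmup to base_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_step(s) for s in range(0, 1001, 100)]
```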
Learning Record Store (LRS)
A Learning Record Store (LRS) is a system that stores learning activity data—typically as xAPI statements—and enables reporting and analytics across learning experiences.
Learning to Rank (LTR)
ML approaches for learning optimal ranking functions for search results, recommendations, or feeds.
Least Privilege
Least privilege grants only the minimum permissions needed to perform a task—no more, no longer than necessary.
Lemmatization
Linguistically informed reduction of words to their base form (lemma) considering part of speech and context.
Length Penalty
Length penalty is a decoding adjustment that prevents generation algorithms (especially beam search) from unfairly preferring overly short sequences.
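One simple form of this adjustment divides a candidate's summed log-probability by its length raised to a power `alpha`. This is only one of several variants in use, and the names and default value below are illustrative:

```python
def length_normalized_score(sum_log_prob, length, alpha=0.6):
    """Score a candidate by its summed log-probability divided by length ** alpha."""
    return sum_log_prob / (length ** alpha)


# Raw log-probabilities favor the short candidate (-4.0 > -6.0) ...
short_score = length_normalized_score(-4.0, length=2)
long_score = length_normalized_score(-6.0, length=6)
# ... but after normalization the longer candidate scores higher.
```

With `alpha=0` the score is the raw log-probability; larger values favor longer sequences more strongly.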
Leonardo AI
An AI image generation platform focused on gaming, concept art, and professional creative workflows.
Lexical Search
Lexical search retrieves documents based on exact words/terms (keyword matching), typically using inverted indexes and BM25.
Liability Target
Clearly defined entity (person, role, or organization) liable for an AI agent's decisions or damages.
LiDAR
A remote sensing technology that uses laser pulses to create precise 3D point clouds of the environment – the "3D eye" of autonomous vehicles.
Lifecycle Marketing
Lifecycle marketing is designing messaging and experiences across the customer lifecycle.
Lift
Lift is the incremental change in an outcome attributable to an intervention.
Lift Chart
A lift chart shows how well a model ranks positives by comparing outcomes across scored segments.
LIME (Local Interpretable Model-agnostic Explanations)
LIME (Local Interpretable Model-agnostic Explanations) explains an individual model prediction by fitting a simple, interpretable surrogate model around that specific input.
Linear Attention
Attention variants that reduce the quadratic O(N²) complexity to linear O(N) through kernel approximation or alternative computation order.
Link Equity
Link equity is the SEO value passed through links, influencing how authority and relevance flow across pages.
Link Graph
A link graph is the network of pages (nodes) connected by links (edges), both internally and externally.
Link Prediction
Link Prediction predicts which connections between nodes in a graph are likely to exist or will form.
Linting
Linting is automatically checking code (or structured content) for errors, style violations, and quality issues based on rules.
Lion (Evolved Sign Momentum)
Optimizer discovered by Google Brain through automated program search that applies only the sign of a momentum-based update – simpler and more memory-efficient than Adam, often with comparable results.
Lip Sync AI
AI technology that automatically adjusts lip movements in videos to new audio tracks so spoken words look natural.
LiveCodeBench
Contamination-free coding benchmark that continuously adds new programming tasks from competitions.
Llama
Meta's open-weight LLM family that serves as foundation for thousands of fine-tuned models and has democratized open-source AI.
LLM Evals
Systematic tests that measure quality, safety, and behavior of large language models across defined tasks and metrics.
LLM Observability
LLM observability is collecting and analyzing telemetry that explains LLM system behavior in production.
LLM Routing
LLM routing is selecting which model/workflow to use for a request based on intent, risk, and cost constraints.
LLM Security
The field of security research and practices specifically for Large Language Models and generative AI.
LLM-as-a-Judge
LLM-as-a-judge uses a model to evaluate other model outputs against rubrics like correctness, groundedness, style, and safety.
LLM-as-Judge
An evaluation method where an LLM evaluates the quality of outputs from another (or the same) model.
LLMO (Large Language Model Optimization)
Large Language Model Optimization (LLMO) is the discipline of distributing brand, product and topic knowledge across the web so that large language models correctly understand, cite and reproduce it in answers — in both real-time search and training pipelines.
LLMOps
Practices and tools for developing, deploying, monitoring, and optimizing Large Language Model applications in production.
llms.txt
llms.txt is a Markdown file proposed in 2024 and widely adopted in 2025/26, placed at the root of a site (/llms.txt), that gives LLMs a curated, easily extractable overview of the most important content — analogous to sitemap.xml for search engines, but human-readable and optimized for AI models.
LMSYS
LMSYS (Large Model Systems Organization) is a research organization that operates the famous Chatbot Arena benchmark and enables LLM performance comparisons through human evaluations.
Load Balancing
Load balancing distributes incoming traffic across multiple servers to improve availability, throughput, and latency.
Locality-Sensitive Hashing (LSH)
Locality-Sensitive Hashing (LSH) is a technique that hashes items so that similar items land in the same "bucket" with high probability, enabling fast approximate similarity search.
Log Loss
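A loss function evaluating the quality of predicted probabilities – it penalizes confident but wrong predictions heavily, because the penalty −log(p) grows without bound as the probability p assigned to the true class approaches 0.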
Log-Likelihood
Log-likelihood is the logarithm of the likelihood that a probabilistic model assigns to observed data.
Log-Sum-Exp
Log-sum-exp is a numerical trick for computing log(Σᵢ exp(xᵢ)) stably without overflow/underflow.
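The trick is to subtract the maximum before exponentiating and add it back afterwards. A minimal sketch for a non-empty list of floats (the function name is illustrative):

```python
import math


def log_sum_exp(xs):
    """Stable log(sum(exp(x) for x in xs)) for a non-empty list of floats."""
    m = max(xs)
    if math.isinf(m):
        return m  # all inputs are -inf, or at least one input is +inf
    # After shifting by m the largest exponent is exp(0) = 1, so nothing overflows.
    return m + math.log(sum(math.exp(x - m) for x in xs))


# math.exp(1000.0) alone would overflow; the shifted version does not.
value = log_sum_exp([1000.0, 1000.0])
```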
Logit
A logit is the raw, unnormalized score a model outputs before converting to probabilities (e.g., via softmax).
Logit Bias
Logit bias is a technique to increase or decrease the likelihood of specific tokens during generation by adjusting their logits.
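To illustrate how logits and logit bias relate, the sketch below adds a per-token bias to raw scores before applying softmax. Token ids are plain list indices here, and the helper names are hypothetical rather than any provider's API:

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_logit_bias(logits, bias):
    """Add a bias to selected token indices; `bias` maps index -> additive value."""
    return [z + bias.get(i, 0.0) for i, z in enumerate(logits)]


logits = [2.0, 1.0, 0.1]
base_probs = softmax(logits)
boosted_probs = softmax(apply_logit_bias(logits, {2: 5.0}))      # token 2 becomes more likely
banned_probs = softmax(apply_logit_bias(logits, {0: -100.0}))    # token 0 is effectively suppressed
```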
Long Context
Long context refers to an LLM's ability to accept and use a large number of input tokens in a single request.
Long-Tail Keywords
Long-tail keywords are highly specific, lower-volume queries that often reflect strong intent.
Lookahead Optimizer
Meta-optimizer that maintains two sets of weights: "fast" weights (normal optimizer) and "slow" weights that are periodically interpolated toward the fast ones.
Lookalike Audience
Audience similar to existing customers based on shared characteristics.
LoRA (Low-Rank Adaptation)
An efficient fine-tuning method that trains only small adapter matrices instead of the entire model, drastically reducing memory and training costs.
LoRA Fine-Tuning
An efficient fine-tuning method that only trains small "adapter" matrices instead of all model weights – typically <1% of parameters with comparable performance.
LoRA vs Full Fine-Tuning
A comparison between adapting a model via LoRA adapters versus updating all parameters (full fine-tuning).
Loss Function
A mathematical function that measures how good or bad a model's predictions are.
Loss Landscape
The multi-dimensional surface representing loss as a function of model parameters – the "mountain" that gradient descent descends.
Lottery Ticket Hypothesis
The hypothesis that a dense, randomly initialized neural network contains a small subnetwork ("winning ticket") that, trained in isolation from the same initialization, can match the performance of the full network.
Lovable
An AI platform that generates complete web applications from natural language descriptions – including frontend, backend, and deployment.
LSTM (Long Short-Term Memory)
LSTM is an RNN variant with gate mechanisms (forget, input, output gate) enabling learning of long-term dependencies in sequences.
Luma AI
An AI company specialized in 3D capture and video generation, known for Dream Machine and NeRF technology.
M
Mac mini M4 Pro
Apple's compact desktop with M4 Pro chip and Neural Engine, popular as an affordable on-device AI workstation.
Machine Learning
A subfield of AI where systems learn from data to make predictions or decisions without being explicitly programmed.
Machine Legibility
Machine legibility is the degree to which a website, product catalog or brand can be understood, navigated and used in answers or transactions by machines — especially AI agents and LLMs.
Machine Translation
Automatic translation of text or speech from one natural language to another using an AI system.
Machine Unlearning
Techniques to remove the influence of specific training data from an ML model without retraining the entire model.
Macro Conversion
A macro conversion is a user action that directly maps to a primary business goal (e.g., purchase, demo request, subscription).
MAE (Mean Absolute Error)
The average of absolute differences between predictions and actual values – less sensitive to outliers than squared-error metrics such as MSE.
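A direct translation of the definition, as a small sketch (the function name and the plain-list inputs are illustrative):

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Absolute errors are 1, 0 and 2, so the mean is 1.0.
mae = mean_absolute_error([3, 5, 2], [2, 5, 4])
```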
Mamba
Mamba is a neural network architecture built on selective state space models (SSMs) designed to model long sequences efficiently with linear scaling in sequence length.
Manus AI
An autonomous general-purpose AI agent capable of independently executing complex tasks like research, coding, and data analysis.
MAP (Mean Average Precision)
The average of Average Precision across all queries – considers both precision and ranking position of all relevant documents.
Marginal CPA
Marginal CPA is the cost of additional conversions at the margin—often expressed as ΔCost ÷ ΔConversions between two spend/volume scenarios.
Marginal ROAS (mROAS)
Marginal ROAS estimates the incremental revenue generated by the next unit of ad spend—i.e., "what do we get if we spend $1 more?"
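Both marginal metrics compare two spend scenarios rather than looking at totals. The sketch below uses made-up numbers and hypothetical function names:

```python
def marginal_cpa(cost_a, conv_a, cost_b, conv_b):
    """ΔCost ÷ ΔConversions between scenario A and scenario B."""
    delta_conv = conv_b - conv_a
    if delta_conv == 0:
        raise ValueError("no change in conversions between scenarios")
    return (cost_b - cost_a) / delta_conv


def marginal_roas(spend_a, revenue_a, spend_b, revenue_b):
    """ΔRevenue ÷ ΔSpend between scenario A and scenario B."""
    delta_spend = spend_b - spend_a
    if delta_spend == 0:
        raise ValueError("no change in spend between scenarios")
    return (revenue_b - revenue_a) / delta_spend


# Raising spend from 1000 to 1500 adds 10 conversions and 1000 in revenue.
m_cpa = marginal_cpa(1000, 50, 1500, 60)        # 500 / 10 = 50.0
m_roas = marginal_roas(1000, 4000, 1500, 5000)  # 1000 / 500 = 2.0
```

In this invented example the average CPA at the higher spend level is 1500 ÷ 60 = 25, while the marginal CPA is 50, which shows how averages can hide diminishing returns.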
Market Sentiment
Market sentiment is the overall attitude or mood of market participants toward an asset, brand, or market—often inferred from news, social media, and price/volume signals.
Marketing Agent
Specialized AI agent that autonomously executes marketing tasks like content creation, campaign management, analysis, and reporting.
Marketing Automation
Marketing automation uses software to automate recurring marketing tasks with rules and workflows (e.g., triggered email campaigns, social media posts, lead nurturing, lead routing, segmentation).
Marketing Funnel
Model of the customer journey from awareness to conversion.
Marketing Measurement Framework
A marketing measurement framework is a structured system that aligns marketing goals to KPIs, data sources, and measurement methods (attribution, experiments, MMM) so performance can be evaluated consistently.
Marketing Mix Modeling
Marketing Mix Modeling (MMM) is a statistical approach that estimates how different marketing activities (channels, spend, promotions) contribute to business outcomes (sales, conversions) using aggregated time-series data.
Masked Language Modeling (MLM)
MLM is a training objective where a model predicts masked-out tokens in a text sequence (e.g., replacing words with a special [MASK] token).
Master Data Management (MDM)
Master Data Management (MDM) is an approach to ensure critical enterprise data (e.g., customers, products, locations) is consistent, accurate, and governed across systems—often aiming for a "single source/version of truth."
Mastery Learning
Mastery learning is an instructional approach where learners progress only after demonstrating mastery of a skill or objective, with targeted remediation as needed.
MATH Benchmark
A benchmark with 12,500 competition mathematics problems (from algebra to number theory) that tests advanced mathematical reasoning.
Matrix Factorization
A technique for decomposing a matrix into the product of smaller matrices.
Matryoshka Embedding
An embedding training approach where the first N dimensions of a vector are already usable – enabling flexible compression with little quality loss.
Matryoshka Representation Learning (MRL)
Matryoshka Representation Learning (MRL) is an embedding approach that encodes information at multiple granularities so a single embedding can be truncated to smaller dimensions while remaining useful for downstream tasks.
Max Tokens
An API parameter that limits the maximum number of tokens an LLM can generate in a response.
MBPP (Mostly Basic Python Problems)
A benchmark with 974 simple Python programming tasks that test basic programming abilities of LLMs.
MCP (Model Context Protocol)
The Model Context Protocol (MCP) is an open standard released by Anthropic in late 2024 that standardizes how AI models securely access external data sources, tools, and services in a structured way – a kind of "USB-C for AI".
MCP Protocol
Open protocol by Anthropic that gives LLMs standardized access to tools, data sources, and external services.
MCP Server
Server component that provides an AI model with standardized access to tools, data sources, or APIs via the Model Context Protocol (MCP).
Mechanistic Interpretability
Mechanistic interpretability is the effort to reverse engineer neural networks by identifying internal mechanisms (features, circuits, algorithms) that produce outputs.
Media Mix
A media mix is the blend of communication channels a company uses to reach an audience (often emphasizing paid channels, depending on definition).
Mel Spectrogram
A Mel spectrogram is a visual representation of audio frequencies on the Mel scale – the standard input for modern speech and audio AI models.
Membership Inference Attack
An attack that determines whether a specific data point was included in the training dataset of an ML model.
Memory Augmentation
Techniques for extending the effective context of LLMs beyond the token limit – enables memory of previous conversations, facts, and user preferences.
Memory Bandwidth
Memory bandwidth is the amount of data that can be moved to/from memory per unit time; for GPUs it strongly influences how fast data can be fed into compute.
Mental Model
An internal representation describing how a person believes a system, process, or concept works, based on experience and assumptions.
MER (Media Efficiency Ratio)
MER (often "Media Efficiency Ratio" or "Marketing Efficiency Ratio") is a top-level efficiency metric typically expressed as Total Revenue ÷ Total Marketing/Ad Spend.
Message Match
Message match is the consistency between an ad/email message and the landing page experience the user sees after clicking.
Message Passing
Message Passing is the fundamental computation paradigm of Graph Neural Networks where nodes exchange information with their neighbors.
Message Passing Neural Network
A unifying framework for GNNs where nodes receive messages from neighbors, aggregate them, and update their representations.
Meta AI
The AI research division of Meta (Facebook), known for open-source release of Llama and leading research in multimodality.
Meta-Learning
Meta-learning ("learning to learn") aims to train models or systems that adapt quickly to new tasks with limited data or few examples.
Metadata Filtering (Vector Search)
Metadata filtering restricts vector search results using structured fields (e.g., tenant_id, timestamps, doc_type) in addition to similarity search.
Metaprompt
A metaprompt is a higher-level prompt that defines the rules, structure, and constraints for generating other prompts or for a whole class of outputs.
METEOR
An evaluation metric for machine translation that combines unigram matching with stemming, synonyms, and word order.
Metric Learning
Metric learning trains models to learn a distance function (embedding space) where "similar items are close" and "dissimilar items are far apart."
Micro Conversion
A micro conversion is a smaller action that indicates progress toward a macro conversion (e.g., viewing pricing, downloading a checklist, watching a product video).
Microservices
Architecture style where an application consists of small, independent services.
Midjourney
A leading commercial text-to-image model, known for highly aesthetic, artistic image generation via Discord.
MinHash
MinHash is a technique to efficiently estimate similarity between sets (especially Jaccard similarity), commonly used for near-duplicate detection.
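A minimal sketch of the idea: for each of several seeded hash functions, keep the smallest hash value seen in a set. The fraction of positions where two signatures agree then estimates the Jaccard similarity. Using seeded BLAKE2b hashes as the hash family is an assumption made for this illustration:

```python
import hashlib


def _seeded_hash(seed, item):
    """Deterministic 64-bit hash of `item` under hash function number `seed`."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=128):
    """For each hash function, keep the minimum hash over a non-empty set of items."""
    return [min(_seeded_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions where both sets have the same minimum."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)


set_a = set(range(0, 150))
set_b = set(range(50, 200))  # true Jaccard similarity: 100 / 200 = 0.5
estimate = estimate_jaccard(minhash_signature(set_a, 256), minhash_signature(set_b, 256))
```

The signatures have a fixed size regardless of how large the sets are, which is what makes the method practical for near-duplicate detection at scale.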
Minimum Description Length
Minimum Description Length (MDL) is a principle for model selection that prefers the model that yields the shortest total description of the model plus the data encoded under it.
Minimum Detectable Effect (MDE)
MDE is the smallest true effect size an experiment can reliably detect given traffic, variance, significance level, and power.
Mish Activation Function
Mish = x · tanh(softplus(x)) – a smooth, self-regularizing activation function used in YOLOv4 and some CNNs.
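The formula can be evaluated directly. The sketch below uses a numerically stable form of softplus, which is an implementation choice rather than part of the definition:

```python
import math


def softplus(x):
    """log(1 + exp(x)), written so that large |x| does not overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


values = [mish(v) for v in (-2.0, 0.0, 1.0, 5.0)]
```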
Mistral AI
A French AI startup developing open-weight models, considered the European alternative to US AI companies.
Mixed Precision Training
Mixed precision training uses a mix of lower-precision (e.g., FP16/BF16) and single-precision (FP32) representations to speed up training while preserving accuracy.
Mixtral
Mistral AI's Mixture-of-Experts model that reaches roughly GPT-3.5-level performance efficiently by activating only a portion of its parameters per token.
Mixture of Experts
An AI architecture where a large model consists of specialized "expert" subnetworks, of which only the most relevant ones are activated for each input token – enabling efficiency with high performance.
Mixture-of-Recursions (MoR)
Architecture that lets the model decide per token how often a layer block is recursively traversed – efficient depth instead of fixed layer count.
Mixup
Data augmentation technique that creates new training examples by linearly interpolating between two existing examples.
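A sketch of the interpolation on plain Python lists. Drawing the mixing weight from a Beta(alpha, alpha) distribution is the usual choice in the mixup literature; the default `alpha` and the function name are illustrative assumptions:

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two (features, one-hot label) pairs with a random weight lam."""
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


mixed_x, mixed_y, lam = mixup([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0])
```

The label is interpolated with the same weight as the features, so the mixed target remains a valid probability distribution.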
ML Pipeline
Automated sequence of steps for data processing, feature engineering, training, evaluation, and deployment of an ML model.
MLCommons
Industry consortium developing open benchmarks (MLPerf), datasets, and best practices for ML performance.
MLflow
Open-source platform for the entire ML lifecycle: experiment tracking, model registry, deployment, and evaluation.
MLOps
MLOps is the practice of operationalizing machine learning—deploying, monitoring, versioning, and governing ML systems reliably.
MMLU (Massive Multitask Language Understanding)
A multiple-choice benchmark with 57 subject areas (STEM, humanities, social sciences) for measuring LLM world knowledge.
MMLU-Pro
Extended MMLU benchmark with more challenging multiple-choice questions and reduced guessing advantage.
MMR (Maximal Marginal Relevance)
MMR is a retrieval diversification method that selects items that are both relevant to the query and non-redundant with each other.
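A greedy selection loop illustrates the trade-off. It assumes precomputed similarity scores and a weight `lambda_` between relevance and redundancy; the names and the toy scores are made up for this example:

```python
def mmr(query_sim, doc_sim, k, lambda_=0.7):
    """Greedily pick k documents, balancing relevance against redundancy.

    query_sim[i] is the similarity of document i to the query;
    doc_sim[i][j] is the similarity between documents i and j.
    """
    selected = []
    candidates = list(range(len(query_sim)))
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy: highest similarity to anything already selected.
            redundancy = max((doc_sim[i][j] for j in selected), default=0.0)
            return lambda_ * query_sim[i] - (1.0 - lambda_) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


query_sim = [0.9, 0.85, 0.5]
doc_sim = [
    [1.0, 0.95, 0.1],  # documents 0 and 1 are near-duplicates
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
diverse = mmr(query_sim, doc_sim, k=2, lambda_=0.5)        # skips the near-duplicate
relevance_only = mmr(query_sim, doc_sim, k=2, lambda_=1.0)  # pure relevance ranking
```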
Moat
A moat is a durable competitive advantage that protects a business from competitors over time.
Modal
Cloud platform for serverless GPU computing that deploys ML inference and batch jobs as Python functions.
Mode Collapse
Mode collapse occurs when a generative model produces only a limited diversity of outputs, ignoring large parts of the data distribution.
Model Card
A model card is a standardized documentation artifact describing a model's intended use, limitations, training data context, evaluation results, and ethical/safety considerations.
Model Cards
Standardized documentation for ML models describing training, capabilities, limitations, bias analyses, and recommended use cases.
Model Collapse
Model collapse is a degradation phenomenon where training on synthetic/model-generated data (especially repeatedly) can reduce diversity and quality, causing the model to "collapse" toward narrower outputs.
Model Compression
Techniques for reducing the size of ML models while maintaining performance.
Model Context Protocol (MCP)
An open standard by Anthropic that defines a unified interface between AI models and external data sources, tools, and services.
Model Distillation
A technique where a large "teacher" model transfers its knowledge to a smaller, more efficient "student" model.
Model Drift
Model drift is performance degradation over time due to changes in data distributions, user behavior, environment, or upstream systems.
Model Extraction
Attacks that attempt to reconstruct or clone a proprietary ML model through systematic queries.
Model Extraction Attack
An attack where an adversary creates a functionally equivalent copy of an ML model through systematic API queries.
Model Governance
Processes and controls for the entire lifecycle of ML models: development, validation, deployment, monitoring, and retirement.
Model Merging
Techniques for combining multiple trained models into a single model that unifies the strengths of all source models – without additional training.
Model Monitoring
Continuous monitoring of ML models in production for performance degradation, drift, fairness, and anomalies.
Model Registry
Central version management for trained ML models.
Model Routing
Automatic routing of AI requests to the optimal model based on task type, cost, latency, and quality requirements.
Model Serving
The infrastructure and processes for deploying trained ML models as API endpoints for real-time or batch inference in production environments.
Model Simplification
Model simplification reduces complexity to improve interpretability, efficiency, robustness, or deployment feasibility.
Model Spec
A model spec is a written specification describing how a model should behave—including intended behavior, constraints, and principles—often used to guide training, alignment, and deployment policy.
Model Versioning
Systematic management of different versions of trained ML models including metadata, artifacts, and lineage.
Model Watermarking
Techniques for embedding invisible markers in ML models or their outputs to prove authorship or detect unauthorized use.
Model-Based Learning
Model-based learning learns a model of the environment (dynamics) and uses it for planning, prediction, or control.
Model-Based Reinforcement Learning
Model-based RL learns a model of the environment (dynamics model) and plans with this model instead of only learning from direct experience.
Moderation
Moderation is the detection, review, and enforcement process that applies content policy to user inputs, generated outputs, and platform behavior.
Modular Content
Content strategy that decomposes assets into reusable, AI-composable building blocks instead of producing monolithic pieces.
Modular Design
Modular design structures systems as cohesive modules with clear responsibilities and stable interfaces, minimizing coupling.
Modularity
A design principle that divides systems into independent, interchangeable components (modules) that communicate through defined interfaces.
Momentum
Acceleration technique for gradient descent that accumulates past gradient directions to converge faster and escape local minima.
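A minimal sketch of classical momentum on the one-dimensional function f(x) = x^2. The learning rate, momentum coefficient, and step count are hypothetical values chosen for illustration.

```python
def sgd_momentum(grad, x, lr=0.1, beta=0.9, steps=300):
    """Minimize a 1-D function, given its gradient, using classical momentum."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity + grad(x)  # accumulate past gradient directions
        x = x - lr * velocity                 # step along the accumulated direction
    return x


# The gradient of f(x) = x^2 is 2x; the minimum is at x = 0.
x_min = sgd_momentum(lambda x: 2 * x, x=5.0)
```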
Monte Carlo Dropout (MC Dropout)
Monte Carlo Dropout estimates model uncertainty by keeping dropout active at inference time and performing multiple stochastic forward passes, then aggregating results.
Monte Carlo Tree Search (MCTS)
MCTS is a planning algorithm that builds a decision tree through random simulations and identifies the most promising actions.
Moore's Law
The observation that the number of transistors on integrated circuits doubles approximately every two years, leading to exponential growth in computing power.
MQL (Marketing Qualified Lead)
An MQL is a lead that meets predefined criteria indicating higher likelihood to become a sales opportunity.
MRR (Mean Reciprocal Rank)
The average of the reciprocal ranks of the first relevant result across all queries – MRR = 1/n × Σ(1/rank_i).
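The formula can be sketched directly. Counting a query with no relevant result as 0 is a common convention and is assumed here; the function name is illustrative.

```python
def mean_reciprocal_rank(first_relevant_ranks):
    """Ranks are 1-based; None means no relevant result was returned."""
    if not first_relevant_ranks:
        return 0.0
    total = sum(1.0 / r for r in first_relevant_ranks if r is not None)
    return total / len(first_relevant_ranks)


# Three queries whose first relevant hit sits at ranks 1, 2, and 4.
score = mean_reciprocal_rank([1, 2, 4])  # (1 + 1/2 + 1/4) / 3
```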
MSE (Mean Squared Error)
The average of squared differences between predicted and actual values – standard loss for regression.
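As a small worked example, assuming predictions and targets are plain lists of equal length (the function name is illustrative):

```python
def mse(predicted, actual):
    """Mean of squared differences between predictions and targets."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Errors are 2 and 0, so the squared errors are 4 and 0 and the mean is 2.
loss = mse([3.0, 5.0], [1.0, 5.0])
```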
MT-Bench
A multi-turn conversation benchmark for LLMs with 80 questions across 8 categories, evaluated by GPT-4-as-Judge.
MTEB
The Massive Text Embedding Benchmark – a comprehensive benchmark for text embedding models across 56+ datasets in 8 tasks.
mTLS (Mutual TLS)
mTLS is a TLS setup where both client and server authenticate each other using certificates (two-way authentication).
Multi-Agent System
System of multiple specialized AI agents that collaborate to solve complex tasks that a single agent could not handle.
Multi-Agent Systems
Systems of multiple specialized AI agents working together – each agent has a role (researcher, writer, critic) and they communicate to solve complex tasks.
Multi-Armed Bandit
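A sequential decision-making problem, and the family of algorithms that solve it, in which an agent repeatedly chooses among options with unknown rewards while balancing exploration and exploitation.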
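One simple strategy for this trade-off is epsilon-greedy, sketched below. It is only one of several bandit strategies; the Bernoulli reward probabilities, the value of `epsilon`, and the fixed seed are assumptions for illustration.

```python
import random


def epsilon_greedy(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy agent on Bernoulli arms.

    Returns (pull counts per arm, estimated mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)  # exploit
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean
    return counts, values


counts, values = epsilon_greedy([0.2, 0.5, 0.8])
best_arm = max(range(len(values)), key=values.__getitem__)
```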
Multi-Head Attention (MHA)
Multi-Head Attention runs multiple attention computations in parallel with different learned projections and combines the results.
Multi-Objective Optimization
Multi-objective optimization (Pareto optimization) optimizes several, often conflicting, objectives at once; the goal is typically a set of Pareto-optimal trade-offs rather than a single optimum.
Multi-Query Attention (MQA)
Multi-Query Attention shares a single key-value head across all query heads, shrinking the KV cache by a factor equal to the number of query heads (for example, 8x with 8 heads), typically with only a small quality loss.
Multi-Region
An architecture that distributes applications and data across multiple geographic data centers to optimize latency, availability, and compliance.
Multi-Teacher Distillation
A distillation method where a student model learns from multiple specialized teacher models simultaneously – combines expertise from different domains.
Multi-tenancy
Multi-tenancy is a software architecture where a single instance of an application serves multiple customers ("tenants") while keeping each tenant's data/config separated and secure.
Multi-Touch Attribution (MTA)
Attribution that distributes credit across all touchpoints in the customer journey.
Multi-Turn Conversation
A multi-turn conversation is an interaction where context and intent evolve across multiple exchanges rather than a single query-response.
Multimodal
AI systems that can process and understand multiple data types (text, image, audio, video) simultaneously.
Multimodal AI
AI systems that can process, understand, and generate multiple data types such as text, images, audio, and video simultaneously.
Multimodal AI
AI systems that jointly process text, image, audio, and video and can respond in any modality.
Multimodal Embeddings
Vector representations that project different data types (text, images, audio) into the same semantic space – enables cross-modal searching and understanding.
Multimodal Model
A multimodal model can process and/or generate across multiple data types (e.g., text, images, audio, video).
N
N-gram
Contiguous sequence of N elements (characters or words) from a text.
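A minimal sketch of extracting N-grams from a token list; the function name is illustrative, and the same code works on a string to produce character N-grams.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = ngrams(["the", "cat", "sat", "down"], 2)
char_trigrams = ngrams("data", 3)
```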
N-gram Blocking
N-gram blocking is a decoding constraint that prevents a model from generating an n-gram (sequence of n tokens) that has already appeared in the generated text.
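The constraint can be sketched as a function that, given the tokens generated so far, returns the next tokens that would repeat an existing n-gram. A decoder would then exclude these tokens before choosing the next one. The function name is hypothetical and not taken from any particular library.

```python
def banned_next_tokens(generated, n):
    """Tokens that would complete an n-gram already present in `generated`."""
    if n < 1 or len(generated) < n - 1:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (n - 1):])
    banned = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned


# "the cat" already occurred, so after a second "the" the token "cat" is blocked.
blocked = banned_next_tokens(["the", "cat", "sat", "on", "the"], n=2)
```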
N-Shot Prompting
N-shot prompting provides N examples in the prompt to teach the model the desired pattern (0-shot = instructions only; few-shot = small N).
N-Tier Architecture
N-tier architecture is a system design that separates an application into logical layers (tiers)—commonly presentation, application/business logic, and data—to improve scalability, maintainability, and security.
N+1 Tool Call Problem
The N+1 tool call problem happens when an AI workflow makes one initial tool call and then makes N additional tool calls (often one per retrieved item), causing unnecessary latency and cost.
NAC (Network Access Control)
Network Access Control (NAC) is a security approach that restricts network access based on device identity, posture, and policy (e.g., only compliant devices can access sensitive services).
NACK (Negative Acknowledgment)
A NACK is a message indicating a request/message was not successfully processed (the opposite of an ACK).
NAdam (Nesterov-Accelerated Adam)
Optimizer that integrates Nesterov momentum into Adam – combines NAG's look-ahead correction with Adam's adaptive learning rates.
Named Account List Governance
Named account list governance is the process and rules for how target account lists are created, updated, owned, and operationalized across marketing and sales.
Named Accounts
Named accounts are a defined list of target companies prioritized for go-to-market efforts, commonly used in ABM (Account-Based Marketing).
Named Entity Canonicalization
Entity canonicalization is standardizing different surface forms of the same entity into one canonical representation (e.g., "OpenAI Inc.", "OpenAI", "Open AI").
Named Entity Linking (NEL)
Named Entity Linking connects an entity mention in text (e.g., "OpenAI", "Apple", "Paris") to a specific canonical entity ID in a knowledge base (internal or external).
Named Entity Recognition (NER)
Identifying and classifying named entities in text (people, places, organizations).
Named Entity Recognition (NER)
NLP task for identifying and classifying named entities in text.
Namespace Collision
A namespace collision happens when two resources share the same name in a context where names must be unique, causing ambiguity or runtime errors.
Namespace Isolation Patterns
Namespace isolation patterns are design approaches (often in Kubernetes) that use namespaces, policies, quotas, and secrets boundaries to isolate environments or tenants.
Namespace-Scoped Secrets
Namespace-scoped secrets are secrets managed within a specific namespace boundary (commonly in Kubernetes), limiting which workloads can access them.
NaN (Not a Number)
NaN is a special floating-point value meaning "Not a Number," used to represent undefined or unrepresentable numeric results (e.g., 0/0).
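A short illustration of how NaN behaves in Python. Note that Python raises `ZeroDivisionError` for `0.0 / 0.0` instead of returning NaN, so the sketch produces NaN in other ways. The helper `mean_ignoring_nan` is hypothetical.

```python
import math

nan = float("nan")
is_self_equal = (nan == nan)             # NaN compares unequal to everything, itself included
undefined = float("inf") - float("inf")  # an undefined result, represented as NaN


def mean_ignoring_nan(values):
    """Average the values that are not NaN; return NaN if none remain."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")
```

Because `nan == nan` is false, checks should use `math.isnan` rather than equality.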
Nano Banana
Codename for Google's image editing model (Gemini 2.5 Flash Image) enabling pixel-precise edits via prompt.
Nano Banana 2
Google's second-generation AI image generation model, based on Gemini 3.1 Flash Image, combining Pro quality with Flash speed.
Narrow AI / Weak AI
Narrow AI (also "weak AI") is AI designed to perform a specific task or a limited set of tasks, rather than general-purpose reasoning across domains.
NAT (Network Address Translation)
NAT maps private IP addresses to public IP addresses (and vice versa), enabling private networks to access external networks while reducing public IP usage.
Native Advertising
Native advertising is paid media designed to match the form and function of the platform where it appears (e.g., sponsored articles, in-feed sponsored posts).
Natural Experiment
A natural experiment uses real-world events or operational changes (not randomized by you) that approximate random assignment, enabling causal inference under assumptions.
Natural Gradient
Natural gradient is an optimization approach that accounts for the geometry of parameter space, often leading to more efficient steps than standard gradient descent in some probabilistic models.
Natural Language Generation
Natural Language Generation (NLG) is the process of producing human-readable text from data, intent, or internal representations (rules, templates, or neural models).
Natural Language Processing (NLP)
The field of AI concerned with the interaction between computers and human language.
Natural Language Understanding
NLU is the AI capability to understand the meaning, intent, and structure of natural language – not just recognizing words but grasping their meaning.
Natural Questions (NQ)
A question answering benchmark from Google with real search queries and Wikipedia articles as answer sources.
NCCL (NVIDIA Collective Communications Library)
NCCL is a library used for fast GPU-to-GPU communication primitives (collectives) such as all-reduce, broadcast, and all-gather—commonly in distributed training and inference.
NCCL All-Reduce
All-reduce is a collective operation that aggregates data (often summation) across devices and distributes the result back to all devices.
NDCG (Normalized Discounted Cumulative Gain)
A ranking metric that considers both relevance grades and positions in the ranking – higher-ranked relevant items are weighted more heavily.
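A minimal sketch, assuming the linear-gain variant with a log2 position discount (another common variant uses 2^relevance - 1 as the gain). The function names are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance grades listed in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


perfect = ndcg([3, 2, 1])   # already in ideal order
reversed_ = ndcg([1, 2, 3])  # best item ranked last
```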
NDJSON (Newline-Delimited JSON)
NDJSON is a format where each line is a valid JSON object—making it easy to stream, append, and process logs/events at scale.
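A minimal sketch of reading and writing the format with Python's standard `json` module; the helper names are illustrative, and blank lines are skipped as an assumption about lenient parsing.

```python
import json


def parse_ndjson(text):
    """Parse one JSON value per line, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


def to_ndjson(records):
    """Serialize each record on its own line, terminated by a newline."""
    return "".join(json.dumps(record) + "\n" for record in records)


events = parse_ndjson('{"a": 1}\n{"a": 2}\n')
round_trip = to_ndjson(events)
```

Because each line stands alone, a consumer can process a file or stream line by line without loading everything into memory.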
NDR (Net Dollar Retention)
Net Dollar Retention (NDR) is essentially the same metric as NRR: the share of revenue from existing customers that you retain over time, including expansion and churn (terminology varies by organization).
Near-Duplicate Detection
Near-duplicate detection identifies items that are not exactly identical but are highly similar (e.g., same content with minor edits, boilerplate differences, or formatting changes).
Negative Binomial Regression
Negative binomial regression is a statistical model for count data (e.g., clicks, conversions) that handles overdispersion (variance > mean), unlike Poisson regression.
Negative Control
A negative control is a variable, outcome, or test condition that should not be affected by an intervention—used to detect bias, confounding, or measurement artifacts.
Negative Cycle
A negative cycle is a cycle in a weighted graph whose total weight is negative, allowing path cost to be reduced indefinitely by looping.
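One standard way to detect such a cycle is a Bellman-Ford style relaxation, sketched below. Initializing every distance to 0 acts like a virtual source connected to all nodes, so cycles anywhere in the graph are found. The function name and edge format are assumptions.

```python
def has_negative_cycle(num_nodes, edges):
    """edges is a list of (u, v, weight) for a directed graph with nodes 0..num_nodes-1."""
    dist = [0] * num_nodes
    # After num_nodes - 1 rounds, all shortest paths are final unless a negative cycle exists.
    for _ in range(num_nodes - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Any edge that can still be relaxed lies on or is reachable from a negative cycle.
    return any(dist[u] + w < dist[v] for u, v, w in edges)


looping = has_negative_cycle(3, [(0, 1, 1), (1, 2, -3), (2, 0, 1)])  # cycle weight -1
safe = has_negative_cycle(3, [(0, 1, 1), (1, 2, -3), (2, 0, 3)])     # cycle weight +1
```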
Negative Prompt
A negative prompt describes what should NOT appear in a generated image – controls diffusion models by excluding unwanted elements.
Negative Prompting
Negative prompting is explicitly telling a generative model what to avoid (content, style, formatting, claims) during generation.
Negative Transfer
Negative transfer occurs when transferring knowledge from a pretrained model or source task hurts performance on the target task.
Negative Weights
Negative weights are negative edge costs in a weighted graph (i.e., an action/transition reduces total cost).
Neo4j
Neo4j is a widely used graph database that stores data as nodes and relationships, enabling efficient queries over connected data structures.
Neptune.ai
MLOps platform for experiment tracking, model registry, and metadata management with a focus on enterprise scaling.
NeRF (Neural Radiance Fields)
NeRFs are neural methods for representing 3D scenes by learning a function that maps spatial coordinates and viewing direction to color and density, enabling novel view synthesis.
Nesterov Accelerated Gradient (NAG)
Improved momentum variant that computes the gradient at a "look-ahead" point instead of the current one – faster and more stable convergence.
Net New ARR
Net New ARR is the change in annual recurring revenue from period start to period end, accounting for new sales, expansion, contraction, and churn.
Net Present Value (NPV)
NPV is the value today of future cash flows discounted by a rate that reflects time value and risk.
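As a worked example, assuming one cash flow per period with the first one occurring today (period 0, undiscounted) and a constant discount rate; the function name is illustrative.

```python
def npv(rate, cashflows):
    """Sum of cash flows discounted by (1 + rate) ** period, starting at period 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))


# Paying 100 today to receive 110 in one period breaks even at a 10% rate.
break_even = npv(0.10, [-100, 110])
```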
Net Revenue Retention (NRR)
NRR measures how much recurring revenue you retain from existing customers over a period, including expansion and churn.
Network Bandwidth
Network bandwidth is the rate at which data can be transmitted over a network (e.g., Mbps, Gbps).
Network DLP
Network Data Loss Prevention (DLP) is a set of controls that detect and prevent sensitive data from leaving a network boundary through outbound traffic (egress).
Network Effects
Network effects occur when a product becomes more valuable as more people (or organizations) use it.
Network Egress
Network egress is outbound traffic leaving a system/network (e.g., from your VPC to the internet or to external SaaS APIs).
Network Jitter
Network jitter is variation in packet delay over time (inconsistent latency), even if average latency is acceptable.
Network Latency
Network latency is the time it takes for data to travel across a network between systems (client ↔ server, service ↔ service).
Network Load Balancer
A network load balancer distributes incoming network traffic across multiple servers/instances to improve availability and performance.
Network Partition
A network partition is a failure where parts of a distributed system cannot communicate with each other, even though each part may still be running.
Network Rate Limiting
Network rate limiting restricts request rates to protect services from overload, abuse, or cost blowups.
Network Segmentation
Network segmentation is dividing a network into isolated segments to reduce attack surface, limit lateral movement, and enforce least privilege access.
Network Topology
Network topology describes how network components are arranged and connected (physical and logical layout).
Network-Aware Batching
Network-aware batching groups requests to reduce network overhead and improve throughput, especially when network latency dominates.
NetworkPolicy (Kubernetes)
A Kubernetes NetworkPolicy defines how pods are allowed to communicate with each other and with external endpoints, enabling micro-segmentation inside clusters.
Neural Architecture Search (NAS)
An AutoML approach where algorithms automatically discover the optimal neural network architecture for a given task – the "AI designs AI" approach.
Neural Audio Codec
Neural Audio Codecs compress audio into discrete tokens – the bridge between audio and language models that enables music and speech generation.
Neural Code Search
Neural code search retrieves relevant code snippets or files using embeddings and semantic matching rather than exact keyword search.
Neural Collaborative Filtering (NCF)
A deep learning approach using neural networks instead of classical matrix factorization for collaborative filtering.
Neural Collapse
Neural collapse is a phenomenon observed in deep classifiers near the end of training where learned representations and classifier weights exhibit a highly structured geometry (classes become tightly clustered and symmetrically arranged).
Neural Embeddings
Neural embeddings are learned vector representations of items (text, users, products, documents) such that distance in vector space reflects similarity.
Neural Index Rebuild
A neural index rebuild is re-generating embeddings and rebuilding vector (or hybrid) indexes after changes to content, chunking, or the embedding model.
Neural Indexing
Neural indexing is using learned representations and neural methods to build or optimize an index for retrieval (often in vector search or learned sparse retrieval).
Neural IR (Neural Information Retrieval)
Neural IR is the use of neural models (embeddings, cross-encoders, rerankers) to retrieve and rank documents based on semantic relevance.
Neural Network
A computational model inspired by the structure of biological neurons, consisting of interconnected nodes (neurons) in layers.
Neural Ordinary Differential Equation (Neural ODE)
Neural ODEs model transformations as continuous-time dynamics defined by a neural network, enabling certain efficiency and modeling properties.
Neural Processing Unit (NPU)
An NPU is specialized hardware designed to accelerate neural network computations (matrix multiplications, convolutions, attention-like ops) efficiently—often with strong power/performance advantages for specific workloads.
Neural Pruning
Neural pruning removes weights, neurons, attention heads, or entire structures from a model to reduce compute/memory while trying to preserve performance.
Neural Rendering
Neural rendering combines neural networks with computer graphics to produce photorealistic images and videos – from 3D scene rendering to style manipulation.
Neural Reranking
Neural reranking uses a model (often a cross-encoder) to re-score and reorder an initial set of retrieved candidates based on deeper query–candidate understanding.
Neural Retrieval
Neural retrieval is retrieving relevant items using learned representations (dense embeddings and similarity search) instead of relying purely on keyword matching.
Neural Scaling Laws
Scaling laws describe empirical relationships showing how model performance tends to improve predictably as you increase compute, data, and/or model parameters—often following power-law-like trends.
Neural Style Transfer (NST)
Neural style transfer is a technique that applies the "style" of one image (textures, patterns) to the "content" of another, using neural representations.
Neural Topic Routing
Neural topic routing is using ML/embeddings to classify or route an input (query, pageview, conversation) into a topic, workflow, or handler based on semantic meaning.
Neural Voice Transfer
AI technology that transfers the voice characteristics of one speaker onto another speaker's recording in real time while preserving the spoken content.
Neuro-Symbolic "Verification Layer"
A neuro-symbolic verification layer is a system component that checks neural outputs against symbolic constraints (rules, schemas, policies) before acting or publishing.
Neuro-Symbolic AI
Neuro-symbolic AI combines neural methods (LLMs, embeddings) with symbolic methods (rules, logic, knowledge graphs) to improve reliability, interpretability, and constraint satisfaction.
Neuromorphic Computing
Neuromorphic computing is an approach to hardware and computation inspired by biological neural systems, often emphasizing event-driven processing and energy efficiency.
New-to-File (NTF)
New-to-File refers to leads or customers who are new to your database/CRM—often used in B2B as an acquisition indicator.
Next Best Action (NBA)
AI system that determines the optimal next interaction for each customer at every moment – offer, content, channel, timing.
Next Best Question (NBQ)
Next Best Question is a conversational design and decisioning pattern where a system asks the single most valuable clarifying question to progress toward a correct outcome.
Next Sentence Prediction (NSP)
Next Sentence Prediction is a training objective where a model predicts whether one sentence likely follows another in the original text.
NHST (Null Hypothesis Significance Testing)
NHST is the traditional statistical testing framework where you test whether observed data is unlikely under a null hypothesis (often "no effect"), typically using p-values.
NIST Cybersecurity Framework (NIST CSF)
The NIST Cybersecurity Framework is a structured framework for managing cybersecurity risk through a common language, categories, and practices across the organization.
NIST SP 800-53
NIST SP 800-53 is a catalog of security and privacy controls used as a reference for designing and assessing secure systems.
NIST SP 800-63 (Digital Identity)
NIST SP 800-63 is guidance for digital identity: identity proofing, authentication, and federation concepts and requirements.
NL2SQL (Natural Language to SQL)
NL2SQL converts natural language questions into SQL queries that can be executed against a database.
NLP (Natural Language Processing)
Natural Language Processing (NLP) is the subfield of AI concerned with the machine processing, interpretation, and generation of natural language.
NLTK (Natural Language Toolkit)
One of the oldest and most comprehensive Python libraries for NLP – geared toward teaching, research, and prototyping.
NMI (Normalized Mutual Information)
NMI is a metric used to compare clustering assignments by measuring how much information one clustering shares with another, normalized to be scale-friendly.
No Free Lunch Theorem
The No Free Lunch theorem (in optimization/learning) states that, averaged over all possible problems, no algorithm outperforms any other; performance depends on the problem distribution.
Node Affinity
Node affinity is a Kubernetes scheduling feature that constrains which nodes pods can run on (based on node labels), enabling placement control.
Node Pool
A node pool is a group of compute nodes (often in Kubernetes or managed clusters) with similar characteristics, managed together for scaling and scheduling.
Node Selector
Node selector is a Kubernetes mechanism to constrain pods to run on nodes with matching labels.
Node2Vec
Node2Vec is an algorithm that represents graph nodes as low-dimensional vectors based on random walks over the graph structure.
Noise Injection
Noise injection is deliberately adding noise during training or processing to improve robustness, generalization, or privacy.
Noise Schedule
A noise schedule defines how much noise is added (and later removed) at each step in a diffusion model's forward and reverse processes.
Noise-to-Signal Ratio
Noise-to-signal ratio measures how much random variation (noise) exists relative to the meaningful pattern (signal) you want to detect.
Noisy Student Training
Noisy Student Training is a semi-supervised learning approach where a "teacher" model labels unlabeled data, and a "student" model is trained on a mix of labeled + pseudo-labeled data with noise/augmentation.
Nomic Embed
Open-source embedding models from Nomic AI with full reproducibility – all training data and code are public.
Non-Blocking I/O
Non-blocking I/O allows a program to initiate I/O operations without waiting synchronously for them to complete, enabling concurrency and better throughput.
Non-Brand Keywords
Non-brand keywords are search queries that do not include your brand name (e.g., "RAG evaluation checklist" vs "Davies Meyer AI").
Non-Idempotent Operation
A non-idempotent operation is one where repeating the same request multiple times can produce different outcomes (or duplicate side effects).
Non-Maximum Suppression (NMS)
Non-maximum suppression is a post-processing step in object detection that removes redundant overlapping bounding boxes, keeping only the most confident ones.
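A minimal sketch of the greedy variant, assuming boxes are given as `(x1, y1, x2, y2)` tuples with separate confidence scores. The IoU threshold of 0.5 and the function names are illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
kept = nms(boxes, scores=[0.9, 0.8, 0.7])
```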
Non-Monotonic Logic
A logical system where conclusions can be retracted when new information arrives that contradicts previous assumptions.
Non-Negative Matrix Factorization (NMF)
NMF factorizes a non-negative matrix into two smaller non-negative matrices, often used for interpretable topic-like decompositions.
Non-Production Data Masking
Non-production data masking is the practice of anonymizing, tokenizing, or synthesizing sensitive data before it is used in dev/staging/test environments.
Non-Production Environment
A non-production environment is any environment that is not live customer production (e.g., dev, staging, test), used for development and validation.
Non-Repudiation
Non-repudiation is the ability to prove an action occurred and that a specific actor performed it—so they cannot later credibly deny it.
Non-Retryable Error
A non-retryable error is a failure that is unlikely to succeed if you simply retry (e.g., invalid input, permission denied).
Nonce Reuse
Nonce reuse is a security flaw where a "used once" value is accidentally reused, potentially enabling replay attacks or cryptographic failures (depending on context).
Nonlinear Activation Function
A nonlinear activation function introduces nonlinearity into neural networks (e.g., ReLU, GELU, tanh), enabling them to model complex relationships beyond linear transformations.
Normal Form (Database)
In databases, normal forms (1NF, 2NF, 3NF, BCNF) describe levels of normalization that reduce redundancy and improve data integrity.
Normalization
Normalization is the transformation of numerical data to a unified value range (often 0–1 or mean 0 / standard deviation 1) to improve the training stability of machine learning models.
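Both variants mentioned above can be sketched with the standard library. The function names are illustrative, and returning zeros for constant input is an assumption made to avoid division by zero.

```python
import statistics


def min_max(values):
    """Rescale values linearly to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Shift to mean 0 and scale to (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


scaled = min_max([2, 4, 6])
standardized = z_score([2, 4, 6])
```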
Normalization Layer
A normalization layer is a neural network component that normalizes activations to improve training stability and convergence (e.g., LayerNorm, RMSNorm).
Normalized Cost per Answer
Normalized cost per answer is the cost of generating an AI answer adjusted for comparability (e.g., normalized by answer length, tokens, difficulty tier, or traffic segment).
Normalized RMSE (NRMSE)
NRMSE is RMSE normalized by a scale factor (e.g., range, mean, or standard deviation) to make errors comparable across datasets.
Normalizing Flow
A normalizing flow is a generative modeling approach that transforms a simple distribution (e.g., Gaussian) into a complex one via a sequence of invertible transformations with tractable likelihoods.
NoSQL
NoSQL refers to non-relational databases designed for scalability and flexibility (document, key-value, wide-column, graph databases).
Notarization (Software Artifact)
Software notarization is the process of verifying and attesting that a software artifact (binary/container/package) meets certain integrity and security requirements before it's distributed or executed.
Notebook (Jupyter Notebook)
A notebook is an interactive document that mixes code, outputs, and narrative text—commonly used for data science exploration and prototyping (e.g., Jupyter).
Notification Fatigue
Notification fatigue is reduced responsiveness or negative sentiment caused by excessive alerts, messages, or nudges.
Novel Class Discovery (NCD)
Novel class discovery finds previously unknown categories in unlabeled data while leveraging knowledge from known classes.
Nowcasting
Forecasting the current or imminent state using high-frequency real-time data.
NT-Xent Loss (Normalized Temperature-Scaled Cross-Entropy)
NT-Xent is a contrastive learning loss used to train embeddings by pulling positive pairs together and pushing negatives apart, with a temperature term controlling distribution sharpness.
Null Value
A null value represents missing or unknown data (distinct from zero, empty string, or false).
NUMA (Non-Uniform Memory Access)
NUMA is a memory architecture where memory access time depends on which CPU socket/node the memory is attached to (local memory is faster than remote).
Numerical Precision
Numerical precision is how accurately numbers are represented and computed (e.g., FP32 vs FP16/bfloat16), affecting rounding and stability.
Nurture Marketing
Nurture marketing is guiding prospects over time with helpful, staged content and experiences until they are ready for a conversion or sales engagement.
Nurture Sequence
A nurture sequence is a defined series of touches (emails, in-app messages, retargeting, content steps) triggered by behavior or segment membership.
NVIDIA AI
The dominant provider of GPU hardware and AI infrastructure, whose chips form the foundation for virtually all major AI models.
NVLink
NVLink is a high-speed GPU interconnect used to provide faster GPU-to-GPU communication than standard PCIe in many setups.
NVMe
NVMe is a storage protocol/interface designed for high-speed access to SSDs, typically offering significantly lower latency and higher throughput than older interfaces.
O
OAuth 2.0
An authorization framework that enables applications to access resources on behalf of a user or service without sharing passwords.
Object Detection
Identification and localization of objects in images or videos.
Object Storage
Stores data as objects (blob + metadata + ID), optimized for durability and scalability (e.g., documents, images, logs).
Object-Oriented Programming
A programming paradigm that organizes software around "objects" – data structures that encapsulate state (attributes) and behavior (methods).
Observability
The ability to understand a system's internal state from its outputs—typically via logs, metrics, and traces.
Observability for LLM Apps
LLM observability extends classic observability with AI-specific signals: prompt/version tracking, retrieval evidence, tool traces, token usage, and quality/safety metrics.
Observed vs Expected
Compares actual system behavior to a baseline or model of expected behavior to detect anomalies and regressions.
OCR (Optical Character Recognition)
Converts text in images (scans, screenshots, photos, PDFs-as-images) into machine-readable text.
OCR (Optical Character Recognition)
Conversion of images containing text into machine-readable text.
Off-Policy Evaluation (OPE)
Estimates how a new decision policy would perform using data collected from a different (existing) policy—without deploying the new policy.
Offline Evaluation
Measures model/system performance using predefined datasets and metrics before production rollout.
OLAP
A technology for fast, multidimensional analysis of large datasets, enabling slice, dice, drill-down, and roll-up operations.
Ollama
A user-friendly tool for running LLMs locally on consumer hardware, with simple installation and Docker-like model management.
Omnichannel
Seamless customer experience across all channels and touchpoints.
Omnichannel Marketing
Coordinating messaging and experience across channels (web, email, paid media, social, sales) so the customer journey feels consistent and connected.
On-Call
An operational practice where designated engineers respond to incidents affecting system reliability, performance, or security.
On-Call Rotation
A structured schedule for who is responsible for incident response over time, often with escalation paths and backup roles.
On-Device AI
AI inference directly on end devices (smartphones, laptops, IoT) without cloud connection – enabling real-time processing, privacy, and offline capability.
On-Device Inference
Runs a model locally on a user's device (phone, laptop, edge hardware) instead of calling a cloud API.
Onboarding
The experience and process that helps a user (or customer team) achieve meaningful value quickly and confidently.
Once-for-All (OFA)
A training method that trains a single "supernet" from which many specialized subnetworks can be extracted for different hardware constraints – train once, deploy everywhere.
One-Cycle Policy (Super-Convergence)
A learning rate schedule that first ramps the learning rate up (warmup) and then decreases it to a very low value, which can allow training in a fraction of the usual epochs.
One-Hot Encoding
Represents a categorical value as a vector of zeros with a single 1 at the category index.
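As a minimal sketch of the idea, the helper below builds such a vector in plain Python. The function name `one_hot` and the `colors` vocabulary are illustrative assumptions, not part of any particular library.

```python
def one_hot(category, categories):
    """Return a list of zeros with a single 1 at the index of `category`."""
    index = categories.index(category)  # raises ValueError for unknown categories
    return [1 if i == index else 0 for i in range(len(categories))]


colors = ["red", "green", "blue"]  # hypothetical category vocabulary
encoded = one_hot("green", colors)
print(encoded)  # [0, 1, 0]
```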
One-Shot Learning
Ability to learn and generalize from a single example.
One-Shot Prompting
Provides a single example in the prompt to demonstrate the desired output pattern.
Online Distillation
A distillation variant where multiple models train simultaneously and serve as teachers to each other – no pre-trained teacher needed.
Online Evaluation
Measures performance on real user traffic (A/B tests, canaries, interleaving, holdouts) after deployment.
Online Learning
Updates a model incrementally as new data arrives, rather than retraining from scratch in large batches.
ONNX (Open Neural Network Exchange)
An open format for exchanging ML models between different frameworks – train in PyTorch, deploy with TensorRT or CoreML.
Ontology
A formal representation of concepts and relationships in a domain (entities, classes, properties, constraints).
Ontology
Formal description of concepts, properties, and relationships in a knowledge domain.
Open Graph Protocol
A set of metadata tags that control how a page appears when shared on social platforms and messaging apps (title, description, preview image).
Open Rate
Open rate is the central email-marketing metric that reports the share of recipients who actually opened a delivered email — calculated as (unique opens / delivered emails) × 100.
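The formula can be written as a small function. This is only a sketch; the name `open_rate` and the sample numbers are made up for illustration.

```python
def open_rate(unique_opens, delivered):
    """Open rate in percent: (unique opens / delivered emails) * 100."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return unique_opens / delivered * 100


rate = open_rate(unique_opens=230, delivered=1000)  # illustrative numbers
print(f"{rate:.1f}%")  # 23.0%
```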
Open-Domain Dialogue
Open-Domain Dialogue refers to AI systems that can freely converse about any topic – without being limited to predefined intents or domains.
Open-Weight Model
A model whose trained weights are publicly available, enabling self-hosting and deeper customization.
OpenAI
A leading AI research company and the developer of ChatGPT, GPT-4, and DALL-E, which are among the world's most widely used AI applications.
OpenAI Codex
OpenAI's specialized AI model for programming – the technology behind GitHub Copilot and foundation for code LLMs.
OpenAI Embeddings
OpenAI's commercial embedding API with text-embedding-3-small and text-embedding-3-large – the easiest path to high-quality embeddings.
OpenAI o1
OpenAI's first o-series model that uses explicit reasoning with chain-of-thought for complex problem-solving.
OpenAI o3
Advanced reasoning model from OpenAI with improved performance in mathematics, coding, and scientific reasoning.
OpenAPI Specification
A standard for describing REST APIs in a machine-readable format (endpoints, parameters, auth, request/response schemas).
OpenAPI Specification
A standardized format for describing REST APIs – used by AI systems to automatically generate tool definitions for function calling.
OpenID Connect (OIDC)
An identity layer on top of OAuth 2.0 that provides authentication (who the user is) using standardized identity tokens.
OpenLLM Leaderboard
A public leaderboard by Hugging Face that compares open-source LLMs on standardized benchmarks (MMLU, HellaSwag, etc.).
OpenRouter
Unified API platform providing access to hundreds of AI models from various providers through a single interface.
OpenTelemetry (OTel)
A set of standards and tools for collecting and exporting telemetry—traces, metrics, and logs.
OpenVINO
Intel's open-source toolkit for optimizing and accelerating deep learning inference on Intel hardware (CPU, GPU, VPU, FPGA).
Operationalization
Turning a concept, model, or prototype into a repeatable, reliable, governed production capability with clear ownership, monitoring, and change control.
Operator (Kubernetes Operator)
Software that automates management of complex applications on Kubernetes using custom resources and controllers.
Operator Fusion
A compiler optimization that fuses multiple consecutive operations in neural networks into a single kernel – reducing memory accesses and accelerating inference.
Opportunity-to-Win Rate
The percentage of sales opportunities that convert to closed-won.
Opt-In Rate
The percentage of users who consent to receive communications or enable a feature.
Optical Flow
Computing motion vectors between consecutive video frames showing where each pixel moves.
Optimization
The process of finding parameter values that minimize a loss function or maximize an objective under constraints.
Optimizer
The algorithm that updates model parameters during training (e.g., SGD, Adam), based on gradients and configuration.
Orchestration
Coordinates multiple steps, services, and tools into a reliable workflow—often with state, retries, and observability.
Orchestrator
The system component that implements orchestration logic—deciding the next step, calling tools, managing state, and enforcing budgets/guardrails.
Organic Growth Loop
A self-reinforcing mechanism where product/content usage creates outputs that drive more discovery and usage without proportional paid spend.
Organic Search
Traffic earned from unpaid search engine results.
Orphan Page
A page with no internal links pointing to it, making it hard for users and crawlers to discover.
ORPO (Odds Ratio Preference Optimization)
A preference-optimization method in the same family as DPO that combines SFT and preference alignment in a single training step, without a separate reference model.
Out-of-Distribution (OOD) Detection
Identifies inputs that differ significantly from what a model was trained on, signaling increased uncertainty and risk.
Outage
A period when a service is unavailable or unusable for its intended function (full or partial).
Outage Budget (Error Budget)
A practical tolerance for downtime/unreliability within a period, derived from SLOs and risk appetite.
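One common way to derive the budget is to multiply the length of the period by the fraction of time the SLO allows the service to be unavailable. The sketch below assumes an availability SLO expressed as a fraction (for example 0.999); the function name and numbers are hypothetical.

```python
def error_budget_minutes(slo_target, period_days):
    """Allowed downtime in minutes for an availability SLO over a period."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - slo_target)


budget = error_budget_minutes(slo_target=0.999, period_days=30)
print(round(budget, 1))  # 43.2
```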
Outage Postmortem
A structured analysis documenting what happened, impact, root causes, contributing factors, and corrective actions after an incident.
Outbound Marketing
Outbound marketing comprises all proactive, sender-initiated activities in which companies actively reach out to prospects — via cold email, LinkedIn outreach, cold calling, direct mail, or classical TV/print advertising.
Outbox Pattern
A distributed systems design where a service writes its state changes and an "event to publish" into the same database transaction, then publishes the event reliably later.
Outcome Metrics
Metrics that measure the real-world result you care about (revenue, qualified pipeline, resolution rate, risk reduction), not just activity or engagement.
Outlier
A data point that deviates significantly from the rest of the distribution.
Outlier Detection
Identifies anomalous data points or behaviors that differ from expected patterns.
Outpainting
Outpainting extends an image beyond its original borders by generating context-aware content with AI.
Output Guardrails
Controls applied to model outputs to enforce safety, policy, formatting, and correctness constraints before displaying or acting.
Output Length Control
The set of techniques used to shape response length and structure (token limits, section caps, templates, validators).
Output Parsing
Extracting structured fields from model output (JSON, YAML, XML, or patterns) so downstream systems can reliably use it.
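A rough sketch of JSON output parsing follows. It assumes the model wraps a single JSON object in free text, which is only one of several possible output shapes; `parse_model_output` and the sample reply are hypothetical.

```python
import json


def parse_model_output(raw, required_keys):
    """Extract the first-to-last brace span as JSON and check required fields."""
    start = raw.find("{")
    end = raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])  # raises ValueError on invalid JSON
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


reply = 'Here is the result:\n{"sentiment": "positive", "score": 0.9}\nDone.'
parsed = parse_model_output(reply, required_keys=["sentiment", "score"])
print(parsed["sentiment"])  # positive
```

Validating required fields at parse time lets downstream code fail early instead of acting on incomplete data.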
Output Token
A token generated by a language model as part of its response.
Over-Generation
Producing more output than needed (too long, too verbose, too many steps), increasing cost and reducing user clarity.
Over-Retrieval
Retrieving too many documents/chunks for a query, increasing cost and often reducing answer quality due to noise and context dilution.
Overfitting
When a model fits its training data so closely, including noise and incidental patterns, that it generalizes poorly to new data.
Overlapping Chunks
A chunking strategy where consecutive text chunks share some repeated content (overlap) to preserve context across chunk boundaries.
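The sliding-window idea can be sketched as follows. The sketch splits on pre-tokenized items and uses a fixed size and overlap; real pipelines often chunk by sentences or tokens from a model tokenizer, so treat the names and parameters here as assumptions.

```python
def chunk_with_overlap(tokens, chunk_size, overlap):
    """Split `tokens` into windows of `chunk_size` sharing `overlap` items."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


words = "a b c d e f g h".split()
chunks = chunk_with_overlap(words, chunk_size=4, overlap=2)
print(chunks)  # [['a', 'b', 'c', 'd'], ['c', 'd', 'e', 'f'], ['e', 'f', 'g', 'h']]
```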
OWASP LLM Top 10
A standardized list of the most critical security risks for LLM applications, published by OWASP.
Owned Media
Content and channels you control (website, email list, webinars, product docs), as opposed to paid or earned media.
P
p-Hacking
Manipulating analysis choices (stopping rules, segmentation, metrics, exclusions) to obtain statistically significant results.
p-Value
The probability of observing results at least as extreme as what you observed if the null hypothesis were true.
P95 / P99 Latency
Percentile measures of response time: 95% (or 99%) of requests complete faster than this value.
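A simple way to compute these values is the nearest-rank method, sketched below. Monitoring systems often use interpolation or histogram approximations instead, so this is one possible definition rather than the canonical one; the sample latencies are synthetic.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest sample with at least p% at or below it."""
    if not values or not 0 < p <= 100:
        raise ValueError("need a non-empty sample and 0 < p <= 100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))  # synthetic sample: 1 ms .. 100 ms
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(p95, p99)  # 95 99
```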
Page Experience
How users perceive the experience of interacting with a page—speed, stability, usability, and trust signals.
PagedAttention
A memory management technique inspired by OS virtual memory that manages KV cache in blocks, eliminating GPU memory fragmentation.
PageRank
Google's original algorithm for evaluating the importance of web pages based on the links pointing to them.
Parallel Tool Calls
Executing multiple tool/API calls concurrently rather than sequentially, reducing end-to-end latency.
Parallelism
Running computations concurrently to improve throughput or reduce time-to-result.
Parameter Count
The number of learned weights in a model, often used as a rough proxy for capacity and compute needs.
Parameter Sharing
A modeling technique where multiple parts of a neural network reuse the same weights instead of having separate parameters.
Part-of-Speech Tagging
Automatically assigning parts of speech (noun, verb, adjective, etc.) to each word in a sentence.
Passage Reranking
Reorders retrieved passages using a stronger relevance model (often a cross-encoder) to improve precision before generation.
Passage Retrieval
Finds relevant passages (chunks) of text rather than whole documents, improving precision for question answering and RAG.
Pathfinding
Pathfinding is the process of finding a route between nodes in a graph that optimizes an objective (shortest, cheapest, safest, fastest).
Payback Period
Payback Period is the length of time required to recover an investment through its returns.
PCI DSS
A security standard for organizations that store, process, or transmit payment card data.
PDDL (Planning Domain Definition Language)
A standardized language for describing planning problems in AI that formally defines states, actions, and goals.
PEFT (Parameter-Efficient Fine-Tuning)
A family of techniques that adapt LLMs by training only a small subset of parameters instead of updating the entire model.
Penetration Testing
Authorized security testing where experts attempt to find and exploit vulnerabilities in a system.
Perceptron
The Perceptron is the simplest form of an artificial neuron and the foundation of modern neural networks: a linear classifier that computes a weighted sum of its inputs and passes the result through an activation function.
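A minimal sketch of the forward computation, assuming a step activation and hand-picked (not learned) weights that happen to implement logical AND:

```python
def perceptron_predict(weights, bias, inputs):
    """Weighted sum of inputs plus bias, passed through a step activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum >= 0 else 0


# Illustrative weights chosen by hand so the unit behaves like logical AND.
and_weights, and_bias = [1.0, 1.0], -1.5
outputs = [perceptron_predict(and_weights, and_bias, [a, b])
           for a in (0, 1) for b in (0, 1)]
print(outputs)  # [0, 0, 0, 1]
```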
Performance Marketing
Marketing optimized toward measurable outcomes (leads, pipeline, revenue, conversions) with a strong focus on attribution and experimentation.
Perplexity
A language model metric derived from the average negative log-likelihood; measures how "surprised" a model is by text.
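In formula terms, perplexity is the exponential of the average negative log-likelihood per token. The sketch below assumes you already have the probability the model assigned to each observed token; the input values are illustrative.

```python
import math


def perplexity(token_probs):
    """exp(average negative log-likelihood) over the observed tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)


# A model that assigns probability 0.25 to every token is as "surprised"
# as a uniform choice among 4 options.
ppl = perplexity([0.25, 0.25, 0.25, 0.25])
print(round(ppl, 6))  # 4.0
```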
Perplexity
An AI-first search engine that answers questions with cited, summarized answers and is widely positioned as a challenger to Google Search.
Persona
A research-based representation of a user segment with shared goals, constraints, and decision criteria.
Personalization
Adapts content, messaging, or experiences based on user context, intent, or segment.
Personhood Credentials
Cryptographic proofs that confirm a human (not another agent) is behind an interaction on the agent web, without revealing that person's identity.
Phi
Microsoft's Small Language Models (SLMs) that show surprisingly strong performance despite small size and enable on-device AI.
PII (Personally Identifiable Information)
Information that can identify a person directly or indirectly (e.g., name, email, phone number, government IDs).
Pika Labs
An AI video startup with user-friendly text-to-video and image-to-video generation, popular for short clips.
Pipeline Parallelism
A parallelization strategy that distributes different model layers across different GPUs – data flows through the GPU chain like a pipeline.
Pipeline Velocity
Measures how quickly opportunities move through the funnel (stages) toward closed-won.
PKI (Public Key Infrastructure)
PKI is the system of certificates, certificate authorities, and processes that enables secure identity verification and encryption using public/private keys.
Planning (AI Agents)
The ability of AI agents to break down complex goals into executable steps and develop a strategy for goal achievement.
Poisoning Attack
An attack in which an adversary manipulates training data, retrieval corpora, or feedback signals to degrade model behavior.
Policy
A policy is a rule or strategy that determines what actions are taken under which conditions.
Policy Decision Point (PDP)
The component that evaluates policies and returns a decision (e.g., allow/deny/step-up auth) for a given request.
Policy Drift
When the rules a system actually enforces diverge from the intended policy over time due to changes in code, prompts, tools, or infrastructure.
Policy Enforcement Point (PEP)
The component that enforces policy decisions at runtime (allow/deny/modify/require-confirmation).
Policy Engine
A component that enforces rules and constraints (who can do what, which tools are allowed, what outputs are permitted) at runtime.
Policy Gradient
Methods that optimize a policy directly by adjusting parameters in the direction that improves expected reward.
Policy-as-Code
Expressing governance rules in machine-readable, version-controlled code so policies can be tested, reviewed, and deployed like software.
Popularity Bias
The systematic overrepresentation of popular items in recommendations, disadvantaging niche items and reinforcing filter bubbles.
Pose Estimation
Detection and localization of body joints and skeleton keypoints in images or videos.
Positional Encoding
A method that gives transformer models information about the position of tokens in a sequence, since they have no inherent ordering information.
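One well-known variant is the fixed sinusoidal encoding from the original Transformer paper; learned and rotary encodings are common alternatives. The sketch below computes the sinusoidal vector for a single position, with `d_model` as the assumed embedding width.

```python
import math


def sinusoidal_position(pos, d_model):
    """Sinusoidal encoding: sin on even dimensions, cos on odd dimensions."""
    encoding = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        encoding.append(math.sin(angle))
        encoding.append(math.cos(angle))
    return encoding[:d_model]  # trim if d_model is odd


print(sinusoidal_position(0, 4))  # [0.0, 1.0, 0.0, 1.0]
```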
Positional Interpolation
A technique to extend a model's usable context length by rescaling how positions are represented.
Positioning
How you define your product/service in the minds of your target audience—what it is, who it's for, why it's different, and why that difference matters.
Post-Training
Any training stage applied after pretraining to shape a model for desired behaviors—helpfulness, safety, instruction-following.
Post-Training Quantization (PTQ)
Reduces model precision (e.g., FP16 → INT8/INT4) after training to lower memory use and speed up inference.
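To illustrate the principle, here is a sketch of symmetric per-tensor INT8 quantization on a plain list of weights. Production toolchains typically use per-channel scales, calibration data, and packed integer kernels, so this shows the idea rather than a real pipeline; all names are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # avoid a zero scale
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # illustrative values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes)  # [50, -127, 0, 100]
```

The rounding error per weight is at most half the scale, which is the accuracy cost traded for lower memory use.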
Posterior Collapse
Posterior collapse occurs in VAEs when the encoder's approximate posterior matches the prior regardless of the input, so the latent variables carry little information and the decoder effectively ignores them.
Power Analysis
Calculation of the necessary sample size to detect an effect of a given size with desired probability (power).
PPC (Pay-Per-Click)
A paid advertising model where you pay when someone clicks your ad.
Pre-LN vs. Post-LN
Refers to the placement of layer normalization in Transformer blocks: Pre-LN normalizes before attention/FFN, Post-LN after.
Pre-Training
The first training phase of an LLM where the model learns to understand and generate language from massive amounts of text (often trillions of tokens) – before specialized fine-tuning follows.
Precision
The proportion of correctly classified positive cases out of all cases classified as positive.
Precision and Recall
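Two complementary metrics for evaluating classification models, especially useful on imbalanced data: precision measures how many predicted positives are correct, and recall measures how many actual positives are found.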
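Both metrics follow from counts of true positives (TP), false positives (FP), and false negatives (FN): precision = TP / (TP + FP) and recall = TP / (TP + FN). A minimal sketch for binary labels, with made-up data:

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Imbalanced toy data: 3 positives out of 10 examples.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(precision)  # 0.5
```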
Precision@k
Measures how many of the top-k retrieved items are relevant (relevant items in top-k ÷ k).
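The ratio is straightforward to compute. In this sketch the document IDs and the relevance set are hypothetical.

```python
def precision_at_k(retrieved, relevant, k):
    """Relevant items among the top-k results, divided by k."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in retrieved[:k] if item in relevant)
    return hits / k


retrieved = ["d3", "d1", "d7", "d2", "d9"]  # ranked results, best first
relevant = {"d1", "d2", "d4"}               # ground-truth relevant IDs
p_at_5 = precision_at_k(retrieved, relevant, k=5)
print(p_at_5)  # 0.4
```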
Predictive Maintenance
AI-powered prediction of machine failures before they occur to prevent unplanned downtime.
Predictive Personalization
AI predicts what a customer needs next – and personalizes proactively before the customer knows it themselves.
Prefect
A modern, Python-native workflow orchestration tool, positioned as an alternative to Apache Airflow with a simpler API.
Preference Data
Datasets where humans (or AI judges) indicate which of two model responses is better – the training material for RLHF, DPO, and similar alignment methods.
Preference Optimization
Training or adjusting models using preference signals (A preferred to B) to improve alignment with desired outputs.
Prefill
The inference stage where the model processes the prompt to build the initial internal state before generating output tokens.
Prefill Latency
The time spent processing the input prompt before the model can start generating tokens.
Prefix Cache
Reuses computed model state (often KV cache) for repeated prompt prefixes, avoiding repeated prefill computation.
Prefix Caching
Prefix caching stores KV cache computations for frequently reused prompt prefixes (e.g., system prompts) and shares them between requests.
Prefix Tuning
A parameter-efficient adaptation technique where you learn small "prefix" vectors that steer attention layers, instead of fine-tuning all model weights.
PReLU (Parametric Rectified Linear Unit)
A ReLU variant with a learnable negative slope parameter – the leak factor is optimized during training.
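The activation itself is a one-liner: positive inputs pass through unchanged and negative inputs are scaled by the slope. The sketch below treats `alpha` as a plain argument; in a real network it would be a trainable parameter updated by the optimizer.

```python
def prelu(x, alpha):
    """PReLU: x for positive inputs, alpha * x otherwise."""
    return x if x > 0 else alpha * x


print(prelu(3.0, alpha=0.25))   # 3.0
print(prelu(-2.0, alpha=0.25))  # -0.5
```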
Pretraining
Training a model on large-scale data (often self-supervised) to learn general representations before task-specific adaptation.
Principle of Least Privilege
Giving users/services only the minimum permissions needed to perform their tasks—no more.
Privacy Budget
A quantitative measure (epsilon, ε) of the total privacy loss accumulated through repeated queries on privacy-protected data.
Privacy by Design
An approach where privacy protections are built into system architecture from the start, not bolted on later.
Privacy Enhancing Technologies (PETs)
The umbrella term for technologies enabling data utilization while maintaining privacy: DP, FHE, SMPC, TEEs, synthetic data, and more.
Privacy-Preserving Machine Learning
A set of techniques that reduce privacy risk when training or serving models.
Product Quantization (PQ)
A vector compression technique that approximates high-dimensional vectors using compact codes, enabling faster approximate nearest neighbor search.
Product Recommendation
AI system for predicting and displaying relevant products for each user.
Product-Market Fit
When a product satisfies a strong market demand—users repeatedly choose it, retention is healthy, and growth becomes easier.
Programmatic Advertising
Programmatic Advertising is the automated buying and selling of digital ad inventory through software and algorithms instead of manual negotiations.
Programmatic Internal Linking
Automatically creates and maintains internal links using rules, taxonomies, embeddings, and governance constraints.
Programmatic SEO (pSEO)
Creating many landing pages at scale using templates and data, targeting long-tail queries with consistent structure and internal linking.
Progressive Disclosure
A UX pattern that shows essential information first and reveals deeper detail on demand (expanders, tabs, "learn more").
Progressive Shrinking
A training technique that progressively shrinks a large network – first kernel, then depth, then width – to train a supernet supporting many subnetworks.
Prompt
The input (instructions + context + examples + constraints) provided to a language model to elicit a desired output.
Prompt A/B Testing
Comparing two prompt versions on real traffic to measure differences in outcomes and guardrails.
Prompt Budget
An explicit allocation of tokens for instructions, context, retrieved evidence, and examples.
Prompt Caching
An optimization technique where frequently used prompt prefixes are cached to reduce API costs and latency.
Prompt Chaining
Connecting multiple prompts where the output of one prompt serves as input for the next, to solve complex tasks.
Prompt Compression
Reduces prompt length while preserving essential constraints and context.
Prompt Engineering
The art and science of designing input prompts to obtain desired outputs from LLMs.
Prompt Hardening
Strengthening prompts and surrounding controls to resist misuse, injection, and unsafe outputs.
Prompt Injection
An attack technique that uses malicious inputs to manipulate the behavior of an AI system and bypass its safety guidelines.
Prompt Leakage
Unintended exposure of system prompts, hidden instructions, or sensitive context—through model outputs, logs, or UI/debug tools.
Prompt Leaking
Techniques to extract hidden system prompts from LLM applications.
Prompt Linting
Automated static analysis of prompts to detect issues before deployment (conflicts, missing constraints, unsafe phrasing).
Prompt Registry
A system for storing, versioning, testing, and governing prompts as production artifacts.
Prompt Regression Testing
Running a stable evaluation suite against prompt changes to detect quality, safety, format, and cost regressions.
Prompt Router
Selects the best prompt template (or workflow) for a request based on intent, difficulty, risk, and context.
Prompt Sandbox
A safe environment to test prompts with controlled data, tools, and logs before production.
Prompt Template
A reusable prompt structure with variables (placeholders) that can be filled dynamically.
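A minimal sketch using Python's standard-library `string.Template`; the placeholder names and wording are made up, and many teams use richer templating engines instead.

```python
from string import Template

# Hypothetical template with three placeholders.
summary_template = Template(
    "Summarize the following $doc_type in at most $max_sentences sentences:\n$text"
)

prompt = summary_template.substitute(
    doc_type="email",
    max_sentences=2,
    text="The launch moved to Friday. Please update the release notes.",
)
print(prompt)
```

`substitute` raises `KeyError` when a placeholder is left unfilled, which helps catch incomplete prompts before they reach the model.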
Prompt Tokens
The tokens consumed by the model's input (system instructions, user message, retrieved context, tool schemas, examples).
Prompt Tuning
Parameter-efficient method where only learnable token embeddings at the input are trained while the entire model stays frozen.
Propensity Modeling
Prediction of the probability that a customer will perform a specific action.
Prophet (Facebook/Meta)
An open-source forecasting tool developed by Meta that automatically models trend, seasonality, and holiday effects.
Provenance
Provenance is metadata that describes the origin, history, and transformation path of data or content—where it came from, how it changed, and who/what changed it.
Proximal Policy Optimization (PPO)
A reinforcement learning algorithm that updates policies in a constrained way to avoid overly large, unstable changes.
Pruning (Neural Network Pruning)
A model compression technique that removes unimportant weights or neurons from a neural network to reduce size and accelerate inference.
Pseudonymization
Replaces identifiers with pseudonyms so data can't be directly attributed to a person without additional information kept separately.
Public Key Infrastructure (PKI)
PKI is the system of certificates, CAs, policies, and lifecycle processes used to manage trust for public/private keys at scale.
Q
Q-Former
A Q-Former is a query-based transformer module used in some multimodal systems to extract and compress information from one modality.
Q-Function
The Q-function (action-value function) maps a state-action pair to expected return: Q(s, a).
Q-Learning
Q-learning is a reinforcement learning method that learns a value function Q(s, a) estimating the expected return of taking action a in state s.
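The standard tabular update is Q(s, a) ← Q(s, a) + α · (r + γ · max over a' of Q(s', a') − Q(s, a)). The sketch below applies one such update; the state names, action names, learning rate `alpha`, and discount `gamma` are illustrative assumptions.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s, a)."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q_table = defaultdict(float)  # unseen state-action pairs start at 0.0
actions = ["left", "right"]
new_value = q_update(q_table, "s0", "right", reward=1.0,
                     next_state="s1", actions=actions)
print(new_value)  # 0.5
```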
QA (Quality Assurance)
Quality assurance is the systematic process of ensuring outputs meet defined standards—correctness, consistency, safety, usability, and compliance.
QAT (Quantization-Aware Training)
Quantization-aware training trains a model while simulating quantization effects, improving accuracy after quantization compared to PTQ.
QBR (Quarterly Business Review)
A QBR is a structured quarterly review between a vendor/team and stakeholders to assess performance, outcomes, risks, roadmap, and priorities.
QDF (Query Deserves Freshness)
QDF is an SEO concept describing when search engines may prioritize fresher content for queries with strong recency intent.
Qdrant
Qdrant is a vector database used for storing embeddings and performing similarity search (often for RAG and semantic search).
QKV (Query–Key–Value)
QKV refers to the Query (Q), Key (K), and Value (V) matrices used in transformer attention mechanisms.
QLoRA (Quantized LoRA)
A combination of quantization and LoRA that enables fine-tuning of LLMs with drastically reduced memory requirements by quantizing the frozen base model (typically to 4-bit) while training only LoRA adapters in higher precision.
QoS (Quality of Service)
Quality of Service is the ability to prioritize and manage traffic so critical workloads meet performance guarantees.
QPS (Queries Per Second)
QPS measures how many queries a system can handle per second—often used for search services, APIs, and inference endpoints.
Quadratic Attention Cost
Quadratic attention cost refers to the classic computational scaling of full self-attention, which grows roughly with the square of sequence length (O(n²)).
Quality Drift
Quality drift is a gradual degradation of output quality over time due to changes in data, prompts, retrieval corpora, user behavior, or system dependencies.
Quality Filter
A quality filter is a rule or model that blocks, flags, or degrades outputs that fail quality criteria.
Quality Gates
Quality gates are automated (and sometimes human) checks that content or system changes must pass before release or publication.
Quality Score
Google's rating of the quality and relevance of ads and keywords.
Quality Score (Paid Search)
Quality Score is a platform metric that reflects expected ad quality and relevance, often influencing ad rank and CPC.
Quality-Adjusted Cost per Answer
Quality-adjusted cost per answer is cost-per-answer interpreted alongside quality metrics, ensuring cost savings don't come from degraded outputs.
Quality-of-Answer Score
A quality-of-answer score is a composite metric that estimates how good an AI answer is (usefulness, correctness, clarity, groundedness, safety).
Quantile
A quantile is a value below which a certain percentage of observations fall (e.g., p50/median, p95, p99).
Quantile Regression
Quantile regression predicts a chosen quantile of the target distribution (e.g., p90 outcome) rather than the mean.
Quantization
A compression technique that reduces the precision of model weights (and sometimes activations) from 32-bit or 16-bit floating point to lower-bit formats (INT8, INT4) to drastically reduce memory and computation requirements.
Quantization-Aware Training (QAT)
A training method that simulates quantization errors during training so the model learns to handle lower precision – higher quality than post-training quantization.
Quantum Machine Learning (QML)
Quantum machine learning explores using quantum computing concepts (qubits, superposition, entanglement) to accelerate or enhance certain ML computations.
Quarantine
Quarantine is isolating content, inputs, or events that are suspicious, unsafe, or low-trust so they cannot affect production outputs.
Quasi-Experiment
A quasi-experiment estimates causal effects without random assignment, using designs like difference-in-differences, regression discontinuity, or matching.
Quasi-Identifier
A quasi-identifier is a data attribute (or combination) that may not uniquely identify someone alone, but can identify them when combined with other attributes.
Query (Search Query)
A query is a user's search input—typed or spoken—that expresses intent and triggers retrieval, ranking, and results generation.
Query Cache
A query cache stores results of frequent queries so subsequent identical queries can be served faster and cheaper.
Query Embeddings
Query embeddings are vector representations of search queries used for semantic similarity matching against embedded documents/passages.
Query Expansion
Query expansion augments a query with additional terms or semantic signals to improve retrieval recall.
Query Fan-Out
Query fan-out is when one request triggers many downstream queries/tool calls to gather context or results.
Query Federation
Query federation executes a query across multiple systems/sources (databases, services, indexes) and combines results.
Query Likelihood Model
A query likelihood model is an information retrieval approach where documents are ranked by the probability that the document's language model would generate the query.
Query Optimizer
A query optimizer is the system component that chooses an efficient query plan, often based on statistics and heuristics.
Query Plan
A query plan is the execution strategy a database/search engine uses to answer a query (joins, index usage, filters, scan order).
Query Reranking
Query reranking reorders search/retrieval results using a stronger scoring function (often a cross-encoder or LLM-based scorer) to improve relevance at the top.
Query Rewrite
Query rewrite is modifying a search query to improve retrieval quality (recall/precision), often by clarifying intent, expanding terms, or normalizing vocabulary.
Query Rewriting
Transforming a user query into a form that yields better retrieval results.
Query Routing
Query routing sends a query to the most appropriate engine, model, index, or workflow based on intent, confidence, and constraints.
Query String (URL Parameters)
A query string is the part of a URL after ? that passes parameters (e.g., ?utm_source=...).
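Python's standard library can split a URL and decode its parameters. The URL below is a made-up example.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://example.com/pricing?utm_source=newsletter&utm_medium=email"

query = urlsplit(url).query  # everything after "?" (and before any "#")
params = parse_qs(query)     # each value is a list, since keys may repeat

print(query)   # utm_source=newsletter&utm_medium=email
print(params)  # {'utm_source': ['newsletter'], 'utm_medium': ['email']}
```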
Query Understanding Evaluation
Query understanding evaluation measures how well your system interprets user intent, entities, constraints, and risk level from queries.
Query-Time Filtering
Query-time filtering applies constraints during retrieval—such as permissions, tenant boundaries, recency windows, language, or document type.
Query-to-Content Mapping
Query-to-content mapping is the practice of aligning specific query intents to the most relevant page type, section layout, and next step (CTA).
Question Answering (QA)
Question Answering is a task where a system answers questions based on a corpus, knowledge base, or model knowledge.
Question Decomposition
Question decomposition breaks a complex question into smaller sub-questions that can be answered more reliably.
Queue
A Queue is a data structure following the FIFO principle (First In, First Out), where elements are processed in the order of their arrival.
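In Python, `collections.deque` is a common choice for a FIFO queue because appends and pops at both ends are efficient. The job names are placeholders.

```python
from collections import deque

jobs = deque()
jobs.append("job-1")  # enqueue at the back
jobs.append("job-2")
jobs.append("job-3")

first = jobs.popleft()  # dequeue from the front: first in, first out
print(first)        # job-1
print(list(jobs))   # ['job-2', 'job-3']
```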
Queue Depth
Queue depth is the number of pending messages/jobs waiting in a queue.
Queue Latency
Queue latency is the distribution of queue time (p50/p95/p99) for queued tasks.
Queue Time
Queue time is the time a request/job spends waiting in a queue before processing begins.
Queueing Theory
Queueing theory studies waiting lines (queues) to understand throughput, utilization, and latency under load.
Quick Fix
A quick fix is a small, fast change intended to mitigate an issue immediately (often a tactical patch), usually followed by a deeper root-cause fix.
Quickstart
A quickstart is a minimal, guided path that helps users achieve a first successful outcome quickly (often 5–15 minutes).
Quiet Period
A quiet period is a defined time window where teams avoid making changes that could confound measurement or increase risk.
Quorum
A quorum is the minimum number of participants/nodes required to agree or be present for a system to make a valid decision.
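One common choice is a simple majority quorum, sketched here as an assumption; real systems may use other thresholds (for example, separate read and write quorums). The function names are hypothetical.

```python
def majority_quorum(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    return total_nodes // 2 + 1


def has_quorum(votes: int, total_nodes: int) -> bool:
    """True if the received votes reach the majority quorum."""
    return votes >= majority_quorum(total_nodes)


print(majority_quorum(5), has_quorum(2, 5))
```

With a strict majority, any two quorums overlap in at least one node, which is why two conflicting decisions cannot both be accepted.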
Quota Exhaustion
Quota exhaustion occurs when a user/tenant reaches a quota limit and further actions are blocked or throttled.
Quota-Aware Routing
Quota-aware routing chooses models/workflows based on remaining quota and cost budgets (e.g., route simple queries to cheaper modes when budget is low).
Quotas
Quotas are enforced limits on usage of a resource (requests, tokens, compute, storage, tool calls) within a defined scope.
Quoted Query
A quoted query uses quotation marks to force exact phrase matching in some search engines/tools (behavior varies by engine).
Qwen
Alibaba's open-weight LLM family that competes with Llama and Mistral in many benchmarks and offers strong multilingual capabilities.
R
R-Squared (Coefficient of Determination)
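The proportion of variance in the target variable explained by the model, typically between 0 and 1 (it can be negative when the model fits worse than always predicting the mean).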
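A small sketch of the usual formula, R² = 1 − SS_res / SS_tot, in plain Python. The function name `r_squared` is an assumption for illustration, and the sketch assumes the targets are not all identical.

```python
def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total variance
    return 1 - ss_res / ss_tot


print(r_squared([1, 2, 3], [1, 2, 3]))  # perfect fit
print(r_squared([1, 2, 3], [2, 2, 2]))  # always predicting the mean
print(r_squared([1, 2, 3], [3, 2, 1]))  # worse than the mean
```

The third call yields a negative value, which shows that the score is not strictly bounded below by 0.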
R&D (Research & Development)
Systematic activities to gain new knowledge (research) and apply it to develop new products, services, or processes (development).
RAG (Retrieval-Augmented Generation)
Retrieval-Augmented Generation (RAG) is an architecture where an LLM generates an answer using retrieved external information (documents/chunks) as evidence, rather than relying only on its internal parameters.
RAG Chunking Strategy
A RAG chunking strategy defines how source documents are split into retrievable units (chunk size, overlap, structure preservation, metadata).
RAG Evaluation
The systematic evaluation of RAG systems across retrieval quality, answer relevancy, groundedness, and faithfulness.
RAG Poisoning
RAG poisoning is an attack or failure mode where the retrieval corpus is manipulated so that malicious or misleading content is retrieved as "evidence," degrading outputs or steering the system.
Ragas
Ragas is a popular evaluation approach/library for RAG systems that provides practical metrics and workflows to assess retrieval + generation quality.
Random Search
Hyperparameter tuning by randomly sampling from the parameter space; often more efficient than grid search for the same compute budget, especially when only a few hyperparameters matter.
Rasa
Rasa is an open-source framework for building Conversational AI – with NLU, Dialogue Management, and integrations for enterprise chatbots.
Rate Limiting
Rate limiting restricts how many requests (or actions) a client can perform in a given time window.
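One possible implementation is a sliding-window limiter, sketched below. The class name and parameters are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow; production limiters often use token buckets or a shared store instead.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for one client."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps: deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Drop requests that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=10.0)
decisions = [limiter.allow(t) for t in (0.0, 1.0, 2.0, 10.0)]
print(decisions)
```

The third request is rejected because two requests already sit inside the window; by t = 10.0 the first one has expired, so the fourth is allowed.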
Rate-Limit Backoff
Rate-limit backoff is adapting request behavior when receiving throttling signals (e.g., HTTP 429), typically by slowing down, retrying later, and/or shedding load.
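A common way to slow down is exponential backoff with jitter. The sketch below computes only the wait time; the function name, the defaults, and the "full jitter" variant are assumptions chosen for illustration.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based), with full jitter."""
    # The ceiling doubles with each attempt but never exceeds `cap`.
    ceiling = min(cap, base * 2 ** attempt)
    # A random fraction of the ceiling spreads retries from many clients apart.
    return rng() * ceiling


for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 3))
```

If the server sends an explicit hint such as a Retry-After header, honoring that value is usually preferable to a locally computed delay.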
Ray Serve
Scalable model serving framework based on Ray for real-time inference with composition patterns and auto-scaling.
RBAC (Role-Based Access Control)
RBAC assigns permissions to roles (e.g., "viewer," "editor," "admin") and assigns users/services to those roles.
RBAC/ABAC
RBAC (Role-Based Access Control) grants permissions via roles; ABAC (Attribute-Based Access Control) grants permissions via policies over attributes (user, resource, context).
RCA (Root Cause Analysis)
Root cause analysis is the process of identifying the underlying causes of an incident—not just symptoms—and defining corrective actions.
RDF
RDF (Resource Description Framework) is a standard model for data interchange on the web that represents information as subject-predicate-object triples (facts).
Re-Embedding
Re-embedding is regenerating embeddings for a corpus (documents/chunks) using the same or a new embedding model, then updating the vector index accordingly.
ReAct (Reason + Act)
ReAct is an agentic pattern where a model alternates between reasoning and taking actions (tool calls), incorporating observations before continuing.
ReAct (Reasoning + Acting)
A prompting paradigm that connects reasoning (thinking) and acting (doing) in a loop – the LLM thinks aloud, executes actions, and reflects on results.
Real-Time Bidding (RTB)
Auction-based real-time purchase of ad inventory per impression.
Real-Time Bidding (RTB)
Real-Time Bidding (RTB) is an automated auction process where ad inventory is auctioned off in milliseconds while a page loads.
Real-Time Personalization
Personalization that happens during the active session – every click immediately changes the experience.
Reasoning Model
AI models that perform and show explicit thinking steps before generating a final answer – optimized for complex reasoning.
Reasoning Models
A new class of LLMs (OpenAI o1, o3, DeepSeek R1) that perform explicit step-by-step reasoning before answering – "thinking" becomes visible and improves complex problem-solving.
Recall
The proportion of correctly identified positive cases out of all actual positive cases.
Recall@k
Recall@k measures how often the needed relevant item(s) appear within the top-k retrieved results.
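A minimal sketch for a single query, using the fraction-of-relevant-items variant (some teams instead report a hit rate: 1 if any relevant item is in the top k). The function name and document IDs are illustrative.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k retrieved results."""
    if not relevant:
        return 0.0  # Convention assumed here; undefined when nothing is relevant.
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


ranking = ["d3", "d1", "d7", "d2"]
print(recall_at_k(ranking, {"d1", "d2"}, k=2))
print(recall_at_k(ranking, {"d1", "d2"}, k=4))
```

Averaging this value over a set of evaluation queries gives the system-level Recall@k.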
Recency Bias
Recency bias is a tendency to overweight more recent information—either in human judgment or in system behavior (ranking, context usage).
Reciprocal Rank Fusion (RRF)
RRF combines multiple ranked result lists into one by summing each item's reciprocal ranks (typically 1/(k + rank) with a smoothing constant k), improving robustness when different retrieval methods excel on different queries.
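A short sketch of the fusion step, assuming the commonly used score 1/(k + rank) with k = 60. The function name and the tie-break by document ID are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists by summing 1 / (k + rank) per document."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):  # ranks start at 1
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


keyword_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Because only ranks are used, the lists can come from retrievers whose raw scores are not comparable.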
Recommendation Engine
System that generates personalized recommendations based on user behavior.
Recurrent Neural Network (RNN)
RNNs process sequences by passing a hidden state across timesteps – the original architecture for language and time series, now largely replaced by Transformers.
Recursion
A programming concept where a function calls itself to break down a problem into smaller, similar subproblems.
Red Teaming
The systematic attempt to find vulnerabilities and dangerous behaviors in AI systems before they are exploited by malicious actors.
Redaction
Redaction is removing or masking sensitive information (PII, secrets, credentials) from text, logs, documents, or outputs.
Redirects
Redirects forward a request from one URL to another, commonly using HTTP status codes like 301 (permanent) and 302 (temporary).
Reflection Agent
An agent pattern where the LLM critically evaluates its own outputs and iteratively improves them – like an internal code review.
Regression
ML method for predicting continuous numerical values.
Regression Testing
Regression testing ensures that changes (code, prompts, retrieval config, model versions) don't break existing behavior or quality.
Regularization
Techniques that prevent overfitting by constraining model complexity.
Reinforcement Learning
A learning paradigm where an agent learns by interacting with an environment to maximize rewards.
Reinforcement Learning (RL)
Reinforcement learning is a paradigm where an agent learns to make decisions by interacting with an environment and optimizing cumulative reward.
Relation Extraction
Relation Extraction identifies and classifies semantic relationships between entities in unstructured text.
ReLU (Rectified Linear Unit)
ReLU is the most widely used activation function in deep learning: f(x) = max(0, x). It is simple, fast, and helps mitigate vanishing gradients.
Remarketing
Re-engaging users who have already interacted with the brand.
Reparameterization Trick
The reparameterization trick enables backpropagation through stochastic sampling operations by treating randomness as an external variable.
Replicate
Cloud platform for hosting and running open-source ML models via API with Cog packaging.
Replit AI
The AI features of the cloud development platform Replit – from code assistant to autonomous app builder.
Reporting
The process of collecting, organizing, and presenting data in structured formats (reports, dashboards) to inform stakeholders and support decisions.
Reproducibility
Reproducibility is the ability to recreate the same (or equivalent) outputs and behavior given the same inputs, versions, and configuration.
Request Coalescing
Request coalescing merges multiple identical (or similar) concurrent requests into a single upstream request, then shares the result.
Reranker
A reranker is a model that re-scores and reorders retrieved candidates (documents/chunks) to improve relevance at the top.
Reranking
Reordering retrieval results with a more powerful model for better relevance.
Residual Connection
Residual connections add a layer's input to its output, allowing gradients to flow directly through deep networks.
ResNet
A CNN architecture with skip connections (residual connections) that enables training of very deep networks.
Response Generation
AI process for generating natural language responses.
Response Schema
A response schema is a formal structure the system requires for outputs (fields, types, required sections), often enforced with validation.
Response Streaming
Response streaming sends model output to the client incrementally as it's generated, improving perceived responsiveness (time-to-first-token).
Response Validation
Response validation checks that outputs meet required structure, policy constraints, and quality rules before display or execution.
Responsible AI
A holistic approach to developing and deploying AI systems that prioritizes ethical principles such as fairness, transparency, privacy, and human oversight.
Retargeting
Retargeting serves ads to users who previously interacted with your site or content, aiming to bring them back to convert.
RetNet (Retentive Network)
An architecture from Microsoft combining Transformer quality with linear inference complexity through a "retention" mechanism.
Retrieval Confidence
Retrieval confidence is a signal estimating whether retrieved results contain sufficient, relevant evidence to answer the query reliably.
Retrieval Drift
Retrieval drift is a change in retrieval behavior/quality over time due to corpus updates, embedding model changes, indexing settings, query distribution shifts, or metadata changes.
Retrieval-Augmented Generation
An AI architecture that connects Large Language Models with external knowledge sources by retrieving relevant documents and using them as context for response generation.
Retrieval-Augmented Generation (RAG)
A technique that combines LLM generation with external knowledge retrieval to provide more grounded and current responses.
Retrieval-First Policy
A retrieval-first policy forces the system to retrieve evidence before generating substantive answers, especially for factual or high-risk queries.
Retriever
A retriever is the component that selects candidate documents/chunks relevant to a query (keyword, vector, hybrid, or federated).
Retriever-Reranker Cascade
A retriever–reranker cascade is a two-stage retrieval approach: a fast retriever generates candidates, then a slower, more accurate reranker selects the best top-k.
Retry
A retry is re-attempting a failed operation (API call, tool call, retrieval request) to recover from transient errors.
Retry Storm
A retry storm is a feedback loop where failing requests trigger retries that increase load, causing more failures and even more retries.
Retryable Error
A retryable error is a failure that may succeed on retry (e.g., transient network issues, temporary overload, rate limiting).
Return on Investment (ROI)
Metric that compares the net gain from an investment to its cost, commonly computed as (gain − cost) / cost.
Reward Hacking
Reward hacking occurs when a model/agent finds ways to maximize reward without actually achieving the intended real-world goal.
Reward Model
A reward model scores model outputs according to a preference objective (helpfulness, safety, format compliance), often used in alignment-style training or evaluation.
RFM Analysis
Customer segmentation based on Recency, Frequency, and Monetary value.
RFP (Request for Proposal)
An RFP is a formal document organizations use to solicit vendor proposals for a project, often with requirements for security, compliance, delivery, and pricing.
Right to Explanation
The legal or ethical right of affected individuals to receive an understandable explanation for automated decisions.
Ring Attention
A distributed attention technique that splits long sequences across multiple GPUs and passes KV blocks in a ring between devices.
Risk Classification (AI Act)
Classification of an AI system into one of the four AI Act risk classes as the basis for applicable obligations.
Risk Register
A risk register is a structured list of risks, their likelihood/impact, mitigations, owners, and review cadence.
RLAIF (Reinforcement Learning from AI Feedback)
RLAIF uses AI-generated critiques or preferences (often from a judge model) as feedback signals to improve model behavior, reducing reliance on human labeling.
RLEF (Reinforcement Learning from Execution Feedback)
Training paradigm where a model learns from the actual outcome of its tool calls (code execution, API response, test pass) – not from human feedback.
RLHF (Reinforcement Learning from Human Feedback)
A training method that uses human feedback to make LLMs more helpful, safer, and better aligned; a key technique behind the alignment of modern ChatGPT-like models.
RMSE (Root Mean Squared Error)
The square root of MSE – has the same unit as the target variable.
RMSNorm (Root Mean Square Normalization)
A simplified variant of layer normalization using only root mean square without mean centering – faster and standard in LLaMA/Mistral.
RMSprop
Adaptive optimizer that solves AdaGrad's problem by using an exponentially weighted average of squared gradients instead of their sum.
RNN (Recurrent Neural Network)
A Recurrent Neural Network (RNN) is a neural network architecture for sequential data where neurons use their own output as additional input for the next time step — preserving context across sequences.
ROAS (Return on Ad Spend)
ROAS is revenue attributed to advertising divided by ad spend.
Robotics (AI)
The field of developing intelligent robots that use AI to autonomously perceive, plan, and execute tasks in the physical world.
Robustness Testing
Robustness testing evaluates how reliably a model or system performs under perturbations, edge cases, noise, or distribution shifts.
ROC Curve
A plot showing the True Positive Rate vs False Positive Rate across all classification thresholds.
Rollback
A rollback reverts a deployment/change to a previous known-good version (code, model, prompt, index, policy).
RoPE (Rotary Position Embedding)
A method for encoding positional information in Transformers by rotating Query and Key vectors, naturally capturing relative positions.
RoPE (Rotary Positional Embeddings)
RoPE is a positional encoding method that applies rotations to query/key vectors, enabling models to represent token positions in a way that supports relative position behavior.
ROUGE Score
A family of metrics for evaluating automatic text summarization by measuring n-gram and subsequence overlap with reference summaries.
Routing Policy
A routing policy is the rule set that decides which model/workflow/tools to use for a request based on intent, risk, confidence, and budgets.
Row Store
A row store database stores data row-by-row, optimizing for transactional workloads (OLTP) and retrieving full records efficiently.
RPO (Recovery Point Objective)
RPO is the maximum acceptable amount of data loss measured in time (e.g., "no more than 15 minutes of data").
RTO (Recovery Time Objective)
RTO is the maximum acceptable time to restore a service after an outage.
Runbook
A runbook is an operational guide for diagnosing and resolving specific incidents, including steps, decision points, and escalation paths.
Runway
A leading AI video platform with text-to-video, image-to-video, and advanced editing tools for creative professionals.
RWKV (Receptance Weighted Key Value)
An open-source architecture combining RNN efficiency (O(1) inference per token) with Transformer-like parallelizability during training.
S
S4 (Structured State Spaces)
The state space architecture combining HiPPO initialization with efficient convolution computation that helped spark wider interest in state space models (SSMs).
SaaS-pocalypse
Term for the thesis that many classic SaaS tools will be made obsolete by agentic AI workflows.
Safety
Safety in AI systems is the set of measures that prevent harmful, insecure, or policy-violating outputs and actions—especially under adversarial or ambiguous inputs.
Safety Alignment
Safety alignment is shaping model/system behavior so it reliably follows safety constraints (refusals, safe defaults, policy adherence) across normal and adversarial inputs.
Safety Case
A safety case is a structured argument—supported by evidence—that a system is acceptably safe for a specific context and risk profile.
Safety Classifier
A safety classifier is a model/rule system that detects unsafe content or risky intent (e.g., self-harm, hate, data exfiltration attempts, policy violations).
Safety Evaluation
Safety evaluation is the systematic testing of an AI system for harmful, policy-violating, insecure, or privacy-risk behavior—across normal and adversarial inputs.
Safety Filters
Safety filters detect and block or transform unsafe outputs (or unsafe inputs) based on policy (e.g., sexual content, violence, hate, self-harm, illegal instructions).
Safety Guardrails
Safety guardrails are mechanisms that constrain an AI system's behavior to reduce harm (policies, validators, permission boundaries, rate limits, refusals).
Safety Incident Taxonomy
A safety incident taxonomy is a structured classification system for AI safety incidents (what happened, severity, impact, root cause, mitigation).
Safety Training
The process of making LLMs safer through specialized training – includes RLHF, DPO, Constitutional AI, and red-teaming-based training.
Sales Qualified Lead
A Sales Qualified Lead (SQL) is a lead deemed ready for direct sales engagement based on qualification criteria (fit + intent + readiness).
Saliency Map
Visualization showing which input pixels or tokens have the greatest influence on model output, based on gradients.
SAM (Segment Anything Model)
A foundation model by Meta for universal image segmentation that can segment any object in an image with zero-shot capability.
SAML
SAML (Security Assertion Markup Language) is a standard for single sign-on (SSO) that exchanges authentication and authorization data between an identity provider and a service provider.
Sampling
Sampling is selecting a subset of data (or outcomes) from a larger population/process to estimate properties, reduce cost, or enable exploration.
Sampling Steps
Sampling steps are the number of iterative denoising iterations used during diffusion inference to generate an output.
Sampling Temperature
Sampling temperature rescales the model's logits before they are turned into probabilities: lower temperatures make outputs more deterministic; higher temperatures increase randomness.
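The effect can be sketched by dividing logits by the temperature before applying softmax. The function name and the example logits are illustrative assumptions.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Convert logits to probabilities after dividing by `temperature`."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # Subtracting the max keeps exp() numerically stable.
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print(sharp, flat)
```

At the lower temperature most probability mass sits on the top token; at the higher temperature the distribution is flatter, so sampling is more varied.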
Sandbox Environment
A sandbox environment is an isolated, non-production environment used to test workflows, integrations, prompts, and tool actions safely.
SARSA (State-Action-Reward-State-Action)
SARSA is an on-policy RL algorithm that updates Q-values based on the action actually taken – unlike Q-Learning's off-policy maximum.
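A minimal sketch of the tabular update, Q(s, a) ← Q(s, a) + α [r + γ Q(s', a') − Q(s, a)], where a' is the action actually chosen in the next state. The dictionary-based Q-table and the state/action names are assumptions.

```python
def sarsa_update(
    q: dict[tuple[str, str], float],
    state: str,
    action: str,
    reward: float,
    next_state: str,
    next_action: str,
    alpha: float = 0.1,
    gamma: float = 0.99,
) -> float:
    """Apply one SARSA update in place and return the new Q-value."""
    # On-policy target: uses the next action that was actually selected,
    # not the maximum over actions as Q-learning would.
    target = reward + gamma * q.get((next_state, next_action), 0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


q_table = {("s1", "right"): 2.0}
new_value = sarsa_update(q_table, "s0", "right", 1.0, "s1", "right", alpha=0.5, gamma=0.5)
print(new_value)
```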
Satisficing
Satisficing is choosing a solution that is 'good enough' to meet constraints, rather than optimizing for the absolute best.
Scalable Oversight
Methods to monitor and correct AI systems that exceed human capabilities – how do you oversee something smarter than yourself?
Scaled Dot-Product Attention
The base attention computation: Attention(Q,K,V) = softmax(QK^T / √d_k) · V – the mathematical foundation of all Transformers.
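The formula can be sketched in plain Python with nested lists standing in for matrices. This is an unmasked, single-head illustration; the function names are assumptions, and real implementations use batched tensor operations.

```python
import math


def softmax(row: list[float]) -> list[float]:
    peak = max(row)  # Subtract the max for numerical stability.
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q: list[list[float]], K: list[list[float]], V: list[list[float]]) -> list[list[float]]:
    """softmax(Q K^T / sqrt(d_k)) V for row-major matrices."""
    d_k = len(K[0])
    output = []
    for q in Q:
        # One scaled dot product per key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        output.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return output


result = attention(Q=[[0.0, 0.0]], K=[[1.0, 0.0], [0.0, 1.0]], V=[[1.0, 2.0], [3.0, 4.0]])
print(result)
```

With a zero query both keys score equally, so the output is the plain average of the two value vectors.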
Scaling Laws
Scaling laws are empirical relationships showing how model performance tends to improve predictably as you scale data, compute, and parameters.
Scenario Analysis
Scenario analysis evaluates outcomes under a set of coherent, plausible future conditions (scenarios), rather than changing one variable at a time.
Scene Understanding
AI ability to holistically understand complex visual scenes – objects, their relationships, context, and implicit meaning.
Schema
A Schema defines the structure, organization, and constraints of data – whether in databases, APIs, or structured data formats.
Schema Drift
Schema drift is when the expected structure of data changes over time (fields added/removed/renamed, types change, enums expand), often breaking pipelines.
Schema Validation
The process of verifying whether data (typically JSON) conforms to a defined schema – essential for reliable AI outputs and API integrations.
Schema-on-Read
Schema-on-Read is a data management approach where the structure of data is applied only at query time, not when storing.
Schema.org DefinedTerm
Schema.org DefinedTerm is structured data markup for representing a term and its definition in a machine-readable way.
Score Matching
Score matching learns the gradient of the log-probability density (score function) of a data distribution to generate samples via Langevin dynamics.
SCORM/xAPI
SCORM and xAPI (Experience API, "Tin Can") are standards for packaging, delivering, and tracking learning experiences in learning platforms.
SDK
An SDK (Software Development Kit) is a set of tools, libraries, and documentation that helps developers integrate with a platform or API.
SDLC is Dead
Thesis that the classic Software Development Lifecycle (analysis, design, code, test, deploy) is being replaced by agentic development loops.
Search AI Answers
Search AI answers are AI-generated responses presented directly in a search interface, often synthesizing information from multiple sources rather than returning only a list of links.
Search Algorithm
A procedure for systematically traversing a data space to find a specific element or identify a solution to a problem.
Search Console
Search Console (often referring to Google Search Console) is a tool for monitoring how a site performs in organic search: indexing, visibility, clicks, queries, and technical issues.
Search Engine Optimization (SEO)
Optimizing websites for better rankings in organic search results.
Search Intent
Search intent is the underlying goal behind a query—what the user is actually trying to accomplish (learn, compare, buy, troubleshoot, validate).
SearchGPT
OpenAI's real-time web search feature integrated into ChatGPT – combines conversation with current web information.
Seasonality
Regularly recurring patterns in time series that repeat at fixed intervals.
Secrets Management
Secrets management is securely storing, accessing, rotating, and auditing secrets such as API keys, tokens, and credentials.
Secure Aggregation
A cryptographic protocol that allows a server to compute aggregate values from individual contributions without seeing the individual values.
Secure by Design
Secure by design means security is built into system architecture from the start via safe defaults, least privilege, and defense-in-depth—rather than patched later.
Secure Egress Control
Secure egress control restricts and monitors outbound network access from systems to reduce data exfiltration risk (allowlists, proxies, DNS controls).
Secure Enclave
A secure enclave is a hardware-backed isolated execution environment designed to protect data and code while in use.
Secure Multi-Party Computation
A cryptographic protocol where multiple parties jointly compute a function without revealing their respective input data to each other.
Secure Tool Calling
Secure tool calling is executing actions via tools/APIs in a way that enforces authorization, validation, and safety—without relying on the LLM's good behavior.
Security
Security is protecting systems and data against threats by ensuring confidentiality, integrity, and availability (CIA), plus accountability and resilience.
Security Posture
Security posture is the overall security state of a system, measured by controls, configuration, monitoring coverage, and incident readiness.
Seedance
AI video generator by ByteDance with controversial training data origins and photorealistic results.
Segment Analysis
Segment analysis breaks metrics down by meaningful groups (segments) such as channel, device, region, customer tier, or intent.
Segmentation
Dividing a population into homogeneous groups based on shared characteristics.
Seldon Core
Kubernetes-native open-source platform for deploying, scaling, and monitoring ML models in production.
Selective Prediction
An approach where a model refuses uncertain predictions and delegates to humans or other systems.
Self-Attention
Attention mechanism in which each element of a sequence attends to the other elements of the same sequence to compute a context-aware representation.
Self-Consistency
Self-consistency is a technique where you sample multiple reasoning paths/answers and aggregate them (e.g., majority vote) to improve reliability.
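The aggregation step can be sketched as a majority vote over sampled final answers. Sampling itself is out of scope here; the function name and the whitespace/case normalization are assumptions.

```python
from collections import Counter


def majority_answer(answers: list[str]) -> tuple[str, float]:
    """Return the most frequent normalized answer and its vote share."""
    counts = Counter(answer.strip().lower() for answer in answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


sampled = ["42", "41", "42 ", "42"]  # Final answers from four sampled reasoning paths.
answer, agreement = majority_answer(sampled)
print(answer, agreement)
```

The vote share can double as a rough confidence signal: low agreement suggests the question deserves a retry or a human check.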
Self-Distillation
A variant of knowledge distillation where a model acts as its own teacher: the same model, or one with an identical architecture, supervises a new training run.
Self-Play
Self-Play is an RL training method where an agent plays against copies of itself, continuously improving through competition.
Self-Supervised Learning
Learning paradigm where the model generates labels from the data itself.
Self-tuning Systems
Self-tuning systems automatically adjust internal parameters to maintain or improve performance under changing conditions.
SELU (Scaled Exponential Linear Unit)
A self-normalizing activation function that drives activations toward zero mean and unit variance, which can remove the need for batch or layer normalization when paired with suitable initialization.
Semantic Caching
Semantic caching reuses past answers/results when a new query is semantically similar to a previous query, not necessarily identical.
Semantic Chunking
Semantic chunking splits documents into chunks based on meaning boundaries (topics/sections) rather than fixed token counts alone.
Semantic Router
A semantic router routes queries to the right workflow, toolset, or model using semantic signals (embeddings, intent classification, similarity to known categories).
Semantic Search
A search method that understands the meaning and context of queries rather than just matching exact keywords – enabling more natural and intelligent search results.
Semantic Segmentation
Pixel-level classification of image regions by object categories.
Semantic Versioning
Semantic versioning (SemVer) is a versioning convention: MAJOR.MINOR.PATCH, where MAJOR indicates breaking changes, MINOR indicates backward-compatible features, PATCH indicates backward-compatible fixes.
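A small sketch of parsing and bumping MAJOR.MINOR.PATCH strings. It deliberately ignores pre-release and build metadata suffixes, and the function names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump(version: str, part: str) -> str:
    """Return the next version; lower-order parts reset to zero."""
    major, minor, patch = parse_version(version)
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part}")


print(bump("1.4.2", "minor"))
print(parse_version("1.10.0") > parse_version("1.9.3"))
```

Comparing parsed tuples rather than raw strings matters: as text, "1.10.0" would sort before "1.9.3".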
Semantic Web
The Semantic Web is an extension of the World Wide Web that structures data in machine-readable formats so computers can understand and process their meaning.
Sensitivity Analysis
Sensitivity analysis evaluates how changes in inputs affect outputs, to understand robustness and key drivers.
Sensor Fusion
Combining data from multiple sensors (camera, LiDAR, radar, IMU) into a consistent environment model for more robust perception.
Sentence Transformers
A Python library and collection of models that produce semantically meaningful sentence embeddings – optimized for similarity search and clustering.
SentencePiece
Language-independent open-source tokenizer framework by Google that works directly on raw text without prior word segmentation.
Sentiment Analysis
The detection and classification of emotional tone (positive, negative, neutral) in text.
Sentiment Score
Numerical value that quantifies the emotional polarity of a text.
Sequence-to-Sequence
A model architecture that transforms an input sequence into an output sequence of variable length.
SERP Features
SERP features are non-traditional search result elements beyond the standard "10 blue links," such as featured snippets, knowledge panels, FAQs, and other enriched modules.
Server-Sent Events
Server-Sent Events (SSE) is a web technology that streams real-time updates from server to client over a single HTTP connection.
Server-Side Rendering
Server-Side Rendering (SSR) generates page HTML on the server per request (or per route) rather than relying entirely on client-side JavaScript.
Service Account
A service account is a non-human identity used by applications/services to authenticate to other systems and perform actions programmatically.
Service Level Agreement (SLA)
A Service Level Agreement (SLA) is a contract between service provider and customer that defines measurable quality standards such as availability, response times, and support levels.
Service Mesh
A service mesh is an infrastructure layer (often via sidecars or proxies) that manages service-to-service communication with consistent security, observability, and traffic policies.
Session
Period of user interaction with a website or app.
Session-Based Recommendation
Recommendations based on the current user session rather than historical profiles – ideal for anonymous visitors.
Sessionization
Sessionization groups user events into sessions to analyze behavior over time (page flows, search sequences, conversions).
SFT (Supervised Fine-Tuning)
Supervised fine-tuning (SFT) adapts a pretrained model using labeled input→output examples to shape behavior (format, style, task performance).
SFT (Supervised Fine-Tuning)
Training a pre-trained model on curated (input, output) pairs to adapt it to specific tasks or formats.
Shadow Deployment
A shadow deployment runs a new model/system version on real traffic without affecting user outputs, to evaluate behavior safely.
SHAP (Shapley Additive Explanations)
SHAP is a model explainability method based on Shapley values from cooperative game theory that attributes a prediction to individual features.
Sharding
Sharding partitions a dataset across multiple databases or nodes (shards) to scale storage and throughput.
Share of Model
Share of Model (SoM) is a 2025/26 marketing metric that measures how often a brand appears as a source, example, or recommendation in answers from generative AI models (ChatGPT, Claude, Gemini, Perplexity), relative to competitors in a defined topic set.
Share of Search
Share of search estimates brand demand by measuring the proportion of search queries for your brand vs competitors (or vs category).
Sharpness-Aware Minimization (SAM)
Optimization method that minimizes not only the loss but also the "sharpness" of the loss landscape – finds flatter minima for better generalization.
Shortest Path
A graph problem of finding the optimal (shortest, fastest, or cheapest) route between two nodes.
Siamese Network
A Siamese network is a neural architecture with two (or more) identical subnetworks that learn to compare inputs by producing embeddings and measuring similarity.
SIEM
SIEM (Security Information and Event Management) is a system that aggregates security logs/events for detection, investigation, and compliance reporting.
Sigmoid Function
The Sigmoid function σ(x) = 1/(1+e^(-x)) maps any real value to the range (0, 1); historically important as an activation function, today used primarily for binary classification.
Signal-to-Noise Ratio
Signal-to-noise ratio (SNR) is the proportion of meaningful information ("signal") relative to irrelevant or misleading information ("noise").
Signed Webhook
A signed webhook includes a cryptographic signature so the receiver can verify the request really came from the sender and wasn't tampered with.
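One widely used scheme is an HMAC-SHA256 over the raw request body with a shared secret. The sketch below assumes that scheme; the function names, the secret, and the payload are illustrative, and each provider defines its own header name and encoding.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign(secret, body), signature)


secret = b"example-shared-secret"  # Hypothetical; load real secrets from a secret store.
body = b'{"event": "order.created"}'
signature = sign(secret, body)

print(verify(secret, body, signature))
print(verify(secret, b'{"event": "order.deleted"}', signature))
```

Signing the exact bytes received matters: re-serializing the JSON before verification can change whitespace or key order and break the check.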
SiLU / Swish
SiLU/Swish = x · σ(x) – a smooth, self-gated activation function that outperforms ReLU in many benchmarks and is the basis of SwiGLU.
Sim-to-Real Transfer
Transferring AI models trained in simulation to real physical systems – train in the virtual world, deploy in the real one.
SimCLR
SimCLR (Simple Contrastive Learning of Visual Representations) is a framework for self-supervised learning that learns visual representations by comparing augmented image versions.
SimHash
SimHash is a fingerprinting method that produces a compact hash where similar documents tend to have similar hashes (small Hamming distance).
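A compact sketch of the idea: every token votes on each bit of the fingerprint, and the sign of the vote total decides the bit. Whitespace tokenization, unweighted tokens, and MD5 as the per-token hash are simplifying assumptions.

```python
import hashlib

BITS = 64


def simhash(text: str) -> int:
    """64-bit SimHash over lowercase whitespace tokens."""
    votes = [0] * BITS
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[: BITS // 8], "big")
        for i in range(BITS):
            # Each token pushes bit i up or down depending on its own hash.
            votes[i] += 1 if (token_hash >> i) & 1 else -1
    return sum(1 << i for i, vote in enumerate(votes) if vote > 0)


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


doc_a = simhash("the quick brown fox jumps over the lazy dog")
doc_b = simhash("the quick brown fox jumped over the lazy dog")
print(hamming_distance(doc_a, doc_b))
```

Near-duplicate detection then reduces to flagging pairs whose Hamming distance falls below a chosen threshold.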
Similarity Score Calibration
Similarity score calibration maps raw similarity scores (from embeddings/rerankers) to more reliable confidence signals (e.g., probabilities or risk bands).
Similarity Search
Similarity search finds items most similar to a query under a similarity metric (cosine similarity, dot product, etc.), commonly used with embeddings.
Similarity Thresholding
Similarity thresholding sets cutoff values on similarity scores (embedding similarity, reranker scores) to decide actions like "use cache," "retrieve more," or "ask a clarifying question."
SimPO (Simple Preference Optimization)
A simplified version of DPO that works without a reference model and uses length-normalized reward.
Simpson's Paradox
Simpson's paradox is when a trend appears in multiple groups but reverses or disappears when the groups are combined, due to confounding and aggregation.
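The reversal is easiest to see with numbers. The sketch below uses illustrative (successes, trials) counts for two treatments across two subgroups of different sizes; the group and treatment names are placeholders:

```python
# (successes, trials) per treatment within each subgroup; illustrative counts.
groups = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    return successes / trials


# Which treatment has the higher success rate inside each subgroup?
per_group_winner = {
    name: max(arms, key=lambda arm: rate(*arms[arm]))
    for name, arms in groups.items()
}

# Pool the subgroups and compare again.
totals = {}
for arms in groups.values():
    for arm, (successes, trials) in arms.items():
        pooled_s, pooled_n = totals.get(arm, (0, 0))
        totals[arm] = (pooled_s + successes, pooled_n + trials)

overall_winner = max(totals, key=lambda arm: rate(*totals[arm]))
```

Here A wins inside both subgroups, yet B wins after pooling, because A was applied mostly to the harder "large" cases.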
Simulation
The imitation of a real or hypothetical system or process in a controlled virtual environment.
Single Sign-On
Single Sign-On (SSO) lets users authenticate once via an identity provider and access multiple services without separate logins (often via SAML or OIDC).
Single Sign-On (SSO)
Single Sign-On (SSO) enables users to authenticate once with an identity provider (IdP) and access multiple applications without re-authenticating for each.
Sinusoidal Positional Encoding
The original positional encoding from the Transformer paper using sine and cosine functions of different frequencies.
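A plain-Python sketch of the encoding, assuming the usual formulation in which even dimensions use sine, odd dimensions use cosine, and frequencies decrease geometrically with base 10000:

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings."""
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # i is the even dimension index, so the exponent is i / d_model.
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


pe = positional_encoding(seq_len=3, d_model=4)
```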
Site Architecture
Site architecture is how pages are structured and linked (hierarchy, hubs, navigation, internal linking) to support discoverability and user journeys.
Sitemap
A sitemap (often XML) is a machine-readable file that helps search engines discover URLs and understand update patterns.
Skip Connection
Skip connections forward the input of a layer directly to the output of later layers – the core mechanism making 100+ layer deep networks trainable.
SLA (Service Level Agreement)
An SLA is a contractual commitment to service performance (e.g., uptime), often with remedies/credits if not met.
SLAM (Simultaneous Localization and Mapping)
An algorithm that enables a robot or vehicle to simultaneously determine its position and create a map of the environment.
SLI (Service Level Indicator)
An SLI is the measurable metric used to evaluate whether an SLO is being met (latency, error rate, correctness proxy, cost per answer).
Sliding Window Attention (SWA)
An attention variant where each token only attends to a limited number of previous tokens (window) instead of the entire sequence.
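One way to picture this is as a boolean attention mask. The sketch below assumes a causal window that includes the current token itself (conventions for counting the window differ between implementations):

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when token i may attend to token j."""
    return [
        [(i - window < j <= i) for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(seq_len=5, window=2)
```

With `window=2`, the last token attends only to itself and its immediate predecessor, so the number of attended positions per token stays constant as the sequence grows.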
SLO (Service Level Objective)
An SLO is a target level of service performance/reliability (e.g., 99.9% availability, p95 latency < 2s).
Slot Filling
Extraction of specific parameters from user utterances for conversational AI.
Small Language Model
A Small Language Model (SLM) is a language model with comparatively few parameters, designed for lower latency, lower cost, and easier deployment. It is often used for narrow tasks or as part of a routed system.
Small Language Models
Language models with significantly fewer parameters than large LLMs (typically 1-7B instead of 100B+), optimized for specific tasks and capable of running locally or on edge devices.
SMOTE (Synthetic Minority Over-sampling Technique)
Algorithm that generates synthetic examples for the minority class by interpolating between existing data points.
Snorkel
Snorkel is a framework for programmatic data labeling that uses labeling functions instead of manual annotation to efficiently create large training datasets.
Snowflake
Snowflake is a cloud-native data warehouse platform that separates storage and compute, enabling scalable data analysis with SQL.
SOC 2
SOC 2 is an attestation framework focused on controls related to security, availability, processing integrity, confidentiality, and privacy.
Social Proof
Psychological principle where people follow the behavior of others.
Soft Prompt
A soft prompt is a learned vector representation (rather than human-written text) used to steer a model's behavior—often trained as a small set of prompt embeddings.
Softmax
Function that converts a vector of logits into a probability distribution (non-negative values that sum to 1).
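A small sketch of the computation, softmax(z)_i = e^(z_i) / Σ_j e^(z_j), with the common trick of subtracting the maximum logit first so large inputs do not overflow:

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    largest = max(logits)
    # Shifting by the maximum leaves the result unchanged but avoids overflow.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
```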
Software Bill of Materials (SBOM)
A Software Bill of Materials (SBOM) is an inventory of software components and dependencies used in a system (libraries, versions, suppliers).
Solomonoff Induction
Solomonoff induction is a theoretical framework for optimal prediction that combines Bayesian inference with algorithmic complexity, weighting hypotheses by how simply they describe the data.
Sora
OpenAI's revolutionary text-to-video model that generates photorealistic videos up to one minute from text descriptions.
Sora 2
The second generation of OpenAI's text-to-video model with improved quality, longer clips, and more realistic physics simulation.
Source Attribution
Source attribution is explicitly indicating where information came from (documents, URLs, internal systems), often via citations or links.
Source Grounding
Source grounding is constraining an AI system to base its answers on provided sources (retrieved documents, tools, or approved references) rather than unverified model knowledge.
Source Separation
Source Separation separates a mixed audio signal into individual sources – e.g., vocals, drums, bass, and instruments from a song.
Space Complexity
Space complexity describes how an algorithm's memory usage grows with input size (often using Big-O notation).
spaCy
Industrial-strength open-source NLP library in Python for tokenization, NER, POS tagging, dependency parsing, and more.
SPARQL
SPARQL is the W3C standard query language for RDF graphs, enabling structured queries over Knowledge Graphs and Linked Data.
Sparse Attention
Sparse attention reduces attention computation by allowing tokens to attend only to a subset of other tokens (patterned or learned sparsity).
Sparse Autoencoder
A Sparse Autoencoder (SAE) is an autoencoder trained with a sparsity constraint so that only a small subset of features activate for any given input.
Sparse Mixture of Experts (SMoE)
An architecture where only a small fraction of all "expert sub-networks" is activated per input – enabling huge model capacity with efficient inference.
Sparse Model
A neural network where only a small portion of weights or activations are used for each computation, significantly increasing efficiency.
Sparse Retrieval
Sparse retrieval uses sparse representations (often term-frequency based) such as BM25 to retrieve documents by lexical match.
Sparse Training
Training with sparsity from the start – instead of "train dense, then prune," the model stays sparse from the beginning with connections dynamically added/removed.
Speaker Diarization
Speaker diarization identifies "who spoke when" in an audio recording by segmenting audio into speaker-labeled turns.
Specificity
The proportion of correctly classified negative cases out of all actual negative cases.
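In formula form this is TN / (TN + FP). A minimal sketch, assuming binary labels where 0 marks the negative class (the label lists are made up):

```python
def specificity(y_true, y_pred):
    """True negatives divided by all actual negatives."""
    pairs = list(zip(y_true, y_pred))
    true_negatives = sum(1 for t, p in pairs if t == 0 and p == 0)
    false_positives = sum(1 for t, p in pairs if t == 0 and p == 1)
    return true_negatives / (true_negatives + false_positives)


y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
score = specificity(y_true, y_pred)  # 3 true negatives, 1 false positive
```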
Spectral Normalization
Spectral Normalization constrains the Lipschitz constant of network layers by normalizing with the largest singular value – standard stabilization in GANs.
Speculative Decoding
An inference acceleration technique where a small "draft model" quickly proposes multiple tokens and a large "verifier model" verifies them in parallel – up to 3x faster generation.
Speech Enhancement
Speech Enhancement improves speech recording quality by removing noise, reverb, and interference – often as preprocessing for ASR.
Speech Synthesis
Artificial generation of human speech from text (text-to-speech).
Speech-to-Text
Technology for converting spoken language into written text – the foundation for voice assistants and transcription.
Speech-to-Text (STT)
Speech-to-Text (STT) converts spoken audio into written text using automatic speech recognition (ASR) models.
Split Testing
Synonym for A/B testing – comparing variants for optimization.
SRE
Site Reliability Engineering (SRE) applies software engineering practices to operations to achieve reliable, scalable systems using SLOs, automation, and incident discipline.
Stability AI
The company behind Stable Diffusion, one of the most widely used open-source models for AI image generation.
Stable Diffusion
The leading open-source model for text-to-image generation, enabling local execution and fine-tuning on consumer hardware.
Stack
A Stack is a fundamental data structure following the LIFO principle (Last In, First Out), where the last added element is removed first.
Staging Environment
A staging environment is a pre-production environment designed to mirror production as closely as possible for final validation.
Stanza (Stanford NLP)
Stanford's Python NLP library with state-of-the-art neural models for tokenization, POS, NER, and parsing in 70+ languages.
State Space Model (SSM)
A class of sequence models based on continuous state space theory offering linear scaling O(N) instead of quadratic attention O(N²).
State Space Models (SSMs)
State Space Models (SSMs) are sequence models that maintain a latent "state" that evolves over time to process sequential data efficiently.
State Transition System
A state transition system models a system as states and transitions that move it from one state to another.
Statefulness
Statefulness describes whether a system retains information across interactions (stateful) or treats each request independently (stateless).
Static Site Generation
Static Site Generation (SSG) builds pages ahead of time into static HTML (often deployed on a CDN) for very fast delivery and high reliability.
Stationarity
A time series is stationary when its statistical properties remain constant over time.
Statistical Significance
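Statistical significance indicates that an observed effect is unlikely to be explained by chance alone. It is assessed via the p-value (the probability of seeing an effect at least as extreme if there were no real effect), compared against a predefined threshold (usually 0.05).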
Steering Vector
A steering vector is a direction in a model's internal representation space that, when added or applied to activations, can bias outputs toward or away from certain behaviors or attributes.
Stemming
Rule-based reduction of words to their stem by removing suffixes.
Step Decay (Learning Rate)
Simplest learning rate schedule strategy that reduces the LR by a factor after fixed intervals (epochs or steps).
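As a sketch, the schedule can be written as lr = initial_lr · factor^floor(epoch / step_size); the parameter names and values below are illustrative:

```python
def step_decay(initial_lr, factor, step_size, epoch):
    """Learning rate after `epoch` epochs, dropped by `factor` every `step_size` epochs."""
    return initial_lr * factor ** (epoch // step_size)


schedule = [step_decay(0.1, 0.5, 10, epoch) for epoch in (0, 9, 10, 25)]
```

Epochs 0 and 9 share the initial rate, epoch 10 sees the first drop, and epoch 25 has been reduced twice.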
Stochastic Gradient Descent (SGD)
Variant of gradient descent that uses a single example or a small mini-batch per update instead of the full dataset – faster per step and often better generalizing.
Stochastic Parrot
Stochastic parrot is a critique framing that highlights how LLMs can generate fluent text by pattern-matching from training data without true understanding—raising concerns about bias, misinformation, and misuse.
Stochastic Weight Averaging (SWA)
Training technique that averages model weights over multiple checkpoints to find flatter minima and better generalization.
Stop Sequence
A stop sequence is a token/string pattern that tells a model to stop generating when encountered.
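Inference APIs usually apply stop sequences during generation; the post-hoc sketch below only illustrates the effect on the final text (the function name and example strings are hypothetical):

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut the text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw = "Answer: 42\nUser: next question"
trimmed = apply_stop_sequences(raw, ["\nUser:", "###"])
```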
Stopword Removal
Removing high-frequency words without semantic content (the, a, is, and, of) from text before processing.
Stratified Sampling
Sampling method that ensures class/group proportions in the sample match the overall distribution.
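A minimal sketch that samples the same fraction from every group, assuming each item can be mapped to its group by a key function (the data and the 10% fraction are made up):

```python
import random
from collections import defaultdict


def stratified_sample(items, key, fraction, seed=0):
    """Draw `fraction` of the items from each group defined by `key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    sample = []
    for members in groups.values():
        n = round(len(members) * fraction)
        sample.extend(rng.sample(members, n))
    return sample


data = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
sample = stratified_sample(data, key=lambda item: item[0], fraction=0.1)
```

The 80/20 split of the full data is preserved in the sample, which a plain random draw of ten items would not guarantee.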
Streaming (Token Streaming)
Outputting LLM tokens as they are generated instead of waiting for the complete response.
Streaming ASR
Streaming ASR transcribes speech in near real-time as audio arrives, rather than after the full recording is complete.
Streaming Data
Continuous data flow that is processed in real-time.
Streaming Responses
A technique where LLM responses are transmitted token by token, instead of waiting for complete generation – dramatically improves perceived latency.
STRIPS
STRIPS is a classical planning formalism where actions are defined by preconditions and effects (add/delete lists) over symbolic state predicates.
Structured Data
Structured data is machine-readable metadata (often JSON-LD) embedded in pages to help systems understand content entities and relationships.
Structured Logging
Structured logging records logs in a consistent, machine-parseable format (fields like request_id, tenant_id, route, model_version, latency_ms) rather than free-form strings.
Structured Output
Structured output is requiring the model to produce outputs in a predefined structure (JSON, YAML, sections with strict headings), often enforced with validation.
Structured Outputs
Techniques and API features that force LLMs to return responses in exactly defined formats like JSON schemas – essential for reliable AI integrations.
Structured Pruning
A pruning variant that removes entire structures (neurons, filters, attention heads, layers) instead of individual weights – delivers real speedups without specialized sparse hardware.
Style Transfer
Style transfer modifies an image (or text) to match a target style while preserving core content.
StyleGAN
StyleGAN is NVIDIA's groundbreaking GAN architecture that generates photorealistic faces and images with unprecedented control over style and details.
Subject Consistency
The ability of an AI image generator to consistently render characters and objects across multiple images.
Summarization
Summarization is generating a shorter representation of content while preserving key meaning—extractive (selecting parts) or abstractive (rewriting).
Super Resolution
Super resolution increases the resolution of images or videos using AI – reconstructing details not present in the original.
Superalignment
The research problem of how to keep AI systems that are smarter than humans (superintelligence) safe and controllable.
Superposition
Superposition in neural networks describes how multiple features can be represented in overlapping directions within a limited-dimensional space, rather than one feature per neuron.
Supervised Learning
ML paradigm where the model learns from labeled examples (input-output pairs).
Superwise
An AI observability and monitoring platform that tracks performance using 100+ metrics and generates real-time incident reports.
Supply Chain Security
Supply chain security protects software and AI dependencies (libraries, containers, build pipelines, models, datasets) from tampering and compromise.
Supply-Side Platform (SSP)
A Supply-Side Platform (SSP) is technology that publishers use to automatically sell their ad inventory to ad exchanges and DSPs.
Surrogate Model
A simple, interpretable model that approximates a complex black-box model to explain its decisions.
Survival Analysis
Statistical method for analyzing time until an event occurs (e.g., churn, conversion, failure), accounting for censored data.
SWE-Bench (Software Engineering Benchmark)
A benchmark that tests LLMs by having them resolve real issues from GitHub repositories – one of the more realistic tests of AI coding abilities.
SwiGLU
An activation function for Transformer FFN blocks combining Swish gating with linear projection, standard in modern LLMs like LLaMA.
Sycophancy
Sycophancy is an LLM behavior where the model overly agrees with the user's stated beliefs or incorrect premises instead of correcting them.
Synthetic Data
Artificially generated data that replicates statistical properties of real data – used for training, testing, and privacy protection when real data is scarce, sensitive, or expensive.
Synthetic Media
Umbrella term for all media content (text, image, audio, video) that has been wholly or partially created or manipulated by AI.
Synthetic Monitoring
Synthetic monitoring runs automated, scripted checks to simulate user actions and detect failures before users report them.
SynthID
Google's technology for invisible digital watermarks in AI-generated images, videos, and audio for provenance marking.
System Prompt
A special prompt category that defines the base behavior, persona, and rules for an AI session.
T
Talking Head Generation
AI technology that generates a realistic video of a speaking person from a single portrait photo and audio input.
Tanh (Hyperbolic Tangent)
An activation function that maps values to the range (-1, 1) – zero-centered, with a steeper slope around zero than sigmoid.
Taxonomy
A Taxonomy is a hierarchical classification system that organizes concepts, content, or entities into ordered categories and subcategories.
Technical SEO
Technical SEO is optimizing the technical foundation of a site so search engines can crawl, index, and render content efficiently (and users get fast, stable UX).
Technological Singularity
A hypothetical point at which technological progress (especially AI) becomes so rapid and profound that it fundamentally and unpredictably transforms human civilization.
Temperature
A parameter that controls randomness in LLM output.
Temperature (Sampling)
A parameter controlling the "creativity" of LLM outputs: Low values (0-0.3) produce focused, near-deterministic responses; high values (0.7-1.0) bring variation and surprises.
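Mechanically, temperature is usually applied by dividing the logits by T before the softmax. A small sketch with made-up logits (a temperature of exactly 0 is treated here as an error, since it corresponds to greedy argmax rather than sampling):

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use argmax for greedy decoding")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, 0.2)  # mass concentrates on the top token
flat = softmax_with_temperature(logits, 1.5)   # mass spreads across tokens
```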
Temperature Scaling
A post-hoc calibration method that uses a single parameter (temperature) to adjust model confidence values.
Temporal Difference Learning (TD)
TD learning updates value estimates based on the difference between successive predictions – learns from incomplete episodes through bootstrapping.
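The simplest variant, TD(0), can be sketched as V(s) ← V(s) + α · (r + γ · V(s') − V(s)); the states, reward, and hyperparameters below are illustrative:

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one TD(0) update in place and return the TD error."""
    td_target = reward + gamma * values[next_state]  # bootstrapped estimate
    td_error = td_target - values[state]
    values[state] += alpha * td_error
    return td_error


values = {"A": 0.0, "B": 1.0}
error = td0_update(values, "A", reward=0.5, next_state="B")
```

The update uses the current estimate of the next state rather than the final return, which is what allows learning before an episode ends.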
Temporal Graph Network
A GNN for time-evolving graphs that models the evolution of nodes and edges over time.
Tensor Parallelism
A parallelization strategy that splits individual tensor operations (matrix multiplications) across multiple GPUs – necessary for layers too large for one GPU.
TensorRT-LLM
NVIDIA's optimized inference engine for LLMs that achieves maximum performance on NVIDIA GPUs through kernel fusion, quantization, and tensor parallelism.
Test-Time Compute
Compute that an LLM spends at inference time for extended reasoning instead of producing a direct answer.
Test-Time Training (TTT)
A paradigm where a model adapts to each new input during inference by optimizing a self-supervised loss on the test instance – "learning while predicting".
Text Classification
Automatically assigning texts to predefined categories using a machine learning model.
Text Generation
Text generation is the automatic creation of text by AI models, typically based on a prompt or context.
Text Normalization
Standardizing text data by converting to a uniform form – lowercasing, Unicode normalization, character replacement, and more.
Text Summarization
Automatically generating a shorter version of a text while retaining the most important information.
Text-to-3D
Text-to-3D generates three-dimensional objects and scenes from natural language text descriptions using AI.
Text-to-Image
AI generation of images from text descriptions – the breakthrough that democratized creative work.
Text-to-Speech
Technology for converting written text into natural-sounding speech – today mostly using neural models.
Text-to-Video
AI technology that generates complete videos with moving images, people, and scenes from text descriptions.
Textual Inversion
Textual Inversion learns a new word embedding for a concept from a few images, without modifying the diffusion model itself.
TF-IDF
Statistical measure for evaluating the relevance of a word in a document relative to a document collection.
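A minimal sketch using one common variant, tf = count / document length and idf = ln(N / document frequency); libraries often apply smoothing or different scaling, and the tiny corpus is made up:

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "barked"],
    ["the", "cat", "purred"],
]
scores = tf_idf(docs)
```

A word that appears in every document ("the") scores zero, while a word unique to one document ("sat") scores highest there.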
TFX (TensorFlow Extended)
Google's end-to-end platform for deploying production-ready ML pipelines based on TensorFlow.
Thompson Sampling
Bayesian bandit algorithm that samples a plausible reward for each action from its posterior and plays the best sample, so each action is chosen with the probability that it is optimal.
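A sketch for the Bernoulli-reward case, assuming a uniform Beta(1, 1) prior per arm and simulated rewards (the true success rates are hypothetical and exist only to drive the simulation):

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling; return pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        arm = samples.index(max(samples))
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
```

Early on all arms get explored; as evidence accumulates, the posterior of the best arm dominates and it receives most of the pulls.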
Threat Modeling
Threat modeling is a structured process for identifying assets, attack surfaces, threats, and mitigations to reduce security risk.
Throughput
The number of tokens or requests a system can process per time unit – a key measure for ML inference efficiency.
tiktoken
OpenAI's fast BPE tokenizer library for GPT models, written in Rust with Python bindings.
Time Complexity
Time complexity describes how an algorithm's runtime grows as input size increases, often expressed using Big‑O notation (e.g., O(log n), O(n), O(n²)).
Time Series
Sequence of data points ordered in time.
Time Series Analysis
Analysis of data points collected over time to identify patterns.
Time Series Foundation Model
Pre-trained Transformer models for time series enabling zero-shot forecasting without specific training.
Time-to-First-Token (TTFT)
The time from request to first generated token – critical for perceived responsiveness of AI applications.
TinyML
Machine learning on microcontrollers and ultra-low-power devices with just a few kilobytes of RAM – AI on a chip smaller than a coin.
TLS (Transport Layer Security)
TLS (Transport Layer Security) is a cryptographic protocol that secures network communication by providing encryption, integrity, and endpoint authentication.
Together AI
Cloud platform for training and inference of open-source AI models with optimized GPU infrastructure.
Tokenization
The process of breaking text into smaller units (tokens) that can be processed by language models – from whole words to subword pieces to individual characters.
Tool Use
The ability of LLMs to call external tools and APIs – from calculators to web search to databases and custom functions.
Top-k Sampling
A sampling parameter that restricts selection to the k most likely tokens, regardless of their absolute probabilities.
Top-p (Nucleus Sampling)
A sampling parameter that restricts selection to the smallest set of most likely tokens whose cumulative probability reaches p.
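Top-k and top-p can both be sketched as filters over a next-token distribution. The example assumes the common nucleus formulation in which tokens are added in descending order of probability until the running total reaches p; the token probabilities are made up:

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def top_p_filter(probs, p):
    """Keep the smallest high-probability set whose mass reaches p, then renormalize."""
    kept, cumulative = [], 0.0
    for token, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
by_k = top_k_filter(probs, k=2)
by_p = top_p_filter(probs, p=0.9)
```

Top-k always keeps a fixed number of candidates, whereas top-p keeps more candidates when the distribution is flat and fewer when it is peaked.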
Topic Modeling
Unsupervised ML method for discovering abstract topics in document collections.
TorchServe
PyTorch's official model serving framework for deploying PyTorch models in production.
Touchpoint
Every contact point between customer and brand on the customer journey.
Toxicity Detection
ML systems that automatically detect and classify toxic, offensive, or hateful content.
TPU (Tensor Processing Unit)
A specialized AI chip developed by Google, optimized for matrix multiplications in neural networks, working significantly more efficiently than GPUs for certain AI workloads.
Transfer Learning
Using knowledge learned from one task to improve performance on a related task.
Transformer
A neural network architecture that uses self-attention to model relationships between all positions in a sequence.
Transformer Architecture
The revolutionary neural network architecture from 2017 ("Attention Is All You Need") that replaced RNNs and forms the foundation of nearly all modern LLMs such as GPT, Claude, and Gemini.
Transparency
The disclosure of how AI systems work, what data they use, and how decisions are made.
Treatment Effect (ATE/CATE)
The causal effect of an intervention (treatment) on an outcome. ATE is the average, CATE the conditional effect for subgroups.
Tree of Thoughts (ToT)
Prompting strategy where the LLM explores multiple reasoning paths in parallel, evaluates them, and selects the best – like a decision tree for thought chains.
Triplet Loss
A loss function for metric learning that uses anchor, positive, and negative samples to train embeddings so similar items are closer and different ones further apart.
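A minimal sketch of the loss for a single triplet, max(0, d(a, p) − d(a, n) + margin), assuming Euclidean distance (some formulations use squared distance) and made-up 2D embeddings:

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


satisfied = triplet_loss([0, 0], [0, 1], [3, 4])  # positive near, negative far
violated = triplet_loss([0, 0], [3, 4], [0, 1])   # positive far, negative near
```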
Triton Inference Server
NVIDIA's open-source inference server for serving multiple ML models on GPU and CPU infrastructure with maximum performance.
Truncation
Truncation is cutting off data that exceeds a maximum length – whether text for LLMs, sequences for models, or decimal places.
Trust & Safety
Trust & Safety is the practice of protecting users, platforms, and brands from harmful content, abuse, and unsafe outcomes—through policy, enforcement, and product design.
Trust Boundary
A trust boundary is a point in a system where the level of trust changes (e.g., from untrusted user input to internal services).
Trust Models
A trust model defines who/what is trusted to make assertions (identity, integrity, authorization) and how that trust is established, delegated, and verified.
Trusted Execution Environment (TEE)
A hardware-based isolated environment that protects code and data during execution from the host system and other processes.
TruthfulQA
A benchmark that tests whether LLMs answer truthfully on questions where people commonly hold misconceptions, including popular misinformation and conspiracy theories.
Two-Tower Model
An architecture with two separate encoders (user tower, item tower) whose embeddings are efficiently matched via similarity search.
U
U-Net
U-Net is a network architecture for image segmentation with encoder-decoder structure and skip connections.
UAT (User Acceptance Testing)
User Acceptance Testing (UAT) is the final validation phase where real users confirm a system meets business requirements.
Ubiquitous Language
Ubiquitous language is a DDD practice where teams use a shared, precise vocabulary for core concepts.
UDF (User-Defined Function)
A UDF is a custom function that users write to extend a platform (SQL engines, data warehouses).
UGC (User-Generated Content)
UGC is content created by users rather than the brand (reviews, comments, community posts).
Ultra-Long Context Window
An ultra-long context window is the ability to accept very large input contexts (tens or hundreds of thousands of tokens).
Unbounded Fan-Out
Unbounded fan-out occurs when a workflow spawns an uncontrolled number of downstream calls (tools, retrieval, model calls).
Uncertainty Quantification (UQ)
UQ estimates how uncertain a model is about an output.
Uncertainty-Aware Routing
Uncertainty-aware routing chooses workflows based on uncertainty signals (low-confidence → deeper retrieval).
Underfitting
Underfitting happens when a model is too simple to capture the patterns in the data, resulting in poor performance on both training and test sets.
Unicode Normalization
Unicode normalization converts text into canonical form for consistent treatment.
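A short illustration with Python's standard `unicodedata` module: the same visible character can be encoded in more than one way, and normalization makes the encodings compare equal.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent


def normalize(text, form="NFC"):
    """Return the text in the given Unicode normalization form."""
    return unicodedata.normalize(form, text)


same_after_nfc = normalize(composed) == normalize(decomposed)
```

NFC and NFD preserve meaning exactly, while the compatibility forms (NFKC, NFKD) additionally fold variants such as ligatures, which is useful for search but lossy.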
Unified Search
Unified search provides one search experience across multiple content sources (docs, tickets, wiki, CRM).
Uniform Information Density
Prompt principle: keep "importance per token" consistent, avoid low-value text.
Unigram Model (Tokenization)
Subword tokenization algorithm that starts with a large vocabulary and iteratively removes least useful tokens.
Unintended Memorization
Unintended memorization occurs when models retain specific training examples and may reproduce them.
Unit Economics
Unit economics measures profitability per unit (customer, query, workflow) vs variable costs.
Unit Test
A unit test verifies the behavior of an isolated piece of code automatically in CI.
Universal Embeddings
Universal embeddings are general-purpose representations that work across many domains without domain-specific training.
Unlearning (Machine Unlearning)
Machine unlearning removes the influence of specific training data from a model (privacy, compliance).
Unstructured Data
Unstructured data is not stored in a predefined schema (PDFs, emails, chats, wikis, tickets).
Unsupervised Learning
ML paradigm where the model finds patterns in unlabeled data.
Untrusted Input Handling
Controls that treat external/user-provided content as potentially malicious.
Update Cadence
Update cadence is the planned frequency for content/system refreshes.
Update vs Upgrade
Update: minor, backward-compatible change. Upgrade: larger change with potential behavior changes.
Uplift Modeling
Uplift modeling predicts the incremental impact of an intervention (ad, email, CTA).
Upsert
An upsert updates a record if it exists or inserts it if it doesn't.
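As one concrete illustration, SQLite supports upserts through `INSERT ... ON CONFLICT ... DO UPDATE` (this requires SQLite 3.24 or newer; the table and column names are hypothetical, and other databases use different syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")


def upsert_user(user_id, email):
    """Insert the row, or update the email if the id already exists."""
    conn.execute(
        "INSERT INTO users (id, email) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET email = excluded.email",
        (user_id, email),
    )


upsert_user(1, "a@example.com")  # no row with id 1 yet: inserts
upsert_user(1, "b@example.com")  # id 1 exists: updates in place
rows = conn.execute("SELECT id, email FROM users").fetchall()
```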
URL (Uniform Resource Locator)
A URL is the address of a web resource (scheme, domain, path, query parameters).
URL Canonicalization
URL canonicalization ensures that multiple URL variants resolve to one canonical URL.
Usability Testing
Usability testing evaluates how easily users can complete tasks.
Usage Anomaly Detection
Identifies unusual patterns in user/tenant behavior (spikes, errors).
Usage Telemetry
Usage telemetry captures how a product is used (events, funnels, intent patterns).
Usage-Based Pricing
Usage-based pricing charges based on consumption (tokens, requests, tool calls).
Usage-Based Routing
Adapts model/workflow selection based on cost and consumption signals.
User Experience (UX)
UX is the overall quality of user interaction with a product.
User Intent
User intent is what a user is trying to accomplish with a query.
User Journey
A user journey is the sequence of steps to achieve a goal.
User Onboarding
User onboarding helps new users reach their first successful outcome quickly.
User Persona
A user persona is a representative profile of a target user segment.
User-Generated Content (UGC)
Content created by users such as reviews, posts, and videos.
Utility Function
A utility function maps outcomes to numeric values representing preference, enabling tradeoffs between competing objectives.
UTM Parameters
UTM parameters are query-string tags for marketing campaign attribution (utm_source, utm_medium).
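A small sketch of extracting these tags with Python's standard `urllib.parse` (the URL is a made-up example):

```python
from urllib.parse import parse_qs, urlparse


def utm_params(url):
    """Return only the utm_* query parameters of a URL."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


url = "https://example.com/landing?utm_source=newsletter&utm_medium=email&ref=abc"
tags = utm_params(url)
```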
UX Writing for Uncertainty
Product language that communicates confidence, limitations, and next steps when uncertain.
UXR (User Research)
User Research is the systematic study of user needs and behaviors.
V
VAE (Variational Autoencoder)
VAE stands for Variational Autoencoder, a generative model that learns a probabilistic latent space for sampling and generation.
Validation Set
A validation set is a held-out dataset used during model development to tune hyperparameters and select model versions without touching the final test set.
Validator
A validator is a component that checks whether an input/output meets required constraints (schema, safety policy, semantics, permissions).
Value Alignment
Value alignment is ensuring an AI system's behavior reliably matches intended human/organizational values and constraints (safety, fairness, truth-seeking, privacy).
Value of Information (VoI)
Value of Information (VoI) quantifies how much benefit you gain by obtaining additional information before making a decision.
Value-Based Pricing
Value-based pricing sets price based on the value delivered to customers (outcomes), not purely on provider costs (tokens, compute).
Vanishing Gradient
Vanishing gradient is a training problem where gradients become extremely small as they propagate backward through a network, slowing or preventing learning in early layers.
Variance
Variance is the degree to which a model's performance changes across different datasets/samples; high variance often indicates sensitivity to training data (overfitting risk).
Variational Autoencoder (VAE)
A Variational Autoencoder (VAE) is a generative model that learns a probabilistic latent space, enabling sampling and generation of new data.
Vector Database
Specialized databases for storing high-dimensional vectors (embeddings) and searching them by similarity at high speed using Approximate Nearest Neighbor (ANN) algorithms.
Vector Database
A vector database stores embeddings and supports fast similarity search (nearest neighbors), often with metadata filtering and indexing for scale.
Vector Embedding
A vector embedding is a numerical representation (array of floats) of text, images, or other data that encodes semantic meaning in a high-dimensional space.
Vector Index
A data structure enabling efficient similarity search in high-dimensional vector spaces.
Vector Index
A vector index is the data structure/algorithm used to speed up nearest-neighbor search over embeddings at scale.
Vector Quantization
Vector quantization (VQ) compresses continuous vectors by mapping them to a finite set of representative vectors (a codebook).
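The encoding step can be sketched as a nearest-neighbor lookup in the codebook. The codebook below is hand-picked for illustration; in practice it is usually learned, for example with k-means:

```python
def quantize(vector, codebook):
    """Return the index of the codebook entry closest to the vector."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))


codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
code = quantize([0.9, 1.2], codebook)  # stored as a small integer
reconstructed = codebook[code]         # lossy approximation of the input
```

Only the integer code needs to be stored per vector, which is where the compression comes from.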
Vector Search
Vector search retrieves items by similarity in an embedding space rather than exact keyword match.
Vector Similarity
Vector similarity is a measure of how close two embeddings are (commonly cosine similarity or dot product).
Vector Store
A vector store is the storage layer (database or service) that holds embeddings plus metadata for retrieval and similarity search.
Vector Store Hygiene
Vector store hygiene is the operational discipline of keeping a vector store accurate, secure, performant, and up-to-date (dedupe, versioning, ACL correctness, drift monitoring, purge workflows).
Vendor Risk Management
Vendor Risk Management (VRM) is assessing and managing risks introduced by third-party providers (security, privacy, compliance, continuity, and operational dependencies).
Veo 3
Google's third-generation video generation model with native audio, longer clips, and improved physics.
Verifiability
Verifiability is the property that claims can be checked against reliable sources, logs, or measurable evidence.
Verification
Checking whether LLM outputs are correct, factual, and source-supported.
Verification Layer
A verification layer is a system component that checks whether an AI output or action meets required correctness, safety, policy, and formatting constraints before it is delivered or executed.
Verification-Centric Agents
Agentic systems whose architecture actively verifies every reasoning step against external sources before feeding it into the next step.
Verification-First Policy
A verification-first policy requires AI outputs and high-impact actions to pass defined verification checks before being shown to users or executed.
Version Control
Version control tracks changes to code, configs, prompts, schemas, and content over time, enabling collaboration, rollbacks, and auditability.
Versioned Prompt
A versioned prompt is a prompt template managed like a software artifact: changes are tracked, tested, reviewed, and deployable with rollback.
Vibe Coding
A programming approach where developers describe their intentions in natural language and AI tools generate the code, while the developer guides the direction and refines the output.
Video AI
Video AI encompasses AI technologies for automatic analysis, generation, editing, and optimization of video content.
Viewability
Metric measuring whether an ad was actually visible.
Virtual Try-On
AI technology that lets customers virtually try fashion, beauty, or eyewear products on their own body.
Vision APIs
API interfaces enabling AI-powered image analysis – from simple object detection to complex scene understanding and multimodal reasoning.
Vision Language Models
AI models that can understand and process both images and text – they "see" and "read" simultaneously and can communicate about visual content.
Vision Transformer (ViT)
A Vision Transformer (ViT) applies transformer architectures to images by representing them as sequences of patch embeddings.
Vision-Language Model (VLM)
A Vision-Language Model (VLM) processes both images and text to perform tasks like image understanding, captioning, document Q&A, and multimodal reasoning.
Visual Question Answering (VQA)
AI systems that can answer natural-language questions about images, for example "How many people are in the photo?"
vLLM
A high-performance open-source inference server for LLMs that uses PagedAttention for efficient KV-cache management and high throughput.
Vocabulary (NLP)
The complete set of all tokens that a language model knows and can process.
Vocoder
A vocoder converts Mel spectrograms or other acoustic features into audible audio waveforms – the final step in TTS pipelines.
Voice Activity Detection
Voice Activity Detection automatically detects whether an audio signal contains human speech – the foundation for efficient speech processing.
Voice Agent
Voice Agents are AI-powered speech systems that autonomously conduct natural phone or voice conversations – from outbound calls to customer service hotlines.
Voice Cloning
AI technology that analyzes a human voice from just seconds of audio and synthetically reproduces it to speak any text in that voice.
Voice Search
Search using spoken language through assistants and devices.
VQ-VAE
VQ-VAE is a variant of VAE that uses vector quantization to learn discrete latent representations via a learned codebook.
W
WAF (Web Application Firewall)
A Web Application Firewall (WAF) filters and monitors HTTP traffic to protect web apps from attacks (e.g., injection, abuse, bot traffic).
Walled Garden
A walled garden is a closed ecosystem where a platform controls access to data, distribution, and measurement (common in advertising and analytics).
Warm Start
A warm start initializes training or optimization from a previously learned state (weights, embeddings, or parameters) rather than starting from scratch.
WASM (WebAssembly)
WebAssembly (WASM) is a binary instruction format that lets code run at near-native speed in the browser (and other runtimes).
Watermarking
Watermarking is adding a detectable signal to content (text, image, audio, video) to indicate origin, authenticity, or provenance—often used to mark AI-generated outputs.
Wav2Vec
Wav2Vec is a self-supervised learning framework from Meta for speech representations that learns from raw audio and can reach strong ASR accuracy with relatively little labeled data.
Weak Supervision
Weak supervision uses imperfect, noisy, or indirect signals (heuristics, rules, distant labels) to create training labels instead of manual annotation.
Weakly Supervised Learning
Weakly supervised learning trains models using weak supervision signals (noisy labels, partial labels, aggregated labels) rather than fully reliable labels.
Weavy
AI video platform with node-based editor for complex generative video workflows and multi-model pipelines.
Web Browsing Tool
A web browsing tool is an AI tool integration that fetches live web pages or search results to answer questions with up-to-date information.
Web Grounding
The ability of an AI model to access web search results in real-time to generate current and factually accurate content.
Web Scraping
Web scraping is programmatically extracting data from websites for analysis, indexing, or monitoring.
Webhook
A webhook is an event-driven HTTP callback where one system sends another system data when something happens (e.g., "ticket created," "payment succeeded").
Webhook Verification
Webhook verification ensures incoming webhook requests are authentic and untampered, typically using HMAC signatures, timestamps, and replay protection.
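A minimal Python sketch of these three checks, assuming the sender signs `timestamp.body` with HMAC-SHA256 and a shared secret. The message layout, the 300-second tolerance, and the in-memory replay cache are assumptions for illustration; real providers each define their own scheme.

```python
import hashlib
import hmac
import time

TOLERANCE_SECONDS = 300   # assumed freshness window
_seen_signatures = set()  # illustrative replay cache; real systems use shared storage with expiry

def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 over '<timestamp>.<body>'."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(secret, timestamp, body, signature, now=None):
    now = time.time() if now is None else now
    if abs(now - timestamp) > TOLERANCE_SECONDS:
        return False  # stale or future-dated request
    expected = sign(secret, timestamp, body)
    if not hmac.compare_digest(expected, signature):
        return False  # wrong secret or tampered payload
    if signature in _seen_signatures:
        return False  # replay of an already accepted request
    _seen_signatures.add(signature)
    return True
```

`hmac.compare_digest` is used instead of `==` so the comparison does not leak timing information.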
Weight Decay
Weight decay is a regularization technique that discourages large weights during training, often implemented as L2 regularization or decoupled weight decay (e.g., in AdamW).
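The two forms can be sketched for plain SGD in Python as follows (function names and default values are illustrative). For plain SGD both produce the same update; they diverge under adaptive optimizers such as Adam, which is the motivation for the decoupled form.

```python
def sgd_step_l2(weights, grads, lr=0.1, weight_decay=0.01):
    """L2 form: the penalty gradient (weight_decay * w) is added to the loss gradient."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

def sgd_step_decoupled(weights, grads, lr=0.1, weight_decay=0.01):
    """Decoupled form: weights are shrunk directly, separately from the gradient step."""
    return [w - lr * g - lr * weight_decay * w for w, g in zip(weights, grads)]

# With a zero gradient, the only effect is shrinking the weights toward zero.
print(sgd_step_decoupled([1.0, -2.0], [0.0, 0.0]))
```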
Weight Initialization
Weight initialization determines the starting values of network parameters – critical for stable training and fast convergence.
Weight Normalization
Weight Normalization reparameterizes weight vectors into direction and magnitude – an alternative to batch norm without batch dependency.
Weight Sharing
A technique where multiple parts of a neural network use the same weights – significantly reducing parameter count and memory usage.
Weights & Biases (W&B)
SaaS platform for experiment tracking, model evaluation, dataset versioning, and collaborative ML development.
WER (Word Error Rate)
Word Error Rate (WER) measures speech recognition accuracy as the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference, divided by the number of words in the reference; lower is better, and values above 1.0 are possible.
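A minimal Python sketch of the metric, assuming whitespace tokenization and no text normalization (real evaluations usually normalize case and punctuation first). It computes the word-level edit distance and divides by the reference length.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```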
What-If Analysis
What-if analysis explores how outcomes change when you alter inputs, assumptions, or decisions.
Whisper
An open-source speech recognition model from OpenAI trained on 680,000 hours of multilingual audio.
Windowed Attention
Windowed attention restricts attention to a local token window instead of the full sequence, reducing compute and enabling longer contexts.
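A minimal Python sketch of the idea as a boolean mask, assuming a symmetric window of `window` tokens on each side (the function name is illustrative; causal variants look only backward).

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j."""
    return [[abs(i - j) <= window for j in range(seq_len)] for i in range(seq_len)]

mask = sliding_window_mask(seq_len=5, window=1)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

The number of allowed pairs grows roughly linearly with sequence length for a fixed window, rather than quadratically as in full attention.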
Windsurf
An AI-powered code editor by Codeium offering deep context awareness and agentic coding assistance.
WinoGrande
A benchmark for pronominal reference resolution where small word changes flip the correct answer.
Word Embedding
A dense vector representation of a word that encodes its semantic meaning.
Word Error Rate (WER)
The standard metric for speech recognition – measures substitutions, deletions, and insertions relative to the reference.
Word2Vec
Word2Vec is a technique for generating word embeddings that represents words as dense vectors, where semantically similar words have similar vectors.
WordPiece
Subword tokenization algorithm developed by Google that maximizes training corpus likelihood.
Workflow Automation
Workflow automation uses software (often with AI) to execute repetitive tasks or business processes with minimal manual intervention.
Workflow Orchestration
Workflow orchestration coordinates multi-step processes across services/tools, managing state, retries, timeouts, and error handling.
Workload Isolation
Workload isolation separates workloads so one workload can't degrade another's performance, security, or cost (e.g., interactive vs batch).
World Model
An internal representation of the environment in an AI system that enables predictions about future states and the effects of actions.
Write Amplification
Write amplification is when a system performs much more internal writing than the size of the user's write request (common in storage engines and log-structured systems).
Write-Back Cache
A write-back cache writes changes to the cache first and flushes them to the backing store asynchronously later.
Write-Through Cache
A write-through cache writes data to both the cache and the backing store synchronously on every write.
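The difference between write-through and write-back behavior can be sketched in Python as follows. The class names are illustrative, a plain dict stands in for the backing store, and an explicit `flush()` call stands in for the asynchronous flush a real write-back cache would perform.

```python
class WriteThroughCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}

    def write(self, key, value):
        # Cache and backing store are updated together on every write.
        self.cache[key] = value
        self.store[key] = value

class WriteBackCache:
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        # Only the cache is updated; the key is marked dirty for a later flush.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Push all pending changes to the backing store.
        for key in self.dirty:
            self.store[key] = self.cache[key]
        self.dirty.clear()
```

Write-back makes writes cheaper but risks losing unflushed changes if the cache fails; write-through keeps the store current at the cost of slower writes.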
X
X-Content-Type-Options
X-Content-Type-Options: nosniff is an HTTP header that instructs browsers not to "MIME sniff" a response and to respect declared content types.
X-Forwarded-For
X-Forwarded-For is an HTTP header used to identify the originating client IP address when a request passes through proxies or load balancers.
X-Frame-Options
X-Frame-Options is an HTTP response header that helps prevent clickjacking by controlling whether a page can be embedded in an iframe.
X-Robots-Tag
X-Robots-Tag is an HTTP header that gives robots directives (like noindex, nofollow) similar to meta robots tags—useful for non-HTML resources.
x-Vector
An x-vector is a type of speaker embedding used in speech processing to represent speaker identity characteristics in a fixed-length vector.
X.509 Certificate
An X.509 certificate is a digital certificate standard used in public key infrastructure (PKI) to bind a public key to an identity, enabling secure authentication and encrypted communication (e.g., TLS).
xAI
Elon Musk's AI company developing Grok – an LLM with real-time access to X (Twitter) and an uncensored, humorous style.
XAI (Explainable AI)
Explainable AI (XAI) is the set of methods and practices used to make an AI system's outputs more understandable—showing why a prediction, recommendation, or decision happened.
Xavier Initialization (Glorot Initialization)
Xavier (Glorot) initialization is a weight initialization method designed to keep activations and gradients in a healthy range as they flow through a neural network.
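A minimal Python sketch of the uniform variant, which draws each weight from the range ±sqrt(6 / (fan_in + fan_out)). The function name and the nested-list weight layout are illustrative; deep learning frameworks ship their own initializers.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

weights = xavier_uniform(4, 2, rng=random.Random(0))
print(len(weights), len(weights[0]))
```

Scaling the range by layer width is what keeps the variance of activations and gradients roughly constant from layer to layer.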
XDR (Extended Detection and Response)
XDR is a security approach that unifies detection and response across endpoints, networks, identities, cloud workloads, and more.
XGBoost
XGBoost (Extreme Gradient Boosting) is an optimized open-source implementation of gradient-boosted decision trees, an ensemble method known for strong accuracy on tabular data.
XLA (Accelerated Linear Algebra)
XLA is a compiler for machine learning computations (commonly associated with TensorFlow and JAX) that optimizes execution by fusing operations and compiles them for various hardware platforms (CPU, GPU, TPU).
XLM (Cross-lingual Language Model)
XLM refers to cross-lingual language modeling approaches and model families designed to represent and process multiple languages in a shared embedding space.
XLM-R (Cross-lingual RoBERTa)
XLM-R is a multilingual transformer model family often used for cross-lingual understanding tasks (classification, NER, semantic similarity).
XLNet
XLNet is a transformer-based language model approach that uses permutation-based training to capture bidirectional context while preserving autoregressive properties.
xLSTM (Extended LSTM)
A modernized LSTM variant by Sepp Hochreiter using exponential gating and matrix memory to compete with Transformers.
XML (Extensible Markup Language)
XML is a markup language for representing structured data using nested tags.
XML Sitemap
An XML sitemap is a machine-readable list of URLs (with optional metadata like lastmod) that helps search engines discover and crawl content efficiently.
XOR Cipher
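An XOR cipher is a simple encryption method that combines plaintext with a key using bitwise XOR; it is insecure with a short or reused key and only provides strong secrecy when the key is truly random, at least as long as the message, and never reused (a one-time pad).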
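A minimal Python sketch with a repeating key (the function name is illustrative). It is shown to explain the mechanism, not as usable encryption: the last line recovers the key from a known plaintext.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

ciphertext = xor_bytes(b"hello", b"key")
print(xor_bytes(ciphertext, b"key"))     # applying the same key again decrypts
print(xor_bytes(ciphertext, b"hello"))   # known plaintext reveals the repeating key
```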
XOR Problem
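The XOR problem is a classic example showing that a single linear classifier (such as a single-layer perceptron) cannot learn the XOR function, because its two classes are not linearly separable; at least one hidden layer is needed.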
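A minimal Python sketch of how one hidden layer resolves it, using hand-set weights and a step activation rather than training (all names and weight values are illustrative). One hidden unit computes OR, the other AND, and the output fires for "OR but not AND".

```python
def step(z):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h_or = step(x1 + x2 - 0.5)        # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)       # fires only if both inputs are 1
    return step(h_or - h_and - 0.5)   # OR and not AND

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```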
XPath
XPath is a language for selecting nodes in an XML/HTML document using path expressions.
XQuery
XQuery is a query language for extracting and transforming data stored in XML documents.
XSLT
XSLT is a language used to transform XML documents into other formats (XML, HTML, plain text).
XSRF Token
An XSRF token (often synonymous with CSRF token) is a secret value used to prevent Cross-Site Request Forgery attacks.
XSS (Cross-Site Scripting)
Cross-Site Scripting (XSS) is a web vulnerability where attackers inject malicious scripts into content that is later served to other users.
XSS in AI-Generated Markdown
XSS in AI-generated markdown is the risk that markdown produced by an AI system can contain content that becomes executable when rendered.
XSS Payload
An XSS payload is the injected script or markup an attacker uses to exploit a cross-site scripting vulnerability.
XSS Prevention Patterns
XSS prevention patterns are design and engineering practices that prevent cross-site scripting by ensuring untrusted content cannot execute as code in a user's browser.
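One such pattern, output escaping, can be sketched in Python with the standard library's `html.escape`. The `render_comment` helper is hypothetical, and the sketch covers only the HTML body context; attributes, URLs, and inline scripts need context-specific encoding.

```python
import html

def render_comment(untrusted_text: str) -> str:
    """Escape untrusted text before placing it inside an HTML element."""
    return "<p>" + html.escape(untrusted_text, quote=True) + "</p>"

print(render_comment("<script>alert(1)</script>"))
```

After escaping, the browser displays the markup as text instead of executing it.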
xUnit
xUnit refers to a family of unit testing frameworks (e.g., JUnit, NUnit, Python's unittest, xUnit.net) that standardize how automated tests are written and executed.
XXE (XML External Entity)
XXE is a vulnerability where an XML parser processes external entities in a way that can expose sensitive data, trigger SSRF-like behavior, or cause denial of service.
Y
Y-Axis Compression
Y-axis compression is a visualization issue where scaling choices flatten differences, making changes look smaller (or larger) than they are.
Y-Combinator
The Y-combinator is a fixed-point combinator from lambda calculus that enables recursion in languages that don't have named self-references.
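A minimal Python sketch of the idea. Because Python evaluates arguments eagerly, the sketch uses the Z combinator, the strict-evaluation variant of Y; the names `Z` and `factorial` are illustrative. The factorial lambda never refers to itself by name.

```python
# Z combinator: Z(f) returns a fixed point of f, i.e. a function g with g == f(g) in behavior.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The recursive call goes through the `self` parameter supplied by Z.
factorial = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(factorial(5))
```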
YAML
YAML ("YAML Ain't Markup Language") is a human-readable data serialization format commonly used for configuration files.
YAML Anchors and Aliases
YAML anchors and aliases let you define reusable blocks (anchors) and reference them elsewhere (aliases) to avoid repetition.
YAML Front Matter
YAML front matter is a YAML block at the start of a content file (often Markdown) used to store metadata (title, tags, canonical URL, updated date).
YAML Injection
YAML injection is when untrusted input is interpreted as YAML and causes unintended behavior—often through unsafe deserialization or config templating.
YAML Schema Validation
YAML schema validation checks that a YAML file conforms to an expected structure and constraints (fields, types, required keys, enums).
YARA Rule
A YARA rule is a pattern-matching rule used in cybersecurity to identify malware or suspicious artifacts in files and memory.
YARN
YARN (Yet Another Resource Negotiator) is a resource management layer in the Hadoop ecosystem for scheduling and running distributed applications.
Yield
Yield is the proportion of inputs that successfully produce acceptable outputs (e.g., successful runs, valid records, passing artifacts).
Yield Management
Yield management (often called revenue management) is pricing and inventory control to maximize revenue under capacity constraints.
Yield Optimization
Maximizing return from limited resources through data-driven decisions.
Yield Rate
Yield rate is yield expressed as a percentage over a defined population and time window.
YOLO
YOLO ("You Only Look Once") is a family of real-time object detection models that predict bounding boxes and class probabilities in a single pass.
Yottabyte
A yottabyte (YB) is a unit of data equal to 10²⁴ bytes (a septillion bytes).
YoY (Year-over-Year)
Year-over-Year (YoY) compares a metric to the same period in the previous year (e.g., Jan 2026 vs Jan 2025).
YTD (Year-to-Date)
Year-to-Date (YTD) measures performance from the start of the current year up to today.
Yule–Simpson Paradox
The Yule–Simpson paradox (often called Simpson's paradox) occurs when a trend appears in several groups but reverses or disappears when the groups are combined.
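A small Python check of the reversal, using counts chosen for illustration: option A has the higher success rate in both subgroups, yet option B has the higher rate once the subgroups are pooled, because the subgroup sizes differ sharply between A and B.

```python
# (successes, trials) per subgroup; counts chosen for illustration.
DATA = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    return successes / trials

def overall(groups):
    """Pooled success rate across all subgroups."""
    successes = sum(s for s, _ in groups.values())
    trials = sum(t for _, t in groups.values())
    return successes / trials

for name, groups in DATA.items():
    per_group = {g: round(rate(*counts), 3) for g, counts in groups.items()}
    print(name, per_group, round(overall(groups), 3))
```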
Z
Z-Index
z-index is a CSS property that controls the stacking order of overlapping elements on a web page (which layer appears on top).
Z-Layer Architecture
Z-layer architecture is an informal term teams use to describe layered stacks where each layer provides a specific responsibility (UI → API gateway → policy → orchestration → tools/data).
Z-Order Curve
A Z-order curve (Morton order) is a space-filling curve that maps multi-dimensional data into a one-dimensional ordering while preserving locality.
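A minimal Python sketch for two dimensions: the Morton code is formed by interleaving the bits of the two coordinates (the function name and the 16-bit default are illustrative assumptions).

```python
def morton_encode(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x (even positions) and y (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

# The four cells of a 2x2 block receive consecutive codes, tracing a "Z" shape.
print([morton_encode(x, y) for y in (0, 1) for x in (0, 1)])
```

Sorting records by this code tends to place points that are close in 2-D near each other in the 1-D ordering.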
Z-Ordering
Z-ordering is the practice of physically organizing stored data using Z-order curves so that related values are colocated on disk.
Z-Score
A z-score is the number of standard deviations a data point is from the mean.
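A minimal Python sketch using the standard library, assuming the population standard deviation is the right choice (use the sample standard deviation when working from a sample); the function name is illustrative.

```python
import statistics

def z_scores(values):
    """Return (value - mean) / standard deviation for each value."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - mean) / sd for v in values]

print(z_scores([2, 4, 4, 4, 5, 5, 7, 9]))
```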
Z-Test
A z-test is a statistical hypothesis test used to determine whether a sample mean differs from a known population mean (or whether two means differ) under certain assumptions.
ZeRO (Zero Redundancy Optimizer)
ZeRO is a set of memory optimizations for distributed training that partitions (shards) optimizer states, gradients, and parameters across devices instead of replicating them, reducing memory redundancy and enabling training of very large (up to trillion-parameter) models.
Zero Trust
Zero Trust is a security model that assumes no implicit trust—every request must be authenticated, authorized, and continuously evaluated, regardless of network location.
Zero-Click Content
Content that delivers its core message directly in the search result or AI answer without requiring a click to the source.
Zero-Click Search
Search queries where users get the answer directly in search results – without clicking to a website.
Zero-Day Vulnerability
A zero-day vulnerability is a security flaw unknown to the vendor or without an available patch at the time it is exploited.
Zero-ETL
Zero-ETL refers to architectures that minimize or eliminate traditional ETL pipelines by enabling near-direct data access/replication between systems with low operational overhead.
Zero-Knowledge Proof (ZKP)
A zero-knowledge proof (ZKP) is a cryptographic method that lets one party prove a statement is true without revealing the underlying data.
Zero-Party Data
Zero-party data is data a customer intentionally and proactively shares with a brand (preferences, intents, goals), rather than inferred or tracked.
Zero-Shot Classification
Zero-shot classification assigns labels to text without training a task-specific classifier, usually by using natural language label descriptions.
Zero-Shot Learning
The ability of an AI model to perform tasks or recognize classes for which it has seen no explicit examples during training.
Zero-Shot Prompting
Zero-shot prompting is prompting a model with instructions and constraints without providing explicit examples.
Zero-Shot vs Few-Shot
Zero-shot uses no examples; few-shot includes a small number of examples in the prompt to steer behavior.
Zettabyte
A zettabyte (ZB) is a unit of data equal to 10²¹ bytes.
Zettelkasten
Zettelkasten is a knowledge-management method based on atomic notes and dense linking between ideas to build a scalable "knowledge graph" of concepts.
Zipf's Law
Zipf's law describes how, in many natural datasets (language, queries), a few items are extremely frequent while most items are rare (long-tail distribution).
Zipping Artifacts
Zipping artifacts bundles files (models, configs, logs, datasets, build outputs) into a compressed archive for storage, transport, or deployment.
ZK-SNARK
ZK-SNARK is a type of zero-knowledge proof designed to be succinct and efficiently verifiable.
ZK-STARK
ZK-STARK is a type of zero-knowledge proof designed to be transparent (no trusted setup) and scalable, often with different performance tradeoffs than SNARKs.
ZKML (Zero-Knowledge Machine Learning)
ZKML refers to applying zero-knowledge proof techniques to machine learning so one can prove properties about ML inference/training without revealing sensitive inputs or model internals.
Zod
Zod is a TypeScript-first schema validation library used to define and validate data structures at runtime.
Zombie Process
A zombie process is a process that has finished execution but still has an entry in the process table because its parent hasn't reaped its exit status.
Zonal vs Regional Services
Zonal services operate within a single availability zone; regional services span multiple zones within a region.
Zone Redundancy
Zone redundancy is deploying services across multiple availability zones to remain resilient if one zone fails or degrades.
ZooKeeper (Apache ZooKeeper)
Apache ZooKeeper is a distributed coordination service used for configuration management, leader election, and distributed locks.
Zstandard (zstd)
Zstandard (zstd) is a fast compression algorithm designed to provide high compression ratios with low CPU overhead.
ZTNA (Zero Trust Network Access)
ZTNA is a zero-trust approach to granting network access based on identity and context, often replacing legacy VPN patterns with app-level access controls.