Structured Pruning
A pruning variant that removes entire structures (neurons, filters, attention heads, or layers) instead of individual weights, delivering real speedups on standard hardware without specialized sparse support.
Explanation
Unlike unstructured pruning (zeroing individual weights), structured pruning removes contiguous blocks: entire convolutional filters, attention heads, or even whole layers. The result is a genuinely smaller dense model that needs no sparse representation.
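As a hedged illustration, here is a minimal PyTorch-style sketch. The helper prune_conv_filters and the toy two-layer CNN are hypothetical; the sketch assumes filters are ranked by L1 norm and that the following layer's input channels are sliced to match, producing a smaller dense model rather than a masked one.

```python
# Minimal sketch of structured filter pruning (assumption: L1-norm ranking).
# Filters with the lowest L1 norm are dropped and the next layer's input
# channels are sliced to match, so the result is a genuinely smaller dense model.
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, next_conv: nn.Conv2d, keep_ratio: float = 0.5):
    # Rank output filters of `conv` by the L1 norm of their weights.
    importance = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    keep = torch.argsort(importance, descending=True)[:n_keep]

    # Build a smaller conv layer containing only the kept filters.
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()

    # The following layer must drop the matching input channels.
    next_pruned = nn.Conv2d(n_keep, next_conv.out_channels, next_conv.kernel_size,
                            stride=next_conv.stride, padding=next_conv.padding,
                            bias=next_conv.bias is not None)
    next_pruned.weight.data = next_conv.weight.data[:, keep].clone()
    if next_conv.bias is not None:
        next_pruned.bias.data = next_conv.bias.data.clone()
    return pruned, next_pruned

# Example: halve the filters of the first layer in a toy two-layer CNN.
conv1, conv2 = nn.Conv2d(3, 16, 3, padding=1), nn.Conv2d(16, 32, 3, padding=1)
conv1, conv2 = prune_conv_filters(conv1, conv2, keep_ratio=0.5)
x = torch.randn(1, 3, 32, 32)
print(conv2(conv1(x)).shape)  # torch.Size([1, 32, 32, 32])
```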
Marketing Relevance
Structured pruning is the most practically relevant pruning variant because standard hardware (GPUs, CPUs) benefits directly from a smaller dense model; no sparse-computation support is needed.
Example
LLM-Shearing (2023) selectively removes attention heads and FFN dimensions from Llama-2 7B, producing a 1.3B model that outperforms 1.3B models trained from scratch.
Common Pitfalls
Coarser granularity than unstructured pruning, so achievable compression may be lower. Deciding which structures can be removed is a harder optimization problem. Retraining or fine-tuning is usually required after pruning.
Origin & History
Li et al. (2016) introduced filter pruning for CNNs. For transformers, Michel et al. (2019) studied attention-head pruning and showed that many heads can be removed with little loss in performance. LLM-Shearing (2023) scaled structured pruning to LLMs.
Comparisons & Differences
Structured Pruning vs. Unstructured Pruning
Unstructured pruning removes individual weights (higher compression possible); structured pruning removes entire blocks (real speedups on standard hardware). See the sketch below for the difference in practice.
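To make the contrast concrete, here is a small sketch using PyTorch's torch.nn.utils.prune utilities. Both calls only zero out weights via masks; the structured variant zeroes whole filters, which can then be physically removed (as in the earlier sketch) to realize the speedup.

```python
# Sketch: unstructured pruning zeroes scattered individual weights, while
# structured pruning zeroes whole output filters (slices along dim 0).
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

unstructured = nn.Conv2d(3, 16, 3)
prune.l1_unstructured(unstructured, name="weight", amount=0.5)          # 50% of weights -> 0

structured = nn.Conv2d(3, 16, 3)
prune.ln_structured(structured, name="weight", amount=0.5, n=1, dim=0)  # 50% of filters -> 0

# Both layers keep their original tensor shape; only the sparsity pattern differs.
# Structured sparsity maps directly to droppable filters, hence real speedups.
per_filter = structured.weight.abs().sum(dim=(1, 2, 3))
print((unstructured.weight == 0).float().mean())  # ~0.5, scattered zeros
print((per_filter == 0).sum())                    # 8 entirely zeroed filters
```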
Structured Pruning vs. Knowledge Distillation
Structured pruning trims an existing model; knowledge distillation trains a new, smaller model from scratch to mimic the original.