Model Compression, Pruning, Quantisation, and Knowledge Distillation

4 sections

Techniques for reducing model size and inference latency: unstructured and structured pruning, post-training quantisation, quantisation-aware training, INT8/FP16 inference, knowledge distillation, weight sharing, and low-rank factorisation.

Contents

Why Model Compression Matters: Deployment Constraints and Goals
Pruning: Unstructured and Structured Approaches
Quantisation: Post-Training and Quantisation-Aware Training
Knowledge Distillation, Weight Sharing, and Low-Rank Factorisation
