AI · Flashcard

What is knowledge distillation?

  • A. Training a small student model to imitate a larger teacher model
  • B. Training a large model on the outputs of many smaller models
  • C. Removing low-importance weights to shrink an already-trained model
  • D. Lowering the precision of a model's weights to save on memory

Why this is the answer

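Distillation transfers behavior from a large teacher model into a compact student that is cheaper to run, typically by training the student to match the teacher's outputs. Training a large model on the outputs of smaller ones reverses that direction. Removing weights is pruning; lowering precision is quantization.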
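
To make "imitate the teacher" concrete, here is a minimal sketch of one common way to score a student against a teacher: the KL divergence between their temperature-softened output distributions. The card does not specify a loss, so the temperature value, the T² scaling, the function names, and the example logits are all assumptions chosen for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: this is one widely used formulation, not the only one.
    The temperature**2 factor is a common convention that keeps gradient
    magnitudes comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(p_teacher, p_student)
        if p > 0
    )
    return kl * temperature ** 2

# Hypothetical logits for a 3-class problem.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]   # close to the teacher
poor_student = [0.5, 4.0, 1.0]   # disagrees with the teacher

loss_good = distillation_loss(good_student, teacher)
loss_poor = distillation_loss(poor_student, teacher)
print(loss_good < loss_poor)
```

In training, the student's weights would be updated to reduce this loss, often alongside an ordinary loss on the true labels. The student that tracks the teacher more closely gets the lower score.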
