What Is Model Pruning?
Model pruning reduces the size of a neural network by removing weights, neurons, or other components that contribute little to its output. This can lower memory use and speed up inference. Pruning is often followed by fine-tuning to recover any lost accuracy.
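To make the idea concrete, here is a minimal sketch of one common criterion, magnitude pruning, where the weights with the smallest absolute values are treated as the ones that "contribute little" and are set to zero. The text above does not name a specific criterion, so this choice is an assumption, as are the function name `magnitude_prune`, the `sparsity` parameter, and the toy weight values. It uses plain Python lists rather than a deep learning framework and does not show the fine-tuning step.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude entries set to 0.0.

    weights:  flat list of floats (hypothetical stand-in for one layer's weights)
    sparsity: fraction of entries to zero out, between 0.0 and 1.0
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0.0 and 1.0")

    num_to_prune = int(len(weights) * sparsity)

    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_indices = set(order[:num_to_prune])

    # Zero the selected weights and keep the rest unchanged.
    return [0.0 if i in pruned_indices else w for i, w in enumerate(weights)]


# Toy example: remove the half of the weights closest to zero.
layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(layer, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

Zeroing weights like this does not by itself shrink the stored model or speed up inference. Those gains generally depend on storing the result in a sparse format or removing whole neurons or channels, and on runtime or hardware support for sparsity.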