What Is Tensor Parallelism?
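Tensor parallelism splits the computation inside a single layer across several devices by dividing that layer's large weight matrices among them. Each device multiplies its inputs by its own shard of the weights, and the partial results are then combined through communication between the devices. It is a form of model parallelism, used both to fit layers whose weights exceed one device's memory and to speed up the computation of very large layers.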
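The idea can be sketched on a single machine with NumPy, assuming a plain matrix multiply as the "layer" and Python lists standing in for devices. The function names here are hypothetical and chosen for illustration. The concatenation and the sum stand in for the collective operations (all-gather and all-reduce) that a real multi-device setup would use.

```python
import numpy as np


def column_parallel_matmul(x, w, num_devices):
    """Shard W by output columns; each simulated device produces a slice of the output."""
    w_shards = np.split(w, num_devices, axis=1)      # one column block per device
    partials = [x @ shard for shard in w_shards]     # local matmul on each device
    return np.concatenate(partials, axis=1)          # stand-in for an all-gather


def row_parallel_matmul(x, w, num_devices):
    """Shard W by input rows; each simulated device produces a partial sum of the output."""
    w_shards = np.split(w, num_devices, axis=0)      # one row block per device
    x_shards = np.split(x, num_devices, axis=1)      # matching slice of the input features
    partials = [xs @ ws for xs, ws in zip(x_shards, w_shards)]
    return sum(partials)                             # stand-in for an all-reduce


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of 4 inputs, 8 features
w = rng.standard_normal((8, 6))    # layer weights: 8 inputs -> 6 outputs
reference = x @ w                  # unsharded result for comparison

col_result = column_parallel_matmul(x, w, num_devices=2)
row_result = row_parallel_matmul(x, w, num_devices=2)
```

Both sharded versions should reproduce the unsharded product up to floating-point rounding. In this sketch each simulated device holds only half of the weight matrix, which is where the memory saving comes from. `np.split` requires the sharded dimension to divide evenly by `num_devices`.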