What Is a Transformer Decoder?

A transformer decoder produces an output sequence one token at a time, using masked (causal) self-attention so that each position can attend only to itself and earlier positions, never to later ones. It is the core of autoregressive language models such as the GPT family. In sequence-to-sequence models, decoders also use cross-attention to read the encoder's outputs.
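
The masking idea can be illustrated with a minimal sketch of single-head causal self-attention in plain Python. It is an illustration under simplifying assumptions, not how any particular model implements it: the function name `causal_self_attention` is hypothetical, the queries, keys, and values are passed in directly rather than computed by learned projections, and there is no multi-head splitting, cross-attention, or batching.

```python
import math

def causal_self_attention(q, k, v):
    """Single-head masked self-attention over lists of vectors.

    q, k, v: lists of length T, each item a list of floats.
    Returns (outputs, weights), where weights[i][j] is how much
    position i attends to position j.
    """
    d = len(q[0])
    outputs, weights = [], []
    for i in range(len(q)):
        # Causal mask: position i scores only positions 0..i.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        # Future positions receive a weight of exactly zero.
        weights.append(row + [0.0] * (len(q) - i - 1))
        # Output is the attention-weighted sum of the visible values.
        outputs.append([
            sum(w * v[j][c] for j, w in enumerate(row))
            for c in range(len(v[0]))
        ])
    return outputs, weights

# Toy input (assumption: the same vectors serve as q, k, and v).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, w = causal_self_attention(x, x, x)
```

In the resulting weight matrix `w`, every entry above the diagonal is zero, so the first position attends only to itself and the last position can draw on all three. Production implementations usually get the same effect by adding a large negative value to the masked scores before a batched softmax, rather than by looping over positions.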
