Transformer
Applies to: python, general
A transformer is a neural network architecture built from stacked blocks, each combining self-attention, a feed-forward (MLP) layer, residual connections, and layer normalization.
tokens -> embeddings -> N x (attention -> MLP) -> logits
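The pipeline above can be sketched in plain Python as a single block. This is a minimal illustration, not a reference implementation. Everything here is an assumption made for the example: the function and parameter names, the random untrained weights, the single attention head, the causal mask, normalization applied before each sub-layer, and the omission of positional information and learned normalization parameters.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices (the residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def layer_norm(row, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (no learned scale or shift)."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]

def rand_matrix(rng, rows, cols, scale=0.1):
    """Small random weights; a trained model would load learned values instead."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(vocab_size, d_model, d_hidden, seed=0):
    """Hypothetical parameter layout for one block plus embedding and output matrices."""
    rng = random.Random(seed)
    return {
        "embed": rand_matrix(rng, vocab_size, d_model),
        "wq": rand_matrix(rng, d_model, d_model),
        "wk": rand_matrix(rng, d_model, d_model),
        "wv": rand_matrix(rng, d_model, d_model),
        "w1": rand_matrix(rng, d_model, d_hidden),
        "w2": rand_matrix(rng, d_hidden, d_model),
        "unembed": rand_matrix(rng, d_model, vocab_size),
    }

def attention(x, p):
    """Single-head causal self-attention over a sequence of vectors."""
    q, k, v = matmul(x, p["wq"]), matmul(x, p["wk"]), matmul(x, p["wv"])
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        # Causal mask: position i attends only to positions 0..i.
        scores = [sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        weights = softmax(scores)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights)) for c in range(d)])
    return out

def mlp(x, p):
    """Position-wise feed-forward layer with a ReLU nonlinearity."""
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, p["w1"])]
    return matmul(hidden, p["w2"])

def forward(token_ids, p):
    """tokens -> embeddings -> attention -> MLP -> logits, for one block."""
    x = [p["embed"][t] for t in token_ids]                    # token embeddings
    x = add(x, attention([layer_norm(r) for r in x], p))      # attention + residual
    x = add(x, mlp([layer_norm(r) for r in x], p))            # MLP + residual
    return matmul([layer_norm(r) for r in x], p["unembed"])   # per-position logits

params = init_params(vocab_size=10, d_model=8, d_hidden=16)
logits = forward([1, 4, 2], params)
print(len(logits), len(logits[0]))  # 3 10: one row of vocabulary scores per token
```

A practical model would stack many such blocks, use several attention heads, and add positional information to the embeddings. The sketch keeps one of each so the data flow stays visible.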
See also: attention, embedding, tokenization