Attention
Applies to: python, general
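Attention lets a model compute the representation of one position as a weighted combination of other positions, with the weights set by relevance. Queries (Q) are compared against keys (K) to produce the weights, which are then applied to the values (V).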
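weights = softmax(Q @ K.T, axis=-1)  # one row of weights per query, each row sums to 1
out = weights @ V                    # weighted average of the value vectors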
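The two lines above can be fleshed out into a runnable sketch. This is one possible NumPy version, not a reference implementation: the helper names `softmax` and `attention`, the `scale` flag, and the array shapes are assumptions made for illustration. Dividing the scores by sqrt(d_k) is the scaled dot-product variant commonly used in transformers; pass `scale=False` to get the unscaled form written above.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis before exponentiating, for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def attention(Q, K, V, scale=True):
    # Assumed shapes: Q (n_queries, d_k), K (n_keys, d_k), V (n_keys, d_v).
    scores = Q @ K.T                             # (n_queries, n_keys) similarity scores
    if scale:
        scores = scores / np.sqrt(K.shape[-1])   # scaled dot-product variant
    weights = softmax(scores, axis=-1)           # normalize over keys for each query
    return weights @ V, weights                  # output has shape (n_queries, d_v)

# Illustrative random inputs: 3 queries, 5 keys/values.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 2))
out, weights = attention(Q, K, V)
```

Each row of `weights` is a probability distribution over the keys, so each row of `out` is a convex combination of the rows of `V`.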
See also: transformer, embedding