General_Effort, 2 months ago
> It’s all just weights and matrix multiplication and tokenization
See, none of these is statistics, as such.
Weights are maybe the closest, but they are supposed to represent the strength of a neural connection. That idea was originally inspired by neurobiology.
Matrix multiplication is linear algebra and is encountered in lots of contexts.
Tokenization is a thing from NLP. It's not what one would call a statistical method.
So you can see where my advice comes from.
Certainly there is nothing here that implies any kind of averaging going on.
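To make that concrete, here is a minimal sketch in plain NumPy (the vocabulary, sizes, and variable names are made up purely for illustration) of what weights, matrix multiplication, and tokenization amount to in code:

```python
import numpy as np

# Toy "tokenization": map words to integer ids with a tiny, made-up vocabulary.
vocab = {"the": 0, "cat": 1, "sat": 2}
tokens = [vocab[w] for w in "the cat sat".split()]  # -> [0, 1, 2]

# Toy "weights": a randomly initialized embedding table and one layer's weight matrix.
rng = np.random.default_rng(0)
embedding = rng.normal(size=(len(vocab), 4))  # 3 tokens, 4-dimensional vectors
W = rng.normal(size=(4, 4))                   # layer weights

# The forward pass is plain matrix multiplication (linear algebra),
# not an average over anything.
x = embedding[tokens]  # look up token vectors, shape (3, 4)
h = x @ W              # matrix multiply, shape (3, 4)
print(h.shape)         # (3, 4)
```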