Original Post
I just came up with the idea of using a linear autoregression to model the nonlinear autoregression generating function of a heavily nonlinear process (yes, three degrees of indirection!) and found out it is a generalization of Hilbert's 13th problem called the "Embedding Theorem", invented in 1936 (sic!).
I suspect the word "embedding" in the transformer model came not from k-NN, but from back then.
A fair note is that it wasn't possible to use it until 1981.
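For anyone who wants to play with the idea, here is a minimal sketch of what I mean by a linear autoregression over lagged (delay) coordinates. The function names `delay_embed`, `fit_linear_ar` and `predict_next` are made up for illustration, and the sine wave is just a sanity check: it obeys an exact two-lag linear recurrence, so the fit should recover it. For a genuinely nonlinear process the same fit would only be an approximation.

```python
import numpy as np

def delay_embed(x, dim):
    """Stack lagged windows: row t is (x[t], x[t+1], ..., x[t+dim-1])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - dim + 1
    return np.column_stack([x[i:i + n] for i in range(dim)])

def fit_linear_ar(x, dim):
    """Least-squares weights w such that x[t+dim] ~ window(t) @ w."""
    x = np.asarray(x, dtype=float)
    X = delay_embed(x[:-1], dim)  # every window that still has a next sample
    y = x[dim:]                   # the sample that follows each window
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def predict_next(x, w):
    """One-step-ahead forecast from the last len(w) samples."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(x[-len(w):], w))

# Sanity check on a sine wave (assumed test signal, not real data).
t = np.arange(200)
x = np.sin(0.3 * t)
w = fit_linear_ar(x, dim=2)
print("weights:", w)
print("forecast:", predict_next(x, w), "actual:", np.sin(0.3 * 200))
```

The number of lags (`dim`) is the embedding dimension here; picking it for a real nonlinear series is the hard part and is not addressed by this sketch.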
#transformer #attention #gpt #llm #embedding #vectordatabase #vectorsearch #nonlinear #autoregression #linearregression