Original Post
Computer scientists and mathematicians use functions to make predictions. Give me the input arguments, and I'll tell you the value of the function: that's a simple regression model. Give me the values of your function at the previous few arguments, and I'll tell you its values at the next ones: that's an autoregressive model. I can also write a Boolean function, implement a finite state machine, or build an event-driven datapath based on UML Statecharts that provides the computational power of parallel reactive programming, the same kind that runs your CPU and GPU. You can run von Neumann architecture code on it, and that is provably Turing complete.
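To make that concrete, here is a minimal Python sketch of the three "ordinary" kinds of function mentioned above. Everything in it is an illustrative assumption: the names (`linear_regression`, `autoregressive_next`, `rollout`, `parity_fsm`), the coefficients, and the choice of a parity checker as the state machine are made up for this example, not taken from any real model.

```python
def linear_regression(x, slope=2.0, intercept=1.0):
    """Regression: input argument in, function value out (toy coefficients)."""
    return slope * x + intercept


def autoregressive_next(history, coeffs=(0.5, 0.5)):
    """Autoregression: predict the next value from the last few values."""
    recent = history[-len(coeffs):]
    return sum(c * v for c, v in zip(coeffs, recent))


def rollout(history, steps, coeffs=(0.5, 0.5)):
    """Feed each prediction back in as input to get several future values."""
    values = list(history)
    for _ in range(steps):
        values.append(autoregressive_next(values, coeffs))
    return values[len(history):]


def parity_fsm(bits):
    """Finite state machine: tracks whether the number of 1s seen is even or odd."""
    transitions = {
        ("even", 0): "even",
        ("even", 1): "odd",
        ("odd", 0): "odd",
        ("odd", 1): "even",
    }
    state = "even"
    for bit in bits:
        state = transitions[(state, bit)]
    return state


if __name__ == "__main__":
    print(linear_regression(3))       # value at a given argument
    print(rollout([1.0, 3.0], 2))     # next values from previous ones
    print(parity_fsm([1, 0, 1, 1]))   # final state after reading the input
```

In all three cases the output is fully determined by a fixed rule and the given input, which is the "just a function" behavior the post is contrasting against.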
What's new about the Transformer is that it's not "just a function that returns a computed output for a given input." It's something NEW. In terms of computational power, it can GENERATE an unexpected end to the ceiling you're staring at and tell you that's exactly where the street starts!
I know of only one other mathematical concept that's capable of behaving the same way: a function implemented by a human software engineer.
#LLM #selfattention #computerscience #Turingcomplete