Original Post
To do AI arithmetic well on chips, all you need to master are CSA and PPA. Everything else is just Harel UML statecharts, which means you must already be an expert in one-hot FSMs (if you're doing HW at all!). Don't forget power planes, ground-decoupled crossovers, H-trees, FIFOs, and CDC, but those are trivial technicalities: they are not much patent-protected, and less intentional deception has been committed in education about them (thanks Intel, AMD, Nvidia, especially UofA 😅).
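For anyone who wants to poke at this in software first, here is a minimal bit-level sketch in Python. It assumes CSA means carry-save adder and PPA means parallel-prefix adder (Kogge-Stone is picked here as one common prefix topology; the post does not name one). The function names and the 32-bit width are illustrative choices, not anything from a real library or a real chip.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: reduce three non-negative ints to a (sum, carry) pair.

    No carry ripples between bit positions, so every bit is computed
    independently. The invariant is a + b + c == partial_sum + carry.
    """
    partial_sum = a ^ b ^ c                      # per-bit sum without carries
    carry = ((a & b) | (a & c) | (b & c)) << 1   # per-bit majority, moved up one place
    return partial_sum, carry


def kogge_stone_add(a, b, width=32):
    """Parallel-prefix (Kogge-Stone style) addition modulo 2**width.

    Assumes a carry-in of zero. Each loop pass models one prefix level,
    so there are log2(width) levels instead of a width-long ripple.
    """
    mask = (1 << width) - 1
    generate = a & b & mask        # bit i creates a carry on its own
    propagate = (a ^ b) & mask     # bit i passes an incoming carry along
    half_sum = propagate           # keep the original XOR for the final sum
    shift = 1
    while shift < width:
        # Both updates read the values from the previous level.
        generate, propagate = (
            (generate | (propagate & (generate << shift))) & mask,
            (propagate & (propagate << shift)) & mask,
        )
        shift <<= 1
    carries = (generate << 1) & mask   # carry into bit i is carry out of bit i-1
    return half_sum ^ carries


# Usage: compress three operands with the CSA, then resolve the carries
# once with the prefix adder, as a multiplier's reduction tree would.
s, c = carry_save_add(5, 9, 12)
total = kogge_stone_add(s, c)
```

The point of the pairing is that the CSA stage delays carry propagation, so only one real adder (the prefix one) sits at the end of the reduction.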
Keep in mind that IEEE 754 was considered the most complex datapath by its creators, so don't let the struggle with fused multiply-add put your AI chip ambitions down. Google Brain's bfloat16 format makes things easier to wrap your head around, and training in it has been shown to converge for Transformers.
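To see why bfloat16 is easier to reason about, here is a rough Python sketch of the conversion: bfloat16 is the top 16 bits of an IEEE 754 binary32 value (1 sign bit, 8 exponent bits, 7 fraction bits). The helper names are made up for illustration, round-to-nearest-even is an assumed rounding mode (hardware may truncate instead), and inputs are assumed to fit in float32.

```python
import struct


def float_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x, rounding to nearest even.

    Assumes x is representable as float32; struct raises OverflowError
    for finite values outside that range.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent_all_ones = (bits & 0x7F800000) == 0x7F800000
    if exponent_all_ones and (bits & 0x007FFFFF):
        # NaN: truncate and force a fraction bit so it stays a NaN.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit, so exact ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b):
    """Widen a bfloat16 pattern back to a Python float (exact, no rounding)."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


# Usage: 1.0 survives exactly, 3.14159 keeps only about 3 significant digits.
one_bits = float_to_bfloat16_bits(1.0)
approx_pi = bfloat16_bits_to_float(float_to_bfloat16_bits(3.14159))
```

Since the exponent field is the same width as float32, conversion is just a shift plus rounding, which is a big part of why the datapath is simpler than a full IEEE 754 half-precision one.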
PM me if you want to learn more. I can teach you online fully pipelined backpropagation, if you've got a huge chip area like Cerebras Systems.
#aichips #asic #dsp #bfloat16 #fma #mac #fpga #ai #silicon #chips