Let’s dive in and look at TPU internals from the bottom up.
I’ve been working with TPUs a lot recently and it’s fun to see how they had such different design philosophies compared to GPUs.
The main strongpoint for TPUs is in their scalability. This is achieved through a co-design of both the hardware side (e.g. energy efficiency and modularity) and the software side (e.g. XLA compiler).
To give a brief tldr on TPUs, it’s Google’s ASIC that focuses on two factors: extreme matmul throughput + energy efficiency.