Cerebras CS-3: wafer-scale, ~1e6 cores, 125 PFLOPS, 25 kW (30 kA)

- 4e12 transistors in 46,225 mm^2 (215 mm x 215 mm, about 8.65e7 transistors/mm^2); basically an entire wafer squared off.

- 5 nm process in current third gen.

- RAM is distributed across the wafer "IRAM"-style: on-chip SRAM sits next to the cores rather than in separate memory chips.

- Most of the server packaging space is taken up by DC power supplies.

- No decoupling capacitors.

- Executes dummy operations to keep current draw from changing too quickly, which prevents "inductive transient"-like backlash on the supply. [I cry in reversible computing.]

- Custom core, not ARM.

- "~20x faster" than NVIDIA GPUs.

- In customer production use now, including at US National Labs.

- 2 CS-3 systems (one WSE-3 each) per rack, so that's 50 kW/rack, which needs water cooling or rack-dedicated air handling.
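
The arithmetic behind the figures in these notes can be checked with a short script. This is only a sanity-check sketch: every input is copied from the notes above, the variable names are my own, and the "implied voltage" line assumes that all 25 kW is delivered at the quoted 30 kA on a single supply rail, which ignores conversion losses and any non-core loads.

```python
import math

# Inputs copied from the notes above (not measured, not vendor-verified).
transistors = 4e12
die_area_mm2 = 46_225
system_power_w = 25e3      # 25 kW per system
supply_current_a = 30e3    # 30 kA
systems_per_rack = 2

# Transistor density and the side length of the squared-off wafer.
density_per_mm2 = transistors / die_area_mm2
side_mm = math.sqrt(die_area_mm2)

# ASSUMPTION: all power flows at the quoted current on one rail (P = V * I).
implied_voltage_v = system_power_w / supply_current_a

# Rack power is just systems per rack times per-system power.
rack_power_w = systems_per_rack * system_power_w

print(f"density         : {density_per_mm2:.3g} transistors/mm^2")
print(f"die side        : {side_mm:.0f} mm")
print(f"implied voltage : {implied_voltage_v:.2f} V")
print(f"rack power      : {rack_power_w / 1e3:.0f} kW")
```

An implied rail voltage below 1 V is at least consistent with the 25 kW and 30 kA figures appearing together, and the density line reproduces the 8.65e7 transistors/mm^2 quoted above.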