LAXIMA.
Hacker News · 11h ago · 187 points · 53 comments

Performance per dollar is getting faster and cheaper

Wafer achieved 2626 tok/s/node on AMD MI355X for GLM-5.2, over 2x cheaper than Blackwell, using MXFP4 quantization and speculative decode optimizations.

Read the full story at wafer.ai

Why this is in the Signal

LAXIMA AI Signal curates the highest-velocity stories across Hacker News, GitHub trending, and new Hugging Face / Replicate model releases — quality-filtered, deduplicated, and refreshed every four hours. This item surfaced from Hacker News with 187 points (by latchkey). We link straight to the original source above — see the full live feed.
