Furiosa: 3.5x efficiency over H100s

https://news.ycombinator.com/rss Hits: 11
Summary

Built around our RNGD accelerators, NXT RNGD Server is an optimized system that delivers high performance on today’s most important AI workloads while fitting seamlessly into existing data center environments. With NXT RNGD Server, enterprises can move from experimentation to deployment faster than ever. The system ships with the Furiosa SDK and Furiosa LLM runtime preinstalled, so applications can serve immediately upon installation. We optimized the platform over standard PCIe interconnects, eliminating the need for proprietary fabrics or exotic infrastructure. Designed for compatibility, NXT RNGD Server runs at just 3 kW per system, allowing organizations to scale AI within the power and cooling limits of most modern facilities. This makes NXT RNGD Server a practical and cost-effective system to build out AI factories inside the data centers enterprises already operate.

Technical Specifications

Compute: Up to 8 × RNGD accelerators (4 petaFLOPS FP8 per server) with dual AMD EPYC processors. Supports BF16, FP8, INT8, and INT4.
Memory: 384 GB HBM3 (12 TB/s bandwidth) plus 1 TB DDR5 system memory
Storage: 2 × 960 GB NVMe M.2 (OS), 2 × 3.84 TB NVMe U.2 (internal)
Networking: 1G management NIC plus 2 × 25G data NICs
Power & Cooling: 3 kW system power, redundant 2,000 W Titanium PSUs, air-cooled
Security & Management: Secure Boot, TPM, BMC attestation, dual management paths (PCIe + I2C)
Software: Preinstalled Furiosa SDK and Furiosa LLM runtime with native Kubernetes and Helm integration

Real-world benefits and proven performance

NXT RNGD Server’s superior power efficiency significantly lowers businesses’ TCO. Enterprise customers can run advanced AI efficiently at scale within current infrastructure and power limitations – using on-prem servers or cloud data centers. This is crucial for leveraging existing infrastructure, since more than 80% of data centers today are air-cooled and operate at 8 kW per rack or less.
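As a rough sanity check on the quoted figures, the arithmetic behind the per-accelerator and per-rack numbers can be sketched in a few lines of Python. This is only an illustration, not a vendor calculation: it assumes the per-server totals divide evenly across the 8 accelerators, treats the 4 petaFLOPS and 3 kW values as peak or nominal figures, and assumes the whole 8 kW rack budget is available to servers (no allowance for switches, PDUs, or headroom). All constant and function names are made up for this example.

```python
# Figures quoted in the summary (per server, peak/nominal values).
ACCELERATORS_PER_SERVER = 8
SERVER_FP8_PFLOPS = 4.0   # "4 petaFLOPS FP8 per server"
SERVER_HBM_GB = 384       # "384 GB HBM3"
SERVER_HBM_TBPS = 12.0    # "12 TB/s bandwidth"
SERVER_POWER_KW = 3.0     # "3 kW system power"
RACK_BUDGET_KW = 8.0      # "8 kW per rack or less"


def per_accelerator(total, count=ACCELERATORS_PER_SERVER):
    """Assumption: the server total is split evenly across accelerators."""
    return total / count


def servers_per_rack(rack_kw=RACK_BUDGET_KW, server_kw=SERVER_POWER_KW):
    """Whole servers that fit in a rack power budget.

    Assumption: the entire budget goes to servers, with no headroom.
    """
    return int(rack_kw // server_kw)


fp8_per_card = per_accelerator(SERVER_FP8_PFLOPS)   # petaFLOPS per accelerator
hbm_per_card = per_accelerator(SERVER_HBM_GB)       # GB per accelerator
bw_per_card = per_accelerator(SERVER_HBM_TBPS)      # TB/s per accelerator

# Peak FP8 TFLOPS per watt of system power (not a measured efficiency).
peak_tflops_per_watt = (SERVER_FP8_PFLOPS * 1000) / (SERVER_POWER_KW * 1000)

rack_servers = servers_per_rack()
rack_pflops = rack_servers * SERVER_FP8_PFLOPS

print(f"Per accelerator: {fp8_per_card} PFLOPS FP8, "
      f"{hbm_per_card} GB HBM3, {bw_per_card} TB/s")
print(f"Peak FP8 per watt: {peak_tflops_per_watt:.2f} TFLOPS/W")
print(f"Servers per {RACK_BUDGET_KW} kW rack: {rack_servers} "
      f"({rack_pflops} PFLOPS FP8 peak)")
```

Under these assumptions an 8 kW rack holds two servers with about 2 kW to spare. The sketch says nothing about the 3.5x claim in the title, which depends on workload measurements not included in this summary.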
For businesses with sensitive workloads, regulatory complianc...

First seen: 2026-01-15 01:13

Last seen: 2026-01-15 11:16