CoreWeave Brings NVIDIA Vera Rubin NVL72 to Production

CoreWeave, Inc. (Nasdaq: CRWV) announced that it has successfully brought up and validated NVIDIA’s Vera Rubin NVL72 rack‑scale system on the CoreWeave Cloud. This milestone not only expands CoreWeave’s catalog of NVIDIA hardware but also showcases a suite of purpose‑built software and engineering innovations that make the platform practical for enterprise AI teams operating at production scale. By delivering a fully vetted, end‑to‑end solution, CoreWeave positions itself as the first AI‑cloud provider able to offer the performance, efficiency, and operational depth required for the emerging “agentic” era of AI, where models run continuously, reason across massive context windows, and demand unprecedented inference throughput.

CoreWeave Completes Bring‑Up and Validation of Vera Rubin NVL72

CoreWeave became the first AI cloud provider to stand up a fully validated Vera Rubin NVL72 rack. Each rack comprises 72 NVIDIA Rubin GPUs and 36 NVIDIA Vera CPUs, linked by a sixth‑generation NVIDIA NVLink fabric with 260 TB/s of bandwidth. In internal testing, the architecture delivers up to 10× better inference per watt, needs as few as one‑fourth as many GPUs, and cuts the cost per million tokens to one‑tenth, all compared with the prior‑generation NVIDIA Blackwell platform. CoreWeave completed system‑level validation of the entire rack‑scale architecture, confirming that the hardware operates reliably under production workloads. The validation process included stress‑testing under sustained mixed‑precision inference, measuring latency consistency across the full 1.6 Tb/s per‑GPU backend bandwidth, and verifying that the integrated BlueField‑4 DPUs maintain tenant isolation when multiple customers run concurrent jobs. These results indicate that the Vera Rubin stack can meet the stringent uptime and performance guarantees demanded by large‑scale model training and inference pipelines.
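
The per‑rack figures quoted above can be cross‑checked with simple arithmetic. The sketch below uses only the numbers stated in this article and assumes that the 260 TB/s NVLink figure is a whole‑rack aggregate; the constant and function names are illustrative, not part of any CoreWeave or NVIDIA interface.

```python
# Figures quoted in the article. Names are illustrative only.
GPUS_PER_RACK = 72
NVLINK_RACK_TBYTES_PER_S = 260.0    # assumed to be the rack aggregate, terabytes/s
BACKEND_PER_GPU_TBITS_PER_S = 1.6   # backend network bandwidth per GPU, terabits/s


def nvlink_per_gpu_tbytes() -> float:
    """NVLink bandwidth available per GPU, in terabytes per second."""
    return NVLINK_RACK_TBYTES_PER_S / GPUS_PER_RACK


def backend_per_rack_tbits() -> float:
    """Total backend network bandwidth leaving one rack, in terabits per second."""
    return BACKEND_PER_GPU_TBITS_PER_S * GPUS_PER_RACK


def backend_per_rack_tbytes() -> float:
    """Same total, converted from terabits to terabytes (8 bits per byte)."""
    return backend_per_rack_tbits() / 8


if __name__ == "__main__":
    print(f"NVLink per GPU:   {nvlink_per_gpu_tbytes():.2f} TB/s")
    print(f"Backend per rack: {backend_per_rack_tbits():.1f} Tb/s "
          f"({backend_per_rack_tbytes():.1f} TB/s)")
```

Under these assumptions, each GPU sees roughly 3.6 TB/s of NVLink bandwidth inside the rack, while the rack as a whole exposes about 115 Tb/s (about 14.4 TB/s) to the backend network. Note the unit difference: NVLink is quoted in terabytes per second and the backend network in terabits per second.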

Purpose‑Built Infrastructure Enables Rack‑Scale AI

To make Vera Rubin usable at scale, CoreWeave introduced several new hardware‑and‑software components that together form a cohesive “Mission Control” ecosystem:

Valvey – Software‑Defined Liquid Cooling – Valvey is CoreWeave's programmable per‑rack valve assembly that turns cooling from a passive mechanical system into a software‑defined, rack‑level control surface. Integrated into Mission Control, Valvey continuously monitors flow rate, temperature, pressure, and leak detection. When an anomaly is detected, the system can automatically isolate the affected loop, trigger an emergency shutdown, or schedule maintenance without disrupting neighboring racks that share the same cooling infrastructure. This dynamic approach not only protects hardware but also reduces overall cooling energy consumption.
Racky – Unified Rack Control Appliance – Racky aggregates power distribution, cooling metrics, and environmental sensors into a single management interface. By presenting each Vera Rubin rack as a cloud‑native resource rather than a bespoke hardware build, Racky enables automated provisioning, scaling, and health‑checking through standard APIs. Operators can therefore treat the rack like any other compute node in the CoreWeave fleet, applying policies for workload placement, fault tolerance, and cost optimization.
Multi‑Rail, Multi‑Plane Networking – CoreWeave supports both NVIDIA Quantum‑X800 InfiniBand and NVIDIA Spectrum‑X Ethernet with RoCE. The non‑blocking, multi‑rail fabric delivers 1.6 Tb/s of backend bandwidth per GPU and scales to configurations of hundreds of thousands of GPUs across two network tiers. This architecture is designed to give even the most bandwidth‑hungry models, such as those with trillion‑parameter counts or multi‑modal token windows, consistent, low‑latency data paths, reducing the network bottlenecks that traditionally limit inference throughput.
Secure, Scalable AI Cloud Operations – Integration of NVIDIA BlueField‑4 DPUs offloads critical infrastructure services, including virtual switching, storage acceleration, and security enforcement. By handling these functions at the DPU level, CoreWeave reduces host‑CPU overhead, improves data‑access latency, and strengthens tenant isolation for multi‑tenant AI workloads. The DPUs also enable fine‑grained policy enforcement for data residency and compliance, which is essential for regulated industries adopting generative AI.
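
The anomaly handling described for Valvey (monitor flow, temperature, pressure, and leaks, then isolate, shut down, or schedule maintenance) can be pictured as a small decision policy. CoreWeave has not published Valvey's interface or thresholds, so everything in the sketch below is hypothetical: the `LoopReading` schema, the `Limits` values, and the `decide` function are illustrative placeholders, and the ordering of the checks is one plausible choice rather than the actual implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    SCHEDULE_MAINTENANCE = "schedule_maintenance"
    ISOLATE_LOOP = "isolate_loop"
    EMERGENCY_SHUTDOWN = "emergency_shutdown"


@dataclass(frozen=True)
class LoopReading:
    """One telemetry sample for a rack cooling loop (hypothetical schema)."""
    flow_lpm: float       # coolant flow, litres per minute
    temp_c: float         # coolant temperature, degrees Celsius
    pressure_kpa: float   # loop pressure, kilopascals
    leak_detected: bool


@dataclass(frozen=True)
class Limits:
    """Placeholder thresholds for illustration; not vendor values."""
    min_flow_lpm: float = 20.0
    warn_temp_c: float = 45.0
    max_temp_c: float = 60.0
    max_pressure_kpa: float = 400.0


def decide(reading: LoopReading, limits: Limits = Limits()) -> Action:
    """Map one reading to the most severe action it warrants."""
    # A leak is contained by closing off only the affected loop,
    # so neighbouring racks on the shared plant keep running.
    if reading.leak_detected:
        return Action.ISOLATE_LOOP
    # Hard limits on temperature or pressure trigger a shutdown.
    if (reading.temp_c >= limits.max_temp_c
            or reading.pressure_kpa >= limits.max_pressure_kpa):
        return Action.EMERGENCY_SHUTDOWN
    # Degraded but safe conditions are queued for maintenance.
    if (reading.flow_lpm < limits.min_flow_lpm
            or reading.temp_c >= limits.warn_temp_c):
        return Action.SCHEDULE_MAINTENANCE
    return Action.NONE
```

For example, `decide(LoopReading(30.0, 50.0, 200.0, False))` returns `Action.SCHEDULE_MAINTENANCE` under these placeholder limits, because the coolant is warmer than the warning threshold but below the hard limit. A real controller would likely act on trends over many samples rather than a single reading.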

CoreWeave’s engineering team worked closely with Dell Technologies, which supplied PowerEdge XE9812 servers as the platform’s backbone, and Micron, which provided 7600 SSD storage that is among the first liquid‑cooled NVMe solutions deployed at rack scale. Dell’s servers were specifically engineered for the density and precision required by the 72‑GPU configuration, while Micron’s liquid‑cooled SSDs contribute to overall energy efficiency and thermal management, reinforcing the benefits introduced by Valvey.

Enterprise Relevance and Early Customer Feedback

Craig Falls, head of quantitative research at Jane Street, highlighted the importance of reliable, high‑performance infrastructure for large‑scale model training. He noted that CoreWeave’s “full cluster observability” and deep support have enabled faster training runs and shorter iteration cycles. Falls emphasized that the ability to monitor every node, network link, and cooling valve in real time translates directly into reduced time‑to‑insight for research teams that must iterate on trillion‑parameter models.

CoreWeave’s executive vice president of Product & Engineering, Chen Goldberg, framed the launch in the context of the “agentic era,” describing a shift toward workloads that reason continuously, scale unpredictably, and operate 24/7 in production. Goldberg argued that only infrastructure with the depth of engineering embodied by Valvey, Racky, and the multi‑rail fabric can sustain such demands without sacrificing efficiency or reliability.

NVIDIA’s vice president of Hyperscale and HPC, Ian Buck, endorsed the launch, stating that CoreWeave’s “full‑stack, end‑to‑end approach to Vera Rubin… is how the world’s most ambitious AI teams will push the next AI frontier.” The endorsement underscores the strategic alignment between NVIDIA’s hardware roadmap and CoreWeave’s operational expertise.

CoreWeave also cited record‑breaking MLPerf benchmark results, a Platinum ranking in both SemiAnalysis ClusterMAX 1.0 and 2.0, and a #1 ranking for inference speed and price‑performance for Moonshot AI’s Kimi K2.6 in independent benchmarking by Artificial Analysis. CoreWeave presents these independent results as support for the claimed efficiency gains, namely up to 10× better inference per watt and a ten‑fold reduction in cost per million tokens, under real‑world, production‑grade conditions.

Key Takeaways

CoreWeave is the first AI cloud provider to bring up and fully validate NVIDIA Vera Rubin NVL72, a rack‑scale system with 72 GPUs and 36 CPUs per rack.
The Vera Rubin platform promises up to 10× better inference per watt, as few as one‑fourth as many GPUs, and one‑tenth the cost per million tokens versus the prior‑generation NVIDIA Blackwell platform.
CoreWeave paired its own Valvey cooling and Racky unified control with multi‑rail networking and BlueField‑4 DPU security to make the rack‑scale system production‑ready for enterprise AI teams.

TechInsyte's Take

CoreWeave’s delivery of a validated Vera Rubin rack demonstrates that cloud providers can now offer the next generation of NVIDIA hardware with the operational depth required for large‑scale, continuous‑inference workloads. Buyers should monitor how quickly enterprise AI teams adopt the platform and whether the claimed efficiency gains translate into measurable cost reductions in real‑world deployments. Further details on pricing, availability, and integration timelines remain to be disclosed.

Source: Businesswire