Nvidia Spectrum-X Colossus: the Ethernet bet powering xAI and Anthropic

Nvidia Spectrum-X powers xAI’s 200,000-GPU Colossus site, which Anthropic now shares. The Ethernet vs InfiniBand bet, and what UK enterprises should learn from it.

Image: NVIDIA / xAI

The Spectrum-X deployment at Colossus is the 10 May infrastructure story that explains how the world’s biggest AI training site actually runs, and it is the technical follow-up to the 6 May Anthropic-SpaceX capacity deal. Nvidia confirmed that Spectrum-X Ethernet networking is the backbone connecting the 200,000-plus GPUs at xAI’s Memphis site, with RDMA over Converged Ethernet (RoCE) taking the place of the InfiniBand fabric most hyperscalers still default to.

Key facts
  • The Spectrum-X deployment at Colossus links over 200,000 Hopper and Blackwell GPUs at the xAI Memphis facility using Ethernet rather than InfiniBand.
  • Spectrum-X uses BlueField-3 SuperNICs and Spectrum-4 switches running RoCE (RDMA over Converged Ethernet) for near-InfiniBand performance.
  • xAI quotes a 1.6x improvement in AI workload performance over traditional Ethernet at Colossus scale.
  • Spectrum-X is Nvidia’s bet that Ethernet wins the AI fabric war as hyperscalers re-tool around standardised networking.
  • The networking choice was a precondition for Anthropic’s 6 May capacity deal, because Claude inference traffic is bursty and benefits from a fabric built to carry mixed training and inference workloads.

Nvidia Spectrum-X Colossus: the Ethernet bet behind 200,000 GPUs

AI training clusters are bandwidth-bound long before they are compute-bound. A 200,000-GPU site has to move terabytes per second between racks during back-propagation, gradient sync and checkpoint writes; if the network stalls, the GPUs sit idle, and idle Hopper or Blackwell GPUs are the most expensive paperweights in modern tech. Most hyperscalers historically solved this with InfiniBand – Mellanox’s specialised fabric, which Nvidia has owned since its Mellanox acquisition (announced in 2019, completed in 2020). xAI’s Colossus 1 chose differently. The site runs on Nvidia Spectrum-X, an Ethernet stack engineered to behave like InfiniBand for AI workloads: lossless transport, adaptive routing, and congestion control tuned for collective communication patterns.
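A back-of-envelope sketch shows why the network dominates. The inputs below (a 70-billion-parameter model, 16-bit gradients, 1,024 GPUs in one ring, 400Gb/s and 800Gb/s links per GPU) are illustrative assumptions, not xAI figures. The formula is the standard bandwidth cost of a ring all-reduce, in which each GPU sends and receives roughly 2(N-1)/N times the gradient payload.

```python
def ring_allreduce_seconds(param_count: int, bytes_per_param: int,
                           gpus: int, link_gbps: float) -> float:
    """Bandwidth-only lower bound for one ring all-reduce of the gradients.

    Ignores latency, congestion and overlap with compute, so real
    synchronisation times will be higher than this estimate.
    """
    payload_bytes = param_count * bytes_per_param
    # In a ring all-reduce each GPU sends (and receives) 2 * (N - 1) / N
    # of the full payload.
    bytes_on_wire = 2 * (gpus - 1) / gpus * payload_bytes
    link_bytes_per_second = link_gbps * 1e9 / 8
    return bytes_on_wire / link_bytes_per_second


# Hypothetical inputs chosen only for illustration.
PARAMS = 70_000_000_000   # 70B-parameter model (assumption)
BYTES_PER_PARAM = 2       # 16-bit gradients (assumption)
GPUS = 1024               # size of one data-parallel ring (assumption)

for gbps in (400, 800):
    seconds = ring_allreduce_seconds(PARAMS, BYTES_PER_PARAM, GPUS, gbps)
    print(f"{gbps} Gb/s link: about {seconds:.2f} s per full gradient sync")
```

Under these assumptions a full gradient sync costs roughly 5.6 seconds at 400Gb/s and 2.8 seconds at 800Gb/s before any congestion is counted, which is why a stalled or unevenly loaded fabric translates so directly into idle GPUs.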

The choice of Spectrum-X for Colossus matters because it sets a precedent. If Ethernet can carry an AI workload at 200,000-GPU scale, every hyperscaler operations team can defend a future build on standardised, multi-vendor networking. That is the bet Nvidia is making publicly: that Ethernet ends up the winning AI fabric. Because Nvidia also owns InfiniBand, Spectrum-X means the company gets paid whichever fabric wins. The deployment also explains why Anthropic was willing to relocate Claude inference to Colossus 1 within a single month – the networking pattern matches Anthropic’s existing AWS and Google Cloud expectations more closely than an InfiniBand-only site would.

Image: Anthropic

How Spectrum-X actually works at Colossus scale

Spectrum-X combines two Nvidia products: the Spectrum-4 800Gb/s Ethernet switch family, and BlueField-3 SuperNICs at every GPU server. BlueField-3 offloads the RDMA over Converged Ethernet (RoCE) stack from the host CPU, which means every Hopper or Blackwell server talks to its peers without the host kernel becoming the bottleneck. Spectrum-4 switches do per-packet adaptive routing – if one network path gets congested, traffic re-routes packet-by-packet rather than flow-by-flow, which is the trick that closes the historic performance gap with InfiniBand.
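The difference between flow-level and packet-level routing can be illustrated with a toy model. This is not Nvidia’s algorithm: the eight equal flows, four equal-cost paths, CRC32 hash for per-flow placement and least-loaded rule for per-packet placement are all hypothetical choices made for the sketch.

```python
import zlib


def per_flow_load(flows: dict[str, int], paths: int) -> list[int]:
    """Pin each whole flow to one path by hashing its ID (ECMP-style)."""
    load = [0] * paths
    for flow_id, packets in flows.items():
        load[zlib.crc32(flow_id.encode()) % paths] += packets
    return load


def per_packet_load(flows: dict[str, int], paths: int) -> list[int]:
    """Place every packet on whichever path currently carries the least."""
    load = [0] * paths
    for packets in flows.values():
        for _ in range(packets):
            load[load.index(min(load))] += 1
    return load


# Hypothetical traffic: eight large, equal flows between GPU pairs.
flows = {f"gpu{i}->gpu{i + 8}": 1000 for i in range(8)}
PATHS = 4  # assumed number of equal-cost paths

flow_load = per_flow_load(flows, PATHS)
packet_load = per_packet_load(flows, PATHS)

print("per-flow   load per path:", flow_load, "busiest:", max(flow_load))
print("per-packet load per path:", packet_load, "busiest:", max(packet_load))
```

With per-flow hashing, two or more large flows can land on the same path while another path sits empty, and a collective operation finishes only as fast as its busiest link. Per-packet placement evens the load out. The toy model ignores the cost of that choice: packets from one flow can arrive out of order, so the receiving NIC has to restore order.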

At Colossus 1 the practical outcome is 1.6x AI workload performance versus traditional Ethernet, per xAI’s own measurement. By marketing standards that figure is conservative; independent operations teams who have run mixed-vendor Ethernet AI clusters report similar or slightly better numbers when Spectrum-X is paired with BlueField-3 SuperNICs end to end. The architecture also means the data centre operator can mix workloads – training on one tier, inference on another – over the same physical fabric, which is exactly what Colossus needs to do now that Anthropic shares the building. The wider AI compute crunch context is covered in our Anthropic Amazon AWS £79B (about $100B) deal piece and the Samsung memory trillion-dollar valuation piece.

Video: Maximilian Schwarzmuller (AI infrastructure context)

Spectrum-X vs InfiniBand: what Nvidia is actually selling

Attribute | InfiniBand (Quantum-2) | Spectrum-X (Spectrum-4 + BlueField-3)
Port speed (current generation) | 400Gb/s NDR | 800Gb/s Ethernet
Network type | Proprietary InfiniBand | Standards-compatible Ethernet
Latency (typical) | ~1 microsecond | ~2 microseconds with RoCE
Multi-vendor optics / cables | Limited | Yes, standard Ethernet ecosystem
Adaptive routing | Yes, hardware-accelerated | Yes, packet-level via Spectrum-4
Workload mix support | Strong for training, weaker for inference / mixed | Designed for mixed training + inference (Colossus pattern)
Deployment example | Microsoft Azure, Meta SuperCluster | xAI Colossus, Anthropic-shared capacity

Image: NVIDIA / xAI

Why UK enterprises should pay attention to Nvidia Spectrum-X Colossus

The UK angle is not direct – no UK data centre runs at Colossus scale – but the implications matter for any UK enterprise designing AI inference infrastructure in 2026. If your team is choosing between an Nvidia DGX SuperPOD style InfiniBand reference architecture and a multi-vendor Ethernet design with Spectrum-X switches, the Colossus 1 deployment is the public proof point that Ethernet can carry the workload. UK colocation providers – Equinix LD7 and LD8, Pulsant, VIRTUS – have been quietly upgrading to 800GbE Spectrum-4-class switches since late 2025, in part because customers asked for the option.

The second UK implication is supply. Mellanox / Nvidia InfiniBand lead times remain long, and Cambridge-based UK research teams have repeatedly cited the wait as a reason to switch architectures mid-project. Because Spectrum-X uses standard Ethernet optics, its bill of materials is easier to source. That matters for HPC labs at Cambridge, Oxford and Edinburgh, and for European peers such as ETH Zürich, that are building AI clusters now and cannot wait six months for InfiniBand cables. The broader hardware story sits next to the Nvidia physical AI push and the Nvidia CUDA-Q quantum AI move we covered earlier this year.

The third implication is about who controls AI fabric standards. Cisco, Arista and Broadcom all sell competing 800GbE AI Ethernet platforms (the Ultra Ethernet Consortium is the industry body). Nvidia’s Spectrum-X is the most vertically integrated of the lot, because Nvidia also sells the GPU at the other end of the cable. The Colossus deployment is the loudest endorsement Spectrum-X has had, and it lands at a moment when UK and European AI infrastructure budgets are being signed off for the next 18 months. UK CIOs evaluating networking suppliers will be reading the Nvidia release closely.

MTW verdict

Nvidia Spectrum-X Colossus is the strongest public proof yet that Ethernet beats InfiniBand for next-generation AI fabric. UK enterprises designing inference clusters in 2026 should treat Spectrum-X as the default option and ask suppliers to justify any non-standard choice. The Anthropic capacity story is the headline; the networking story is the lesson.
