Rent GPU
NVIDIA H200

The NVIDIA H200 GPU supercharges generative AI and high-performance computing (HPC) workloads with game-changing performance and memory capacity, delivering massive throughput improvements for generative AI and large-scale inference workloads.

Memory

141GB
GPU RAM

Bandwidth

4.8TB/s
Memory Bandwidth

Form factor

SXM & NVL
Hopper Architecture

Interconnect

900GB/s
NVLink Switch

Accelerated
Workloads & Computing

The H200 is purpose-built for next-generation AI models, enabling faster inference, reduced latency, and lower total cost for large-scale deployments across data centers and cloud platforms: up to 1.6X faster GPT-3 175B inference, 1.9X faster Llama2 70B inference, and 110X faster high-performance computing.

  • Ultra Large Model Inference
  • High Throughput AI Serving
  • Generative AI at Scale
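As a rough sizing sketch for the workloads above (an illustration based on the 141GB figure on this page, not vendor guidance), you can estimate whether a model's weights alone fit in a single H200's memory. The parameter counts and byte widths below are assumptions for the example:

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GPU memory needed for model weights alone, in decimal GB.

    bytes_per_param: 2 for FP16/BF16, 1 for INT8/FP8 (assumed dtypes).
    Excludes KV cache, activations, and framework overhead, which add
    substantially to real-world usage.
    """
    return num_params * bytes_per_param / 1e9

H200_MEMORY_GB = 141  # per this page's spec

# Llama2 70B in FP16: weights alone are ~140 GB
llama2_70b_fp16 = weight_memory_gb(70e9)          # 140.0
fits = llama2_70b_fp16 <= H200_MEMORY_GB          # True, but with no headroom
print(f"Llama2 70B FP16 weights: {llama2_70b_fp16:.1f} GB, fits: {fits}")
```

Note that a fit this tight leaves no room for the KV cache, so serving a 70B model in FP16 on one card would still be impractical without quantization or multi-GPU sharding.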

Architecture Comparison

Select the form factor optimized for your specific workload

Feature          | H200 SXM                                              | H200 NVL
Architecture     | Hopper                                                | Hopper
Memory           | 141GB HBM3e                                           | 141GB HBM3e
Memory Bandwidth | 4.8 TB/s                                              | 4.8 TB/s
Recommended For  | Better Tensor Cores for advanced generative AI & LLM inference | High-performance computing
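To see why the 4.8 TB/s memory bandwidth in the table matters for inference, here is a simplified roofline estimate (an illustrative sketch, not a benchmark): during single-stream autoregressive decoding, every generated token must read all model weights from memory, so bandwidth caps the token rate.

```python
def decode_tokens_per_s(weight_bytes: float, mem_bw_bytes_per_s: float) -> float:
    """Bandwidth-bound upper limit on single-stream decode throughput.

    Assumes each generated token streams all weights from HBM once and
    ignores KV-cache reads, compute time, and batching, so real systems
    land below this ceiling (or above it with large batches).
    """
    return mem_bw_bytes_per_s / weight_bytes

# Example with assumed numbers: ~140 GB of FP16 weights (a 70B-class model)
# against the H200's 4.8 TB/s bandwidth.
ceiling = decode_tokens_per_s(140e9, 4.8e12)
print(f"Bandwidth-bound ceiling: ~{ceiling:.1f} tokens/s per stream")
```

This back-of-envelope bound is one reason memory bandwidth, not just FLOPS, is the headline spec for LLM serving hardware.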