GPU Buying Guide India 2026
Buying data-centre GPUs in India is not as straightforward as ordering from an e-commerce site. Between import regulations, GST compliance, hardware compatibility, and the rapidly evolving product landscape, there are many decisions to get right. This guide walks you through the GPU options available in India in 2026 and how to choose the right one for your workload.
The Current NVIDIA Data-Centre GPU Lineup
As of 2026, NVIDIA’s data-centre GPU portfolio for AI and HPC includes several product lines. Here is a practical breakdown:
NVIDIA H100 (Hopper Architecture)
- Form factors: SXM5 (700W, NVLink) and PCIe Gen5 (350W)
- VRAM: 80 GB HBM3, 3.35 TB/s bandwidth (SXM5)
- Peak performance: 3,958 TFLOPS FP8 / 1,979 TFLOPS FP16 (SXM5, with sparsity)
- Best for: LLM training, large-scale distributed training, high-throughput inference
- Indian pricing range: INR 25-35 lakh per GPU (SXM5), INR 18-25 lakh per GPU (PCIe)
The H100 remains the workhorse GPU for serious AI training in 2026. The SXM5 variant is strictly superior for multi-GPU workloads due to NVLink, but the PCIe variant offers a lower entry point for single- or dual-GPU inference setups.
NVIDIA H200 (Hopper Architecture, Memory Upgrade)
- Form factor: SXM5 (700W)
- VRAM: 141 GB HBM3e, 4.8 TB/s bandwidth
- Peak performance: Same compute as H100 SXM5
- Best for: Large-context LLM inference, models that need more VRAM per GPU
- Indian pricing range: INR 35-45 lakh per GPU
The H200 uses the same Hopper GPU die as the H100 but raises the memory from 80 GB to 141 GB of faster HBM3e. This is particularly valuable for inference workloads where KV-cache size limits context length and batch size.
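The KV-cache constraint is easy to quantify. A rough sketch, assuming a dense Llama-style transformer with grouped-query attention and an FP16 cache (the model figures below are illustrative, roughly Llama-2-70B-shaped):

```python
def kv_cache_gb(num_layers: int, num_kv_heads: int, head_dim: int,
                context_len: int, batch_size: int,
                bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache size: two tensors (K and V) per layer,
    each of shape [batch, context, kv_heads, head_dim]."""
    total_bytes = (2 * num_layers * num_kv_heads * head_dim
                   * context_len * batch_size * bytes_per_elem)
    return total_bytes / 1e9

# 80 layers, 8 KV heads (GQA), head_dim 128, 32K context, batch of 8
print(round(kv_cache_gb(80, 8, 128, 32_768, 8), 1))  # prints 85.9
```

At these settings the cache alone exceeds a single H100's 80 GB but fits comfortably in two H200s, which is exactly the scenario the extra HBM3e targets.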
NVIDIA A100 (Ampere Architecture)
- Form factors: SXM4 (400W, NVLink) and PCIe Gen4 (300W)
- VRAM: 40 GB or 80 GB HBM2e, 2.0 TB/s bandwidth (80 GB SXM)
- Peak performance: 312 TFLOPS FP16 (SXM, dense; 624 TFLOPS with sparsity)
- Best for: Budget-friendly training, inference, HPC simulation
- Indian pricing range: INR 8-15 lakh per GPU (80 GB, depending on source)
The A100 is the previous generation and now available at significant discounts. It remains an excellent GPU for fine-tuning, inference, and HPC workloads where the latest FP8 support is not needed.
NVIDIA L40S (Ada Lovelace Architecture)
- Form factor: PCIe Gen4 (350W)
- VRAM: 48 GB GDDR6 with ECC, 864 GB/s bandwidth
- Peak performance: 733 TFLOPS FP8 / 362 TFLOPS FP16 (dense)
- Best for: Inference, visual computing, video processing, multi-tenant GPU serving
- Indian pricing range: INR 10-15 lakh per GPU
The L40S is a versatile GPU for inference workloads that do not require HBM bandwidth or NVLink connectivity. Its 48 GB of VRAM can hold the weights of models up to roughly 24B parameters in FP16, or up to ~70B parameters with 4-bit quantisation.
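Those limits follow from simple weight arithmetic: FP16 needs 2 bytes per parameter, 4-bit quantisation roughly 0.5. A quick sketch (weights only, ignoring KV cache, activations, and runtime overhead):

```python
def weights_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone (no KV cache,
    activations, or framework overhead)."""
    return params_billion * 1e9 * bytes_per_param / 1e9

print(weights_gb(24, 2.0))  # 48.0 -> a 24B FP16 model fills the L40S exactly
print(weights_gb(70, 0.5))  # 35.0 -> a 4-bit 70B model leaves ~13 GB headroom
```

In practice you should budget 10-20% extra on top of weights, so treat these as upper bounds rather than serving targets.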
NVIDIA L4 (Ada Lovelace Architecture)
- Form factor: PCIe Gen4 Low-Profile (72W)
- VRAM: 24 GB GDDR6, 300 GB/s bandwidth
- Peak performance: 242 TOPS INT8 (with sparsity)
- Best for: Low-power inference, video transcoding, AI at the edge
- Indian pricing range: INR 3-5 lakh per GPU
The L4 is a low-profile, single-slot GPU that fits in standard 1U and 2U servers. Its low TDP makes it ideal for dense inference deployments where many GPUs are packed into a single rack.
How to Choose: Decision Framework
For LLM Training (pre-training or full fine-tuning)
- First choice: H100 SXM5 in an 8-GPU HGX configuration with InfiniBand for multi-node
- Budget alternative: A100 80 GB SXM in an HGX configuration (2-3x slower per GPU but significantly cheaper)
For LLM Inference (serving models in production)
- Models up to 13B parameters: L40S (48 GB VRAM, good throughput)
- Models 13B-70B parameters: H100 PCIe or A100 80 GB (sufficient VRAM, high bandwidth)
- Models 70B+ parameters: H100/H200 SXM5 with NVLink (multi-GPU model parallelism)
For LoRA / QLoRA Fine-Tuning
- Models up to 13B: A single A100 40/80 GB or L40S
- Models around 70B: 2-4x A100 80 GB or 2x H100
For Computer Vision / Video Processing
- L40S is the sweet spot. Good VRAM, solid FP16 performance, and hardware video encode/decode
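The framework above can be sketched as a simple lookup. The thresholds are this guide's rules of thumb, not hard limits; real sizing also depends on context length, batch size, and quantisation:

```python
def recommend_gpu(workload: str, params_billion: float = 0) -> str:
    """Map this guide's decision framework to a starting recommendation.
    Thresholds mirror the lists above; treat the output as a first cut."""
    if workload == "training":
        return "H100 SXM5 (8-GPU HGX) or A100 80 GB SXM on a budget"
    if workload == "inference":
        if params_billion <= 13:
            return "L40S"
        if params_billion <= 70:
            return "H100 PCIe or A100 80 GB"
        return "H100/H200 SXM5 with NVLink"
    if workload == "lora":
        return ("A100 40/80 GB or L40S" if params_billion <= 13
                else "2-4x A100 80 GB or 2x H100")
    if workload == "vision":
        return "L40S"
    raise ValueError(f"unknown workload: {workload}")

print(recommend_gpu("inference", 34))  # prints: H100 PCIe or A100 80 GB
```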
Procurement Considerations in India
Import and Taxation
- Data-centre GPUs are typically imported. BCD is 0% under ITA-1 for servers and computing equipment.
- IGST of 18% applies but is creditable as input tax credit for GST-registered businesses.
- Ensure BIS compliance for all imported IT equipment.
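The ITC mechanics are worth a worked example: IGST is paid upfront at import but comes back as credit, so cash outlay and effective cost diverge. A minimal sketch, assuming 0% BCD, 18% IGST, full ITC eligibility, and an illustrative price (not tax advice):

```python
def landed_cost(base_price_inr: float, igst_rate: float = 0.18,
                gst_registered: bool = True) -> tuple[float, float]:
    """Cash outlay at import vs effective cost after input tax credit.
    Assumes 0% BCD under ITA-1 and that the full IGST is creditable."""
    igst = base_price_inr * igst_rate
    cash_outlay = base_price_inr + igst
    effective_cost = base_price_inr if gst_registered else cash_outlay
    return cash_outlay, effective_cost

# A hypothetical INR 30 lakh H100: pay ~35.4 lakh at import,
# but the effective cost is 30 lakh once ITC is claimed.
outlay, effective = landed_cost(30_00_000)
print(f"outlay: {outlay:,.0f}  effective: {effective:,.0f}")
```

For an unregistered buyer the 18% is a real cost, which is one more reason GPU procurement should run through a GST-registered entity.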
Warranty and Support
- NVIDIA does not sell GPUs directly to end customers. You must buy through authorised channel partners or OEM server vendors (Supermicro, Dell, HPE, ASUS, Gigabyte).
- GPU warranty is tied to the server OEM warranty. Ensure your supplier provides valid warranty documentation.
- Ask about advance replacement policies for GPU failures. Waiting 4-6 weeks for an RMA can be devastating for a training cluster.
Compatibility Verification
- GPUs must be matched with compatible server chassis, motherboards, and power supplies.
- SXM5 GPUs require specific HGX baseboards. They cannot be installed in arbitrary servers.
- PCIe GPUs need sufficient slot clearance (full-height, full-length, typically double-width) and adequate PSU wattage.
- Verify BIOS and firmware compatibility with your chosen GPU model.
Lead Times
- Expect 4-8 weeks for most GPU server configurations in India.
- H200 availability remains tighter than H100 as of early 2026.
- Plan procurement 2-3 months ahead of your deployment date.
What Not to Buy
- Consumer GPUs (RTX 4090, RTX 5090) for production workloads. They lack ECC memory, carry restrictive EULA terms for data-centre use, top out at 24-32 GB of VRAM, and offer no hot-swap or remote management capabilities.
- Older-generation GPUs (V100, T4) for new training deployments. The performance gap versus current-gen is too large to justify, even at low prices.
- Unbranded or grey-market GPUs: warranty and support will be non-existent, and BIS compliance is questionable.
Working with rawcompute.in
We simplify GPU procurement in India:
- All major NVIDIA data-centre GPUs in stock or short lead times
- Complete server configurations from Supermicro, Dell, ASUS, and Gigabyte
- GST-compliant invoicing with proper HSN codes for smooth ITC claims
- BIS-registered products for hassle-free customs clearance
- Pre-configured and tested before delivery. Your server arrives ready to boot
- Colocation coordination with Tier 3 data centres across India
Contact us with your workload description and we will recommend the optimal GPU configuration and provide an all-inclusive Indian pricing quote.
Need this for your infrastructure? Let's talk.
We help teams across India spec and deploy hardware.