What an H100-hour actually costs, and why neoclouds price roughly 65% lower

Mar 8, 2026

4 min

When teams compare GPU pricing across providers, the gap is striking. An H100 GPU-hour costs $2.43 to $2.63 on most neoclouds. The same H100 hour costs $7.43 to $7.52 on AWS, GCP, or Azure. That is roughly a 3x difference for what looks like the same silicon. The instinct is to assume the cheaper price hides something, like worse availability, hidden fees, or weaker support. Usually it does not. The gap is structural.
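
To make the size of that gap concrete, here is a minimal sketch that turns the quoted ranges into a multiple and a percentage. Using the midpoint of each range is an assumption made for illustration, and the function names are hypothetical.

```python
# Price ranges quoted in the article, in USD per H100 GPU-hour.
NEOCLOUD = (2.43, 2.63)
HYPERSCALER = (7.43, 7.52)


def midpoint(price_range):
    """Return the midpoint of a (low, high) price range."""
    low, high = price_range
    return (low + high) / 2


def price_gap(cheap, expensive):
    """Return (multiple, fraction_lower) comparing two hourly prices."""
    multiple = expensive / cheap
    fraction_lower = 1 - cheap / expensive
    return multiple, fraction_lower


multiple, fraction_lower = price_gap(midpoint(NEOCLOUD), midpoint(HYPERSCALER))
print(f"Hyperscaler price is {multiple:.2f}x the neocloud price")
print(f"Neocloud price is {fraction_lower:.0%} lower")
```

At the midpoints this works out to a little under 3x, which is the same thing as the neocloud price being about two-thirds lower.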

What you're actually paying for in an H100 hour

Roughly four costs sit inside a GPU-hour. The first is the GPU itself, which is the amortized capital cost of the H100 plus its share of the host server, NIC, and rack hardware. The second is the facility, including power, cooling, real estate, network ingress, and the depreciation on the physical building. The third is the operating platform, meaning provisioning systems, monitoring, support, and the engineers running it. The fourth is the broader platform, which for hyperscalers means everything else they offer, from IAM to managed analytics to global region replication.

On a neocloud, the first three are the entire cost stack. On a hyperscaler, the fourth is allocated across every compute hour they sell, including your GPU instances. That is where the gap comes from.

Why hyperscalers can't just match the price

A hyperscaler running an H100 fleet at $2.50 per hour would need to fundamentally restructure its margin model. The platform investments that justify hyperscaler pricing, including hundreds of managed services, global region coverage, and multi-decade compliance certifications, are not optional for the customers using them. They are load-bearing for the rest of the cloud business.

For workloads that use that broader platform, the higher GPU price is a reasonable allocation. For workloads that only need the GPU, you are subsidizing services you do not use.

Spot pricing tells a different story

The clearest signal of true marginal cost shows up in spot pricing. H100 SXM5 spot capacity is currently available around $0.80 to $1.03 per hour across major neoclouds. Hyperscalers run spot H100s in roughly the same range when they have inventory, which they often do not because their on-demand contracts consume their capacity first.

Spot prices reveal the floor of what a GPU-hour really costs. On-demand list pricing reveals what each provider believes it can charge above that floor.
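
A rough way to see the markup over that floor is to compare each on-demand range with the spot range. The sketch below uses only the figures quoted in this article and assumes, for illustration, that range midpoints are representative and that the spot and on-demand figures describe comparable H100 hardware. The function name is hypothetical.

```python
# Figures quoted in the article, in USD per H100 GPU-hour.
SPOT = (0.80, 1.03)
ON_DEMAND = {
    "neocloud": (2.43, 2.63),
    "hyperscaler": (7.43, 7.52),
}


def markup_over_floor(on_demand_range, spot_range):
    """Return (dollars_above_floor, multiple_of_floor) using range midpoints."""
    floor = sum(spot_range) / 2
    price = sum(on_demand_range) / 2
    return price - floor, price / floor


markups = {name: markup_over_floor(rng, SPOT) for name, rng in ON_DEMAND.items()}
for name, (dollars, multiple) in markups.items():
    print(f"{name}: ${dollars:.2f}/hr above spot, {multiple:.1f}x the floor")
```

Under these assumptions the neocloud on-demand price sits at a bit under 3x the spot floor, while the hyperscaler price sits at roughly 8x.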

Reserved capacity changes the calculation again

For sustained workloads, meaning anything running more than 50% of the month, reserved or committed pricing brings the per-hour cost down further. Neoclouds typically offer 30 to 50% discounts on annual commitments, and pricing on 3-year terms can land below $1.50 per hour for H100 SXM. Hyperscalers offer reserved discounts too, but they apply to a higher starting price.
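
The utilization threshold follows from simple arithmetic. The sketch below assumes a commitment is billed for every hour of the month whether or not the GPU is busy, while on-demand is billed only for hours used. Under that assumption a discount of d breaks even at a utilization of 1 - d. The 730-hour month and the function names are illustrative assumptions, not provider terms.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def break_even_utilization(discount):
    """Utilization above which a fully billed commitment beats on-demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


def monthly_cost(on_demand_rate, utilization, discount=None):
    """Monthly cost per GPU: on-demand if discount is None, else reserved."""
    if discount is None:
        # On-demand: pay only for the hours actually used.
        return on_demand_rate * HOURS_PER_MONTH * utilization
    # Reserved: pay the discounted rate for every hour in the month.
    return on_demand_rate * (1 - discount) * HOURS_PER_MONTH


for discount in (0.30, 0.50):
    threshold = break_even_utilization(discount)
    print(f"{discount:.0%} discount: break-even at {threshold:.0%} utilization")
```

With the 30 to 50% discounts mentioned above, the break-even point falls between 70% and 50% utilization, so the deeper the discount, the less busy a GPU needs to be for a commitment to pay off.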

When the gap doesn't matter

If a workload runs intermittently, depends heavily on adjacent managed services, or carries regulatory requirements that only the hyperscalers have certified, the hyperscaler premium may be worth paying. Those are real reasons to pay more.

If a workload runs continuously, uses GPUs as its dominant cost line, and treats compute as fungible, which describes most foundation model training and a growing share of production inference, paying 3x for a platform it does not use is hard to justify.
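
For a continuously running workload, the hourly gap compounds into a monthly budget line. The sketch below is illustrative only: the 8-GPU node, the 730-hour month, and the use of range midpoints are assumptions, not figures from any provider.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours
GPUS = 8               # assumption: one hypothetical 8-GPU node

# Midpoints of the on-demand ranges quoted earlier, in USD per GPU-hour.
RATES = {
    "neocloud": (2.43 + 2.63) / 2,
    "hyperscaler": (7.43 + 7.52) / 2,
}


def monthly_bill(rate_per_gpu_hour, gpus=GPUS, hours=HOURS_PER_MONTH):
    """Monthly cost of running `gpus` GPUs for `hours` hours each."""
    return rate_per_gpu_hour * gpus * hours


bills = {name: monthly_bill(rate) for name, rate in RATES.items()}
for name, bill in bills.items():
    print(f"{name}: ${bill:,.0f} per month")
print(f"difference: ${bills['hyperscaler'] - bills['neocloud']:,.0f} per month")
```

Under these assumptions, a single node running around the clock costs roughly $14,800 per month at the neocloud midpoint and roughly $43,700 at the hyperscaler midpoint.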

Where Aolani Cloud fits

Aolani Cloud prices GPU capacity at neocloud rates because we run a neocloud cost structure. We do not carry the overhead of a hundred adjacent managed services, so we do not charge for them. The savings show up in your training budget, your inference unit economics, and your ability to run more experiments per dollar.
