NVIDIA Nemotron 3 Ultra Now Available on Amazon SageMaker JumpStart

What's happening

On June 4, 2026, NVIDIA announced day-zero availability of Nemotron 3 Ultra on Amazon SageMaker JumpStart, the model hub within Amazon SageMaker, the managed machine learning service from Amazon Web Services. Day-zero availability means the model became accessible to enterprise developers on AWS infrastructure at the same time as its general release, rather than after a delayed integration period. The model is optimized for NVFP4, NVIDIA's 4-bit floating-point format, and is classified as an open model, allowing enterprises to deploy it within AWS environments without proprietary access restrictions.

Nemotron 3 Ultra is built on a hybrid Transformer-Mamba Mixture-of-Experts architecture with 550 billion total parameters, of which 55 billion are active at inference time. The model supports a context window of up to one million tokens, which matters for long-document processing and complex agentic workflows. According to NVIDIA, the model delivers 5x faster inference and up to 30% lower costs than reference benchmarks for agentic workloads, with the NVFP4 format contributing to those efficiency gains.
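As a rough illustration of what those parameter counts imply, the sketch below computes the share of parameters active at inference time and a back-of-envelope estimate of weight storage. The parameter counts come from the figures above. The bit widths are assumptions: 4 bits per weight for NVFP4 and 16 bits for a BF16 baseline, ignoring scale-factor overhead and any layers kept at higher precision. The function names are illustrative only, and the output is an estimate, not a published specification.

```python
# Figures taken from the article.
TOTAL_PARAMS = 550e9
ACTIVE_PARAMS = 55e9

# Assumption: 4 bits per weight for NVFP4, 16 bits for a BF16 baseline.
# Scale-factor overhead and non-quantized layers are ignored.
BITS_NVFP4 = 4
BITS_BF16 = 16


def active_fraction(total: float, active: float) -> float:
    """Share of a Mixture-of-Experts model's parameters that are active."""
    return active / total


def weight_gigabytes(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active share of parameters: {share:.0%}")
    print(f"weights at 4 bits:  {weight_gigabytes(TOTAL_PARAMS, BITS_NVFP4):.0f} GB")
    print(f"weights at 16 bits: {weight_gigabytes(TOTAL_PARAMS, BITS_BF16):.0f} GB")
```

Under these assumptions, all 550 billion weights still have to be stored even though only a tenth are active at a time, which is why the storage estimate uses the total count rather than the active count.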

Why it matters for markets

The commercial availability of Nemotron 3 Ultra on SageMaker JumpStart represents a concrete expansion of NVIDIA's software and model ecosystem beyond its hardware business. NVIDIA, which reported revenue of $253.49 billion and carries a market capitalization of approximately $5.30 trillion, has increasingly positioned its AI model portfolio alongside its GPU hardware as a mechanism for deepening enterprise relationships. By achieving day-zero placement on AWS — the cloud platform that anchors Amazon's $2.73 trillion market capitalization — NVIDIA gains distribution reach across one of the largest enterprise cloud customer bases without requiring those customers to manage independent infrastructure deployments.

For AWS, the integration adds a frontier-class reasoning model to SageMaker JumpStart's catalog at the moment of the model's public availability, a competitive differentiator in the enterprise AI platform market. The claimed 30% cost reduction for agentic workloads is commercially significant for enterprises evaluating total cost of ownership for large-scale AI deployments, because agentic applications (which involve multi-step autonomous task execution) tend to generate high inference volumes. The 5x inference speed improvement, if sustained at production scale, would further improve the economics of latency-sensitive enterprise applications. Amazon reported $742.78 billion in total revenue, with AWS representing its highest-margin segment, making AI model availability a strategic lever for cloud consumption growth.

Sectors and assets to watch

The primary tickers directly implicated in this development are NVDA and AMZN. NVIDIA's model availability on AWS infrastructure reinforces the company's dual role as both a hardware supplier and an AI software ecosystem provider, a strategic positioning that extends its commercial surface area beyond GPU sales cycles. Amazon's AWS division, which competes directly with Microsoft Azure and Google Cloud in the enterprise AI platform market, benefits from the addition of a high-specification open model to its managed deployment service.

More broadly, the enterprise AI infrastructure sector warrants attention, as day-zero model deployments on major cloud platforms have become a competitive benchmark. Companies operating managed AI deployment services, model hosting platforms, and agentic workflow tooling are affected by the cadence and specifications of model releases from major AI developers. The 550-billion-parameter scale of Nemotron 3 Ultra, combined with an MoE architecture that activates only 55 billion parameters per token, reflects an industry-wide trend toward efficiency-optimized large models. That dynamic is relevant to cloud providers, AI chip designers, and enterprise software vendors building on top of foundation models.

What to watch next

Key developments to monitor include enterprise adoption rates of Nemotron 3 Ultra through SageMaker JumpStart and whether NVIDIA extends day-zero availability agreements to other major cloud platforms such as Microsoft Azure or Google Cloud. The commercial terms governing open model distribution on managed cloud services — including pricing structures, usage-based billing, and any fine-tuning or customization capabilities offered through SageMaker — will shape how broadly enterprises integrate the model into production agentic workflows. Additionally, NVIDIA's cadence of future Nemotron model releases and their placement within cloud marketplaces will serve as an indicator of how the company is prioritizing software ecosystem expansion alongside its hardware roadmap.