STAC-AI™ LANG6 on NVIDIA GB200 Grace Blackwell

Type: Unaudited

Specs: STAC-AI™ LANG6

NVIDIA recently performed two STAC-AI™ LANG6 (Inference-Only) benchmark runs on a Nebius 4xGB200VM, showcasing the Grace Blackwell GB200 NVL72.

Stack under test:

  • Llama-3.1-8B-Instruct
  • STAC-AI Pack for NVIDIA TensorRT-LLM
  • TensorRT-LLM release v0.19.0
  • Hardware stack: Nebius 4xGB200VM (NVIDIA GB200 NVL72)

This particular report is for the Llama-3.1-8B-Instruct model.

The companion report for Llama-3.1-70B-Instruct can be found here: https://www.STACresearch.com/NVDA250714b

Note: None of the results have been audited by STAC.

Premium subscribers have access to extensive visualizations of all test results, the detailed configuration information for the solutions tested, the code used in this testing, and the ability to run these same benchmarks (as is, or with other models and data sets) in the privacy of their own labs. To learn about subscription options, please contact us.


The STAC-AI Working Group focuses on benchmarking artificial intelligence (AI) technologies in finance. This includes deep learning, large language models (LLMs), and other AI-driven approaches that help firms unlock new efficiencies and insights.