- SUT ID: NVDA250610b
- STAC-AI
STAC-AI™ LANG6 on NVIDIA GH200 Grace Hopper Superchip
Type: Unaudited
Specs: STAC-AI™ LANG6
NVIDIA recently performed two STAC-AI™ LANG6 (Inference-Only) benchmark runs on a QuantaGrid S74G-2U server equipped with the NVIDIA GH200 Grace Hopper Superchip.
Stack under test:
- Llama-3.1-70B-Instruct
- STAC-AI Pack for NVIDIA TensorRT-LLM
- TensorRT-LLM release v0.17.0
- Hardware stack – NVIDIA GH200 Grace Hopper Superchip
This particular report is for the Llama-3.1-70B-Instruct model.
The companion report for Llama-3.1-8B-Instruct can be found here: https://www.STACresearch.com/NVDA250610a
Note: None of the results have been audited by STAC.
Premium subscribers have access to extensive visualizations of all test results, the detailed configuration information for the solutions tested, the code used in this testing, and the ability to run these same benchmarks – as is, or with other models and data sets – in the privacy of their own labs. To learn about subscription options, please contact us.