LLM inference on the Paperspace cloud with 8x NVIDIA H100 GPUs running the Llama-3.1-70B-Instruct model

Type: Audited

Specs: STAC-AI™ LANG6

Stack under test:

  • STAC-AI™ Reference Implementation for vLLM OpenAI Server
  • vllm/vllm-openai:v0.5.5 Docker Container
  • Python 3.11.7, CUDA 12.2
  • Ubuntu Linux 20.04.3 LTS
  • Paperspace Cloud H100x8 VM
    • 8 x NVIDIA H100-80GB-HBM3 GPUs
    • 2 x Intel® Xeon® Platinum 8458P CPU @ 2.70 GHz
    • 1.6TB of virtualized memory
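A stack like the one above is typically launched by running the vLLM OpenAI-compatible server container with tensor parallelism across all eight GPUs. The command below is a minimal illustrative sketch, not the audited STAC configuration; port, cache path, and shared-memory settings are assumptions.

```shell
# Launch the vLLM OpenAI server container (v0.5.5) with the model
# sharded across 8 GPUs via tensor parallelism.
# Assumes the NVIDIA Container Toolkit is installed and a Hugging Face
# token with access to the gated Llama weights is in $HF_TOKEN.
docker run --gpus all \
    --shm-size 16g \
    -p 8000:8000 \
    -v "$HOME/.cache/huggingface:/root/.cache/huggingface" \
    -e HF_TOKEN="$HF_TOKEN" \
    vllm/vllm-openai:v0.5.5 \
    --model meta-llama/Llama-3.1-70B-Instruct \
    --tensor-parallel-size 8
```

Once running, the server exposes an OpenAI-compatible API on port 8000 (e.g. `POST /v1/chat/completions`), which is the interface the STAC-AI reference implementation drives.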
