- SUT ID: STAC240903a
- STAC-ML
LLM inferencing on Paperspace cloud with 8x NVIDIA A100 GPUs running the Llama-3.1-8B-Instruct model
Type: Audited
Specs: STAC-AI™ LANG6
Stack under test:
- STAC-AI™ Reference Implementation for vLLM OpenAI Server
- vllm/vllm-openai:v0.5.5 Docker Container
- Python 3.11.7, CUDA 12.2
- Ubuntu Linux 20.04.3 LTS
- Paperspace Cloud A100-80Gx8 VM
- 8 x NVIDIA A100-SXM4-80GB GPUs
- 2 x Intel® Xeon® Gold 6342 CPU @ 2.80 GHz
- 720 GiB of virtualized memory
Please log in to see file attachments. If you are not registered, you may register for no charge.