STAC-ML Markets (Inference) on GroqNode with 8x GroqCard Accelerators

Purpose-built ML chip delivers first STAC-ML Markets (Inference) results.

31 October 2022

STAC recently completed the first STAC-ML Markets (Inference) Benchmark tests on a vendor-optimized stack. Groq's solution included 8 GroqCard accelerators in a GroqNode. The STAC Report is now available here.

STAC-ML Markets (Inference) is the technology benchmark standard for solutions that may be used to run inference on real-time market data. Designed by quants and technologists from some of the world's leading financial firms, the benchmark reports the performance, resource efficiency, and quality of any technology stack capable of performing inference using the provided models. In this project, we ran the fixed-window benchmark suite (codenamed Sumaco).

The stack consisted of the STAC-ML Pack for GroqWare (Rev A) using the GroqWare™ SDK 0.9.0.5 and Python 3.8.15 on a GroqNode™ server with 8 x GroqCard™ Accelerators.

Groq wished to highlight several results from this report:

For small model LSTM_A, across 1, 2 and 4 simultaneously running model instances (NMI):

Worst case 99th percentile latency was 56.4 μsec
(STAC-ML.Markets.Inf.S.LSTM_A.4.LAT.v1)
99th percentile latencies varied by 1% (from 55.9 to 56.4 μsec)
(STAC-ML.Markets.Inf.S.LSTM_A.[1,2,4].LAT.v1)
The widest spread from minimum to 99th percentile latency was 6% (53.4 to 56.4 μsec)
(STAC-ML.Markets.Inf.S.LSTM_A.4.LAT.v1)

For large model LSTM_C, across all NMI tested:

Worst case 99th percentile latency was 2.77 ms
(STAC-ML.Markets.Inf.S.LSTM_C.8.LAT.v1)
99th percentile latencies varied by 2% (from 2.72 to 2.77 ms)
(STAC-ML.Markets.Inf.S.LSTM_C.[1,2,4,8].LAT.v1)
The widest spread from minimum to 99th percentile latency was 3% (2.68 to 2.77 ms)
(STAC-ML.Markets.Inf.S.LSTM_C.8.LAT.v1)
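
The percentages quoted above can be checked against the latency endpoints given alongside them. The following is a minimal sketch, not taken from the STAC Report: the helper name `pct_spread` is hypothetical, and it assumes each percentage is the increase from the lower figure to the higher one, expressed relative to the lower figure. Under that assumption, the rounded results match the quoted 1%, 6%, 2% and 3%.

```python
def pct_spread(low, high):
    """Relative increase from low to high, as a percentage of low.

    Assumption: the lower figure is the denominator. The article does not
    state how its percentages were computed.
    """
    return (high - low) / low * 100.0


# Latency endpoints quoted in the article
# (LSTM_A in microseconds, LSTM_C in milliseconds).
figures = {
    "LSTM_A 99th percentile across NMI": (55.9, 56.4),
    "LSTM_A minimum to 99th percentile": (53.4, 56.4),
    "LSTM_C 99th percentile across NMI": (2.72, 2.77),
    "LSTM_C minimum to 99th percentile": (2.68, 2.77),
}

for label, (low, high) in figures.items():
    print(f"{label}: {pct_spread(low, high):.1f}%")
```

Because the result is a ratio, the unit (μsec or ms) does not matter as long as both endpoints of a pair use the same one.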

For details, please see the report at the link above. Premium subscribers have access to the code used in this project, as well as micro-detailed configuration information and detailed performance, quality and efficiency analysis and visualizations for the solution. To learn about subscription options, please contact us.
