STAC Research Note: STAC-ML Markets (Inference) Processor Comparisons in Azure
We examined six SUTs: large Microsoft Azure VMs featuring processors from three vendors. For each processor, we tested both latency- and throughput-optimized configurations of the STAC-ML naive inference implementation, with ONNX as the inference engine. We used the STAC-ML Test Harness to find the optimal configuration for each SUT quickly.
There was no single “winner”: performance and business use-case analyses showed that each VM carved out a price-performance niche at different points along the latency, throughput, and cost spectra. We also examined the consistency of performance and the extent to which ONNX multithreading boosted performance.
The full set of reports compared in this note is:
- Ampere Altra (latency optimized): www.STACresearch.com/STAC221006a
- Ampere Altra (throughput optimized): www.STACresearch.com/STAC221006b
- Intel Ice Lake (latency optimized): www.STACresearch.com/STAC221007a
- Intel Ice Lake (throughput optimized): www.STACresearch.com/STAC221007b
- AMD Milan (latency optimized): www.STACresearch.com/STAC221008a
- AMD Milan (throughput optimized): www.STACresearch.com/STAC221008b
The note and accompanying data tables (below) detail our findings.