STAC-ML™ Markets (Inference): Naive Implementation with ONNX on an Azure E104is v5 VM (104 Intel® Xeon® Platinum 8370C vCPUs, 672 GiB memory), Latency-Optimized Configuration

STAC-ML™ Markets (Inference) Benchmarks (Sumaco suite)

  • STAC-ML Markets (Inference) Naive Implementation (Compatibility Rev B)
  • Driver and Inference Engine
    • Python 3.8.10
    • ONNX runtime 1.12.1
    • NumPy 1.23.3
  • Ubuntu Linux 20.04.5 LTS
    • Based on a standard image provided by Microsoft® Azure
    • No OS tuning performed
  • A Microsoft® Azure Standard E104is v5 VM
    • Isolated Instance – No other VMs on the system
    • 104 Intel® Xeon® Platinum 8370C (Ice Lake) vCPUs @ 2.8GHz
    • 672 GiB of memory
    • 256 GiB Premium SSD LRS
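STAC-ML Markets (Inference) reports latency statistics gathered over many single-inference runs. As a rough illustration of that style of measurement, the sketch below times repeated calls to a stand-in inference function with `time.perf_counter` and summarizes the samples as percentiles with NumPy. The `run_inference` function is a hypothetical placeholder, not the STAC harness or the actual ONNX Runtime session used in the report; warm-up counts and sample sizes are illustrative only.

```python
import time
import numpy as np

def run_inference(x):
    # Hypothetical stand-in for a real ONNX Runtime session.run() call.
    return np.tanh(x @ np.ones((x.shape[1], 8), dtype=np.float32))

def measure_latencies(n_runs=1000, warmup=100):
    x = np.random.rand(1, 64).astype(np.float32)
    for _ in range(warmup):
        run_inference(x)          # warm-up runs, excluded from statistics
    samples = np.empty(n_runs)
    for i in range(n_runs):
        t0 = time.perf_counter()
        run_inference(x)
        samples[i] = time.perf_counter() - t0
    return samples * 1e6          # convert seconds to microseconds

lat = measure_latencies()
for p in (50, 90, 99):
    print(f"p{p}: {np.percentile(lat, p):.1f} us")
```

Reporting percentiles rather than a mean is the important design choice here: for latency-sensitive trading workloads, tail behavior (e.g., the 99th percentile) often matters more than the average.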

Though no vendors had a hand in optimizing the system's performance, one vendor did help make the project happen: Microsoft provided credits in Azure so that this research could be completed. We are grateful for their help.

This report is just one in a series that explores latency and throughput optimization of ML inference workloads across different processor architectures in Microsoft Azure, all under similar software stacks. Together, these STAC Reports illustrate the kinds of insights STAC-ML benchmarks can provide while underscoring the sensitivity of performance results to the objectives of the solution architect.

The full set of reports in this series also includes:

A research note that compares the different SUTs and explores some of their differences will be available soon at www.STACresearch.com/ml.


The use of machine learning (ML) to develop models is now commonplace in trading and investment. Whether the business imperative is reducing time to market for new algorithms, improving model quality, or reducing costs, financial firms have to offload major aspects of model development to machines in order to continue competing in the markets.