STAC Report: Intel GPUs under STAC-A2 (derivatives risk)
SUT with 4 Intel GPUs in Dell liquid-cooled server sets records.
18 October 2023
STAC recently performed the first STAC-A2 Benchmark tests on Intel’s Data Center GPUs. This is also the first liquid-cooled system with publicly disclosed STAC-A2 audit results. The stack under test (SUT) featured 4 x Intel® Data Center GPU Max 1550 accelerators and 2 x Intel® Xeon® Platinum 8468 processors at 2.1 GHz in a Dell PowerEdge XE9640 system with 32 GiB of memory, running Ubuntu Linux 22.04.3 LTS. The server had patches applied to mitigate the Spectre and Meltdown security vulnerabilities.
STAC-A2 is the technology benchmark standard based on financial market risk analysis. Designed by quants and technologists from some of the world's largest banks, STAC-A2 reports the performance, scaling, quality, and resource efficiency of any technology stack that is able to handle the workload (Monte Carlo estimation of Heston-based Greeks for a path-dependent, multi-asset option with early exercise).
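To give a feel for the kind of computation named above, here is a minimal Python sketch of Monte Carlo pricing under Heston dynamics with a bump-and-revalue delta. It is heavily simplified and is not the STAC-A2 specification or the code used in this project: it prices a single-asset European call, whereas the benchmark workload is a path-dependent, multi-asset option with early exercise. The function names, the full-truncation Euler discretization, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def heston_call_price(s0, v0, strike, rate, kappa, theta, xi, rho,
                      maturity, n_steps, n_paths, seed):
    """Monte Carlo price of a European call under Heston dynamics.

    Illustrative only. Uses a full-truncation Euler scheme: negative
    variance is floored at zero inside the drift and diffusion terms.
    """
    rng = random.Random(seed)  # fixed seed, so bumped runs share random numbers
    dt = maturity / n_steps
    sqrt_dt = math.sqrt(dt)
    rho_bar = math.sqrt(1.0 - rho * rho)
    payoff_sum = 0.0
    for _ in range(n_paths):
        log_s, v = math.log(s0), v0
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)                        # asset shock
            z2 = rho * z1 + rho_bar * rng.gauss(0.0, 1.0)   # correlated variance shock
            v_pos = max(v, 0.0)
            vol = math.sqrt(v_pos)
            log_s += (rate - 0.5 * v_pos) * dt + vol * sqrt_dt * z1
            v += kappa * (theta - v_pos) * dt + xi * vol * sqrt_dt * z2
        payoff_sum += max(math.exp(log_s) - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def bump_delta(s0, bump=0.01, **kwargs):
    """Central-difference delta: reprice with the spot bumped up and down."""
    up = heston_call_price(s0 * (1 + bump), **kwargs)
    down = heston_call_price(s0 * (1 - bump), **kwargs)
    return (up - down) / (2 * s0 * bump)


if __name__ == "__main__":
    # Hypothetical parameters, not taken from the benchmark.
    params = dict(v0=0.04, strike=100.0, rate=0.02, kappa=1.5, theta=0.04,
                  xi=0.3, rho=-0.7, maturity=1.0, n_steps=50, n_paths=2000,
                  seed=42)
    print("price:", heston_call_price(100.0, **params))
    print("delta:", bump_delta(100.0, **params))
```

Each Greek estimated this way needs at least one extra full repricing, which is one reason workloads of this kind are compute-heavy and lend themselves to parallel hardware.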
Intel wished to highlight several results from this report:
- Compared to all publicly reported solutions to date, this solution set numerous performance and efficiency records, including (but not limited to):
  - The fastest warm [1] (0.405 s) and cold [2] (1.09 s) times in the large problem size benchmarks
  - A space efficiency [3] (238 options / hour / cu. in.) 2.3x better than the previous best result
  - The best energy efficiency [4] (314,493 options / kWh), 1.0% better than the previous record
- Compared to a system using 8 x GPUs (SUT ID NVDA230721), this solution delivered:
  - 78% of the throughput [5]
  - 98% of the speed in warm runs in the baseline problem size benchmark [6]
  - 1.7x the speed in cold runs of the large problem size benchmark [2]
  - 1.2x the speed in warm runs of the large problem size benchmark [1]
  - 4.3x the space efficiency [3]
- Compared to a solution featuring 2 x Intel® Xeon® Platinum 8480+ (Sapphire Rapids) processors and the previous Intel STAC Pack (SUT ID INTC230524), this solution delivered:
  - 4.0x and 2.0x better performance in warm [6] and cold [7] runs (respectively) of the baseline problem size benchmarks
  - 6.9x and 4.4x better performance in warm [1] and cold [2] runs (respectively) of the large problem size benchmarks
  - 7.9x better throughput [5]
  - 7.0x improvement in space efficiency [3] and 2.7x improvement in energy efficiency [4]
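As a rough reading aid, the ratios above can be turned back into approximate figures for the comparison systems. The Python sketch below assumes that "Nx the speed" means the comparison run took N times as long, and that "Nx the efficiency" means the comparison value is the new value divided by N. The helper names are hypothetical, and because the quoted ratios are rounded, the outputs are only estimates, not results from any STAC report.

```python
# Figures quoted in this article for the Intel GPU solution.
WARM_LARGE_S = 0.405      # warm time, large problem size (seconds)
COLD_LARGE_S = 1.09       # cold time, large problem size (seconds)
SPACE_EFF = 238           # options / hour / cu. in.
ENERGY_EFF = 314_493      # options / kWh


def implied_time(new_time_s, speedup):
    """Time implied for the comparison system, assuming speedup = old / new."""
    return new_time_s * speedup


def implied_rate(new_rate, ratio):
    """Rate implied for the comparison system, assuming ratio = new / old."""
    return new_rate / ratio


if __name__ == "__main__":
    # Versus SUT ID INTC230524 (6.9x warm, 4.4x cold, large problem size).
    print("INTC230524 warm, large (s):", round(implied_time(WARM_LARGE_S, 6.9), 2))
    print("INTC230524 cold, large (s):", round(implied_time(COLD_LARGE_S, 4.4), 2))
    # Versus SUT ID NVDA230721 (4.3x the space efficiency).
    print("NVDA230721 space efficiency:", round(implied_rate(SPACE_EFF, 4.3), 1))
    # Previous energy-efficiency record, 1.0% lower than the new one.
    print("Previous energy record:", round(implied_rate(ENERGY_EFF, 1.01)))
```

The audited values in the reports themselves take precedence over anything back-calculated this way.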
For details, please see the report at the link above. Premium subscribers have access to the code used in this project as well as the micro-detailed configuration information for the solution. To learn about subscription options, please contact us.