STAC Report: LMS ÜberNIC CXL with 10GbE and 25GbE under STAC-N1

New records from the first tests of a pure FPGA-based or CXL-based UDP stack.

27 June 2024

Liquid-Markets-Solutions (“LMS”) recently asked STAC to perform STAC-N1 Benchmark tests on a solution with their ÜberNIC firmware, software, and API on a BittWare FPGA board. This is the first test of an all FPGA-based UDP stack and the first test of a stack using CXL.

We tested two SUTs: one using the ÜberNIC firmware for 10GbE and the other for 25GbE. In addition, we report two message sizes for each. Standard STAC-N1 reports cover tests with 264-byte messages, while results for other message sizes remain in the STAC Vault. LMS asked STAC to make the 66-byte results from the audit public as well.

The four resulting STAC Reports are available here:

STAC-N1 measures the performance of a host network stack (server, OS, drivers, host adapter) using a market-data-style workload. The stack under test comprised two BittWare IA-440i Agilex FPGA cards with LMS’s ÜberNIC CXL firmware and user-space kernel-bypass libraries. Each card was installed on a Supermicro X13SEI-F single-socket motherboard (no chassis) with 32 GiB of ECC RAM and a single 32-core Intel® Xeon® Gold 6558Q, water-cooled with an Alphacool Eiszeit 2000 and an Alphacool ES Jet LGA 4677 2U CPU cooler, running Red Hat Enterprise Linux 9.3. Eight (8) of the 32 CPU cores were disabled to achieve a higher clock rate on the active cores. The two BittWare cards were connected by LC multimode fiber cable and FS.com SFP+ transceivers. For the 25GbE SUT, forward error correction was turned off.

LMS wished to highlight the following results for SupplyToReceive latency using the standard 264-byte message size:

10GbE

  • Compared to all publicly disclosed STAC-N1 results to date on 10GbE systems that used UDP, this solution exhibited:
    • The lowest mean and 99th percentile and lowest maximum (in a tie) at the base rate of 100k msg/sec (STAC.N1.β1.PINGPONG.LAT1)
    • The lowest mean, 99th percentile, maximum, and standard deviation at the highest rate tested, 1 million msg/sec (STAC.N1.β1.PINGPONG.LAT2), across SUTs with the same or lower highest rate tested

25GbE

  • Compared to all publicly disclosed STAC-N1 results to date on 25GbE systems with Enterprise Class CPUs that used UDP, this solution exhibited:
    • The lowest maximum latency and a tie for the lowest mean, median, 99th percentile, and standard deviation at the base rate of 100k msg/sec (STAC.N1.β1.PINGPONG.LAT1)

Since these are the first 66-byte results to go public, no public comparisons can be made. However, LMS wished to highlight the following results for SupplyToReceive latency using 66-byte messages:

10GbE

  • Mean of 1.9 µsec, 99th percentile of 2.0 µsec, and max of 7.9 µsec at the base rate of 100k msg/sec (STAC.N1.β1.PINGPONG.LAT1)
  • Mean of 1.9 µsec, 99th percentile of 2.1 µsec, and max of 6.2 µsec at the highest rate tested, 1 million msg/sec (STAC.N1.β1.PINGPONG.LAT2)

25GbE

  • Mean of 1.9 µsec, 99th percentile of 2.3 µsec, and max of 6.6 µsec at the base rate of 100k msg/sec (STAC.N1.β1.PINGPONG.LAT1)
  • Mean of 1.9 µsec, 99th percentile of 2.2 µsec, and max of 6.4 µsec at the highest rate tested, 1 million msg/sec (STAC.N1.β1.PINGPONG.LAT2)
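
For readers less familiar with the statistics quoted above, the following is a minimal sketch of how a mean, median, 99th percentile, maximum, and standard deviation can be computed from a set of per-message latency samples. It is illustrative only: the function name `summarize_latencies`, the sample values, and the nearest-rank percentile method are assumptions made for this example, not the STAC-N1 methodology or data from the audit.

```python
import statistics


def summarize_latencies(samples_usec):
    """Summarize latency samples (microseconds) with common statistics.

    Hypothetical helper for illustration; not part of any STAC tooling.
    """
    if not samples_usec:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_usec)
    n = len(ordered)
    # Nearest-rank 99th percentile: the smallest sample such that at least
    # 99% of all samples are at or below it. Integer math avoids float error.
    rank = (99 * n + 99) // 100  # ceil(0.99 * n)
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
        # Population standard deviation over the full sample set.
        "stdev": statistics.pstdev(ordered),
    }


# Made-up samples: mostly tight latencies plus two slower outliers.
samples = [1.9] * 98 + [2.0, 7.9]
summary = summarize_latencies(samples)
for name, value in summary.items():
    print(f"{name}: {value:.3f} usec")
```

The sketch shows why a report lists the maximum separately from the mean and 99th percentile: a single outlier moves the maximum a great deal while leaving the other figures almost unchanged.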

The reports analyze latencies and other metrics in detail. Premium subscribers also have access to a detailed STAC Configuration Disclosure and a complete sosreport for each system.

If your firm does not have access to these materials, please take a minute to learn about subscription options.
