STAC Report: Spark with Levyx Xenon under STAC-A3

32.7x the throughput of a larger Hadoop Streaming cluster.

26 June 2017

STAC recently performed STAC-A3 Benchmark tests on a stack consisting of Apache Spark 1.6.1 with Levyx Xenon 3.2.0 on 5 x Google Cloud Platform n1-standard-64 nodes with 3TB local SSD each. The STAC Pack (benchmark implementation code) for this STAC-A3 project was initially authored by Cloudera and Intel and then enhanced by Levyx, who moved it to the Spark DataFrame API and optimized the data layout.

The report is available here. The configuration details, implementation code, and test-harness software are available to firms with premium subscriptions.

STAC-A3 simulates workloads common in the refinement and backtesting of trading strategies. These are rate-limiting steps in a firm's response to changing market conditions, so the performance of backtesting infrastructure has a top-line impact. Several trading firms drove the requirements for STAC-A3 in order to facilitate software and hardware comparisons. Like other STAC Benchmarks, STAC-A3 is agnostic to architecture.

The STAC Report contains dozens of results. Levyx wished to highlight the following:

  • Compared to a prior implementation using Hadoop Streaming on a 14-node bare-metal cluster (SUT ID INTC151220-VI), this solution with 5 nodes had 32.7x the throughput in the SWEEP benchmark (STACA3.β1.SWEEP.SPEED).

This report also contains a proposed but unofficial analysis of the overall price-performance of the solution with respect to cloud costs.

STAC-A3 work is ongoing. If you'd like to be involved, please let us know at the STAC Backtesting SIG site.

For information on premium subscriptions, please contact us.