STAC Summit, 13 Nov 2019, London

STAC Summits

STAC Summits bring together CTOs and other industry leaders responsible for solution architecture, infrastructure engineering, application development, machine learning/deep learning engineering, data engineering, and operational intelligence to discuss important technical challenges in trading and investment.

Come to hear leading ideas and exchange views with your peers.


WHERE
Leonardo Royal Hotel London City
8-14 Coopers Row, London, EC3N 2BQ

Agenda


 

Sessions are tagged by theme: Big Compute, Fast Compute, Big Data, Fast Data.

 


STAC Update: Big Compute   [Big Compute]
 

Michel will discuss the latest research and activities in compute-intensive workloads such as deep learning and derivatives risk.

Enabling value extraction from limit order book data   [Big Data, Big Compute]
 

Applying Machine Learning/Deep Learning techniques to the highly complex, non-linear, and pattern-rich world of financial market microstructure is a significant engineering challenge. One problem is that the data is vast and fast-flowing. Collecting and processing the data into a large number of features to feed the algorithms—not just once, but nearly continuously—is heavy lifting. And continuous retraining of ML and (especially) DL models requires massive, parallelized compute resources. In this talk, Hugh will share the lessons BMLL has learned about tackling these challenges while building a third-party research platform focused on global limit order book data. Starting with feature engineering examples from futures data, Hugh will articulate the philosophy behind BMLL’s architectural approach, as well as the key elements of a highly scalable processing and analytics pipeline that leverages Apache projects in the cloud to enable AI on order book data.
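
To make the feature-engineering step concrete, here is a minimal C++ sketch of two classic order book features, spread and top-of-book imbalance. The data structures, feature choices, and values are illustrative assumptions only, not BMLL's actual pipeline.

// Illustrative sketch: turning a limit order book snapshot into model
// features. Structures and features are assumptions, not BMLL's code.
#include <vector>
#include <cstdio>

struct Level {
    double price;
    double size;  // resting quantity at this price level
};

struct BookSnapshot {
    std::vector<Level> bids;  // sorted best (highest price) first
    std::vector<Level> asks;  // sorted best (lowest price) first
};

struct Features {
    double mid;        // midpoint price
    double spread;     // best ask minus best bid
    double imbalance;  // (bid size - ask size) / (bid size + ask size)
};

Features compute_features(const BookSnapshot& book) {
    const Level& bb = book.bids.front();
    const Level& ba = book.asks.front();
    Features f;
    f.mid = 0.5 * (bb.price + ba.price);
    f.spread = ba.price - bb.price;
    f.imbalance = (bb.size - ba.size) / (bb.size + ba.size);
    return f;
}

int main() {
    BookSnapshot snap{{{99.98, 500}, {99.97, 300}},
                      {{100.00, 200}, {100.01, 400}}};
    Features f = compute_features(snap);
    std::printf("mid=%.3f spread=%.3f imbalance=%.3f\n",
                f.mid, f.spread, f.imbalance);
    // At scale this runs per book update across thousands of instruments,
    // which is why the abstract stresses parallel pipelines.
    return 0;
}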

Innovation Roundup   [Big Data, Big Compute]
  "FPGAs accelerating AI for financial services"
    Ronak Shah, Director, FPGA AI & Acceleration Marketing Strategy, Intel
  "Simplifying Deep Learning Infrastructure with Dell EMC"
    Boni Bruno, Chief Solutions Architect, Dell EMC
  "Accelerating Applications with the Xilinx Quantitative Finance Library"
    John Courtney, Product Specialist, Xilinx
  "No More Tiers: Radical Flash Savings to Redefine AI and Market Data Storage Infrastructure"
    Jeff Denworth, VP Products and Co-Founder, VAST Data

 

"ML Oops": How data simulation can help your quants avoid modeling errorsBig Data   Big Compute   
 

Today's abundant tools make it possible to generate new ML and DL models for trading and investment in record time. Supply the tools with a dataset, an objective function, and quality metrics, and before long they will dutifully respond with the best result gleaned from millions of iterations. The trouble is, these modeling techniques can lead to flawed outcomes, such as models biased to a specific dataset or "superstitious" models that rely on hidden or erroneous assumptions. Moreover, the flaws may go unnoticed because many models are inherently hard to interpret or explain. In the best case, such modeling errors result in unnecessary re-work downstream in the model development (or ML Ops) pipeline. In the worst case, they make it into production and are discovered the hard way. So detecting these errors early in the ML Ops pipeline is desirable. In this talk, Michel will argue that data simulators can be useful tools for such detection. By controlling characteristics of training and test data that are uncontrolled in real market data (where ground truth is unknown), simulators allow the user to expose vulnerabilities in modeling techniques that may otherwise remain obscured. Through examples that use the prototype STAC AI data generator for benchmarks on deep time series data, Michel will show how data simulation can test fidelity of models to known inputs, test model interpretability, and help avoid costly mistakes.
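
A toy version of the idea Michel describes is sketched below: simulate a time series whose ground truth is known by construction, then check whether a fitted model recovers it. Everything here (the planted AR(1) signal, coefficients, names) is an illustrative assumption, unrelated to the actual STAC AI data generator.

// Sketch: planted-ground-truth simulation for model checking.
#include <random>
#include <vector>
#include <cstdio>

int main() {
    const double true_beta = 0.8;  // ground truth, known by construction
    const int n = 100000;

    std::mt19937 rng(42);
    std::normal_distribution<double> noise(0.0, 1.0);

    // Simulate x[t] = true_beta * x[t-1] + noise.
    std::vector<double> x(n);
    x[0] = noise(rng);
    for (int t = 1; t < n; ++t)
        x[t] = true_beta * x[t - 1] + noise(rng);

    // "Fit" the simplest possible model: least-squares estimate of beta.
    double sxy = 0.0, sxx = 0.0;
    for (int t = 1; t < n; ++t) {
        sxy += x[t - 1] * x[t];
        sxx += x[t - 1] * x[t - 1];
    }
    double beta_hat = sxy / sxx;

    // Because ground truth is known, the model's error is directly
    // measurable, unlike with real market data, where the generating
    // process is unknown.
    std::printf("true beta %.3f, estimated beta %.3f\n", true_beta, beta_hat);
    return 0;
}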

Why a single C++ API makes sense for heterogeneous compute infrastructure   [Fast Data, Big Data, Fast Compute, Big Compute]
 

The future of computing in finance certainly seems heterogeneous. It’s a fair bet that in the coming years, optimizing the latency, throughput, and cost efficiency of a given workload will increasingly require some combination of scalar (CPU), vector (GPU), matrix (AI), and spatial (FPGA) processors. These architectures require an efficient software programming model to deliver performance. As we often discuss at STAC, high-level languages like Python or frameworks like Spark make it relatively easy to deal with this diversity, since they allow for highly optimized platform-specific libraries under the covers. But what about programs written in C++? Many performance-obsessed programmers prefer C++ because it provides the greatest exposure to the capabilities of underlying hardware. With that exposure, however, comes a requirement to code to the specifics of the hardware, making coding difficult and non-portable. Furthermore, attempts to program FPGAs in C++ have historically suffered in terms of performance. In short, no one has yet come up with a market-winning answer to the tension between performance, portability, and ease of use. However, as a provider of all of the processor types above, Intel has developed a point of view on the best approach to these challenges. Graham will articulate that point of view as well as outline how Intel is putting it into practice through its oneAPI initiative (including architecture, tooling, and development status).
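
For flavor, here is a minimal vector-add sketch in SYCL, the open standard that oneAPI's DPC++ compiler builds on, showing the "single C++ source, many devices" idea: the kernel is unchanged whether it runs on a CPU, GPU, or FPGA, and only the device selection differs. This is an illustrative SYCL 2020-style sketch, not Intel's reference code, and compiler support for this syntax varies.

// One C++ kernel source, dispatched to whatever device is selected.
#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

int main() {
    const size_t n = 1024;
    std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

    // Swap the selector to target a CPU, GPU, or FPGA; the kernel is unchanged.
    sycl::queue q{sycl::default_selector_v};
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << "\n";

    {
        sycl::buffer<float> A(a.data(), sycl::range<1>(n));
        sycl::buffer<float> B(b.data(), sycl::range<1>(n));
        sycl::buffer<float> C(c.data(), sycl::range<1>(n));

        q.submit([&](sycl::handler& h) {
            sycl::accessor ra(A, h, sycl::read_only);
            sycl::accessor rb(B, h, sycl::read_only);
            sycl::accessor wc(C, h, sycl::write_only);
            h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
                wc[i] = ra[i] + rb[i];  // element-wise add
            });
        });
    }  // buffer destructors copy results back to the host vectors

    std::cout << "c[0] = " << c[0] << "\n";  // expect 3
    return 0;
}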

STAC Update: Big Data   [Big Data]
 

Michel will discuss the latest research and activities in data-intensive workloads such as tick analytics and backtesting.

Innovation Roundup   [Big Data]
  "NVMe-oF for High Frequency Trading"
    VR Satish, CTO, Pavilion Data
  "New optimization strategies for in-memory analytics using Optane persistent memory"
    Glenn Wright, Systems Architect, Kx Systems
  "Enabling Low Latency Market Data Applications with Storage Class Memory"
    Charles Fan, Co-founder and CEO, MemVerge
  "Simpler historical updates management"
    Benjamin Filippi, Chief Product Officer, QuasarDB
  "Scale-in Software for Capital Markets Computing"
    Matt Meinel, SVP of Sales, Business Development and Solutions Architecture, Levyx

 

Drinking from the firehose: streaming ingest benchmarks   [Big Data, Fast Data]
 

Most of the fast data that flows through a financial organization winds up as big data. That is, it's captured in a database somewhere for analysis, either immediately or later. But the process of ingesting high-volume streaming data and making it available through visualizations or query interfaces is challenging and getting more so. This session will examine empirical data from two examples in this problem domain. First, Peter will present a benchmarking project on a visualization system designed specifically for real-time streaming data. Then he and Edouard will present a proof of concept of database ingest tests using event-driven data streams, which will be proposed for consideration by the STAC-M3 Working Group.
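
One metric such ingest tests typically track is the delay between an event's creation and the moment it becomes visible to queries. The sketch below shows the measurement shape; the Database type is a hypothetical in-process stand-in, whereas a real benchmark would target an actual store over its client API, where ingest and visibility are asynchronous.

// Sketch: measuring "ingest-to-queryable" latency against a toy store.
#include <chrono>
#include <cstdint>
#include <vector>
#include <numeric>
#include <iostream>

using Clock = std::chrono::steady_clock;

struct Event {
    std::uint64_t id;
    Clock::time_point created;  // stamped at the source
};

// Hypothetical stand-in for a real database client.
struct Database {
    std::vector<Event> rows;
    void insert(const Event& e) { rows.push_back(e); }  // ingest path
    bool visible(std::uint64_t id) const {              // query path
        for (auto it = rows.rbegin(); it != rows.rend(); ++it)
            if (it->id == id) return true;
        return false;
    }
};

int main() {
    Database db;
    std::vector<double> latencies_us;

    for (std::uint64_t i = 0; i < 100000; ++i) {
        Event e{i, Clock::now()};
        db.insert(e);
        // Poll the query interface until the row is visible, then record
        // the elapsed time. A real test would poll asynchronously.
        while (!db.visible(e.id)) {}
        auto dt = Clock::now() - e.created;
        latencies_us.push_back(
            std::chrono::duration<double, std::micro>(dt).count());
    }

    double mean = std::accumulate(latencies_us.begin(), latencies_us.end(), 0.0)
                  / latencies_us.size();
    std::cout << "mean ingest-to-queryable latency: " << mean << " us\n";
    return 0;
}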

Democratizing time sync to level the playing field   [Fast Data]
 

How can exchanges ensure fairer execution? How can they improve the simultaneity of market data receipt? How can liquidity takers reduce what they give up on multi-venue trades to market makers with faster pipes? And how can any of this be done without huge investments in infrastructure? According to Dan, the answer to all these questions starts in one place: highly accurate software-based time synchronization. He claims that accurate time sync deployed at scale can transform an unpredictable market into a nearly perfect FIFO machine, even if that market is built upon extremely jittery infrastructure. In this talk, Dan will back up his claim with a demonstration. By activating time sync in a simulated market running across several dozen (low-end) public cloud VMs, he will attempt to show that the market behaves as if it had deterministic and equal latency throughout. How well will he do? Come to find out and debate the implications.
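
The mechanism behind Dan's claim can be sketched in a few lines of C++: if every participant stamps orders with an accurately synchronized clock, a venue can buffer arrivals briefly and execute them in timestamp order, so transport jitter no longer determines queue position. The names and window length below are illustrative assumptions, not FSMLabs' implementation.

// Sketch: turning a jittery transport into a FIFO market via timestamps.
#include <cstdint>
#include <queue>
#include <vector>
#include <iostream>

struct Order {
    std::uint64_t send_time_ns;  // stamped by the sender's synced clock
    std::uint64_t id;
};

// Min-heap ordered by sender timestamp, not by arrival order.
struct BySendTime {
    bool operator()(const Order& a, const Order& b) const {
        return a.send_time_ns > b.send_time_ns;
    }
};

class FifoSequencer {
    std::priority_queue<Order, std::vector<Order>, BySendTime> pending_;
    std::uint64_t window_ns_;  // must exceed worst-case transport jitter
public:
    explicit FifoSequencer(std::uint64_t window_ns) : window_ns_(window_ns) {}

    void on_arrival(const Order& o) { pending_.push(o); }

    // Release every order stamped earlier than (now - window): by then,
    // any order with an earlier timestamp must already have arrived.
    void release(std::uint64_t now_ns, std::vector<Order>& out) {
        while (!pending_.empty() &&
               pending_.top().send_time_ns + window_ns_ <= now_ns) {
            out.push_back(pending_.top());
            pending_.pop();
        }
    }
};

int main() {
    FifoSequencer seq(100000);   // 100 us window, an illustrative choice
    seq.on_arrival({2000, 2});   // arrives first because of jitter...
    seq.on_arrival({1000, 1});   // ...despite being sent earlier
    std::vector<Order> executed;
    seq.release(500000, executed);
    for (const auto& o : executed)  // executes in send-time order: 1, then 2
        std::cout << "execute order " << o.id << "\n";
    return 0;
}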

Why accuracy-driven markets will transform trading   [Fast Data]
 

As we've discussed many times at STAC, liquid markets have been in a positive-feedback loop between determinism and latency for the last several years. In an effort to improve fairness, exchanges and other trading venues have become more deterministic, increasing the likelihood that orders that arrive first are executed first. Reducing this uncertainty for trading firms has increased the return those firms can get from reducing their latencies by small increments. The more these firms reduce their tick-to-trade latencies, the larger the impact that small uncertainties in trading venue processing can have on fairness, hence the more pressure venues feel to improve determinism even further. According to Dave, some venues see a way out of this vicious cycle: much more accurately determining which orders arrived first. To the extent this is possible (and Dave will present evidence that it is), venue applications can use these arrival times to determine execution order, thus relaxing the need for deterministic infrastructure. But what will this imply for trading firms? Will this simply replace the determinism-latency cycle with an accuracy-latency cycle that is perhaps even more demanding? Come to hear Dave’s view and join in the discussion.

Innovation Roundup   [Fast Data]
  "In production: better than 100 nanos accuracy with NTP and PTP, fault tolerance, and forensic traceability."
    Nino De Falcis, EVP, Sales & Marketing, FSMTime by FSMLabs
  "Introducing GEARS, the Secured Galileo Time Server"
    Jean-Arnold Chenilleau, Program Manager, Orolia
  "Improving agility and security using Mellanox Intelligent-NICs & Smart-NICs"
    Mark Taplin, Ethernet Sales UK&I, Mellanox
  "Low-Latency Optical Transmission: AccuCore HCF™"
    Dr. Daryl Inniss, Director, OFS Fitel

 

STAC Update: Fast Data   [Fast Data]
 

Peter will discuss the latest research and activities in latency-sensitive workloads such as tick-to-trade processing.

Innovation Roundup   [Fast Data, Fast Compute]
  "Exegy Xero – Setting a New Benchmark for Tick-to-Trade Speed"
    David Taylor, Chief Technology Officer, Exegy
"Market Consolidation and ETF Calculators"
    Cliff Maddox, Director of Sales, NovaSparks
  "Cracking the code on FPGA: How Enyx is making hardware performance more accessible."
    Laurent de Barry, Founder & Managing Director, Enyx
  "Do you really know what happens inside your FPGA?"
    Frederic Leens, CEO, Exostiv Labs

 

How hard could it be? Understanding network traffic at the picosecond level   [Fast Data]
 

The proliferation of double-digit-nanosecond (FPGA-based) trading systems is forcing firms to measure things at finer and finer accuracies. Several vendors now offer sub-nanosecond or "picosecond-scale" network measurement technologies. Firms that make use of such technologies need to consider what other changes, if any, they need to make to their measurement infrastructure as a result. Is it feasible to simply "drop in" picosecond-scale network measurements, or are fundamental changes in thinking required? In this session, Matthew will offer theoretical and practical viewpoints on the implications of picosecond-scale network measurement techniques. To illustrate these, he will refer to Exablaze's work with STAC to "upgrade" certain STAC benchmarks to accuracies better than a nanosecond.
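
A bit of back-of-the-envelope arithmetic (ours, not Exablaze's) shows why sub-nanosecond timestamps matter: at 10 Gb/s a single bit occupies 100 ps on the wire, so picosecond-scale capture can resolve positions within a single byte of a frame.

// Wire timing at common link speeds: bit time and 64-byte frame time.
#include <cstdio>

int main() {
    const double link_gbps[] = {1.0, 10.0, 25.0, 100.0};
    const double frame_bits = 64 * 8;  // minimum-size Ethernet frame

    for (double g : link_gbps) {
        double bit_time_ps = 1000.0 / g;        // one bit at g Gb/s
        double frame_time_ns = frame_bits / g;  // serialization delay
        std::printf("%6.1f Gb/s: bit time %7.1f ps, 64B frame %7.2f ns\n",
                    g, bit_time_ps, frame_time_ns);
    }
    // At 10 Gb/s: bit time 100 ps, 64-byte frame ~51.2 ns. A +/-1 ns
    // timestamp is ~10 bit-times of uncertainty, which is significant
    // when the whole trading system responds in double-digit nanoseconds.
    return 0;
}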