STAC Summit, 13 Jun 2018, NYC
STAC Summits bring together industry leaders in architecture, app dev, infrastructure
engineering, and operational intelligence to discuss important technical challenges in
the finance industry. Come to hear leading ideas and exchange views with your peers.
WHERE
New York Marriott Marquis, 1535 Broadway, New York
Astor Ballroom
Agenda
Why is machine learning in finance so hard?
Nearly every industry is investing heavily to apply machine learning techniques to key problems, and finance is certainly no exception. However, there are aspects of trading and portfolio management that make machine learning uniquely challenging. If a firm tasks data scientists with using ML to produce alpha, and those data scientists are unaware of the issues unique to ML in finance, then there’s a risk that the alpha may not arrive or survive, and the firm will conclude that ML for finance is a dead end like neural networks in the 1990s. In this talk, Hardik will touch upon four reasons why machine learning is different in finance and will then discuss how to address the challenges.
Making data science pay: Mastering the challenges of analytics operations
Many asset managers, hedge funds, brokers, and other financial firms are under pressure today to improve profitability through better application of technology to data. Some of them are looking to extract value from new kinds of information, while others are more focused on improving the yield from data they already have. But all of them are turning to data science and its most intriguing subset, machine learning, to analyze these datasets. However, many enterprises struggle to derive business value from these analytics, irrespective of the scale of their data science investment. Putting the models into production so that they feed either human or automated decision systems is typically slow, ad hoc, and expensive. Once the models are in place, all too often the data quality is poor, the operational support is inadequate, and the model success decays over time. As a former quant who learned the hard way how to build effective analytics operations, Michel believes that CTOs and Heads of Research need to treat data-driven analytics as an industrial process. This does not mean squashing the creativity out of data scientists. On the contrary, Michel will argue that the right technology frameworks and end-to-end processes can liberate the creative energy of data scientists while maximizing the value they deliver to the business. In this talk, he will make his case.
 
Goodbye, Data Lake: Why continuous analytics yield higher ROI
Faced with the need to handle increasing volumes of data, alternative datasets ("alt data"), and AI, many financial firms are working to design or redesign their big data architectures. A traditional approach is to store everything in a data lake, process it via ETL, and analyze it in batch or interactive modes. However, in Yaron's view, this decade-old approach fails to generate sufficient ROI. In this talk, he will argue for a different approach in which information is ingested, enriched, and analyzed in context as it arrives, including via machine learning or deep learning, then immediately made available to users or to drive automated actions. He will also argue that it's possible to take full advantage of modern hardware and microservices or serverless functions to achieve much higher performance while still benefitting from CI/CD, auto-scaling, and fast software rollouts. In Yaron's view, the resulting "continuous analytics" solutions yield faster answers for the business while remaining simpler and less expensive for IT.
Scaling Python and PySpark using Vectorized UDFs and Apache Arrow
Scaling up data-intensive Python-based analytics in an efficient way continues to be very important for many financial firms. Using Spark and its Python interface (PySpark) is a popular way to scale today, but it suffers from inefficiencies. In particular, because Spark's runtime is implemented on top of a JVM, using PySpark with native Python libraries can hurt both performance and usability. Unsatisfied with this situation, Li and several other Spark contributors have implemented a new type of PySpark user-defined function (UDF) to solve this problem: the Vectorized UDF. The Vectorized UDF is built on top of Apache Arrow, a cross-language development platform for in-memory data. According to Li, the Vectorized UDF brings the best of both worlds: high-performance UDFs that are easy to use and that scale up with Spark. In this talk, Li will explain how Vectorized UDFs work, how they perform, and what the open source roadmap looks like.
Using FPGA for financial analytics: Has the programmability nut been cracked?
Field-programmable gate arrays (FPGAs) have long been used for ultra-low latency processing of network packets, such as parsing market data or sending trade-execution instructions. But the massive parallelism, high power efficiency, and growing memory capacity of FPGA technology have also held out promise for more compute-intensive workloads such as risk management, backtesting of complex trading strategies, and artificial intelligence. The main obstacle for FPGAs in these areas has been development: programming the hardware has been a slow process requiring highly specialized skills. The FPGA ecosystem has tried many ways to make FPGA acceleration accessible to traditional software developers, but none has caught on to date. Is that about to change? In this panel, we will review the business requirements that FPGA solutions must meet, the latest ways to enable software developers to offload analytics to FPGA accelerators, and some of the key design considerations to get maximum performance from such applications.
The STAC Cloud SIG
Increasing the use of public, private, or hybrid clouds is high on the agendas of many financial firms. However, when making cloud decisions, these firms face a number of questions and obstacles in areas like security, price-performance, and functionality. The new STAC Cloud Special Interest Group (SIG) is a group of financial firms and vendors that has set out to standardize methods of assessing cloud solutions, facilitate dialog and best practices, and guide a testing program. Peter will explain what it’s all about.
STAC Update: Big Workloads
Peter will discuss the latest benchmark results involving big workloads such as tick analytics and backtesting, as well as new benchmarks that combine big data with big compute.
 
How to make best use of leading non-volatile memory technologies
The non-volatile memory landscape has changed dramatically from a few years ago, with new offerings spreading out to occupy very different points along the axes of density, performance, and cost. What are the best ways to use these new offerings to meet business objectives? In this talk, Shirish will discuss what we can learn about the answers to these questions from test results and use cases in the field.
Tackling the Challenges of Market Simulation
Testing trading algorithms before deploying them into the wild is one of the most important tasks facing a trading firm, yet it is also one of the most challenging. The firm needs to be confident that the algo will be profitable (or at least not rapidly unprofitable), while trading venues and regulators demand that it not disrupt market stability. But no matter the goal (testing edge cases, regression and stability, or P&L), simulating how an algorithm will behave introduces challenges that include time synchronization, interacting with historical trading days, and correcting for market impact. In this talk, Mark will review the need for accurate market simulation in a number of use cases, discuss the challenges in producing a simulation that accurately predicts what a trading strategy would experience on a given trading day, and provide a point of view on how to overcome those challenges.
STAC Update: Time Sync
Peter will provide the latest information regarding STAC-TS tools and research in the areas of time synchronization, timestamping, and event capture, including software tools to demonstrate compliance with time-sync regulations.
 
Monitoring trading in an increasingly challenging environment
As automated and other electronic trading has grown, so has the challenge of monitoring the trading systems. On the one hand, firms now expect tools to extract as much business-level insight as possible from their monitoring data. On the other hand, things have gotten tougher at the infrastructure level. Competitive and regulatory forces have upped the performance required from the monitoring system while putting downward pressure on the cost per monitoring point. And as the infrastructure beneath trade flows becomes more fluid (think dynamically configured networks, private clouds, or even public clouds), the monitoring systems have to adapt. We’ll ask some experts for their perspective on the state of the monitoring art and what firms can do to stay ahead.
STAC Update: Fast Data
Peter will discuss the latest research and Council activities related to low-latency, high-throughput real-time workloads.
 
The Big MAC Mystery: What is a MAC and how do you measure it?
One of the most interesting recent developments in the latency race has been the growth of an end-user market for Medium Access Controller (MAC) products. For most of the history of networks, the MAC was a layer of functionality buried deep in network devices, far from the concern or scrutiny of the application developer. However, as more trading firms move their trading logic from software into FPGA-powered network hardware, a number of vendors have begun to expose their MAC logic as FPGA IP cores for sale. This has led to a problem that is common in nascent markets: significant confusion around product definition, differentiation, and performance claims. That is, vendors are offering different functionality under the MAC banner, accompanied by significantly different performance claims. In this talk, Matthew will propose a precise definition of the minimum feature set of a 10Gb/s Ethernet MAC, along with a range of potential methodologies to accurately and consistently measure MAC latency.
 
Why your transatlantic trades are getting picked off
While enjoying his gardening leave from a Chicago HFT shop, Bob Van Valzah sometimes likes to go on long bike rides in the Chicago suburbs. During one of these trips, he recently discovered a mysterious radio tower in an industrial park, which led him to do some detective work. He found that the tower linked markets at CME directly to markets in Europe via shortwave radio. He went on to find two other sites around Chicago with the same function. Traders using microwave radio is old news, but these are the first documented cases of traders using shortwave radio to cross oceans. Bob will also survey sites where shortwave licenses have been issued near Mahwah and on Long Island. In this talk, Bob will offer up the photos, filings, and other evidence so you can see for yourself.
About STAC Events & Meetings
STAC events bring together CTOs and other industry leaders responsible for solution architecture, infrastructure engineering, application development, machine learning/deep learning engineering, data engineering, and operational intelligence to discuss important technical challenges in finance.
Speakers
Hardik Patel, qplum
Li Jin, Two Sigma Investments
Eric Powers, Deutsche Bank
Bob Van Valzah, Garden leave
Michel Debiche, STAC
Jeremy Eder, Red Hat
Mutema Pittman, Intel
Mark Skalabrin, Redline Trading Solutions
Dr. Matthew Grosvenor, Exablaze
Dr. David Snowdon, Metamako
Shirish Bhargava, Intel
Matt Meinel, Levyx
Steve Colwill, Velocimetrics
John D. Davis, Ph.D., Bigstream
Yaron Haviv, iguazio
John Lockwood, Algo-Logic
Dave Weber, Lenovo
Ron Herrmann, E8 Storage
Zahid Hussain, Vexata
Edouard Alligand, QuasarDB
Hollis Beall, X-IO Technologies
Laurent de Barry, Enyx
Davor Frank, Solarflare
Pritam Kandel, Orolia
Björn Kolbeck, Quobyte
Tom Leahy, Endace
Cliff Maddox, NovaSparks
Vahan Sardaryan, LDA Technologies